Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/140403
Type: Thesis
Title: Towards Supporting Developers for Securing Containerized Software
Author: Haque, Mubin Ul
Issue Date: 2024
School/Discipline: School of Computer and Mathematical Sciences
Abstract: Container technologies such as Docker and Kubernetes have revolutionized software delivery and deployment by providing a lightweight and portable solution for packaging applications and their dependencies. These technologies enable developers to encapsulate everything needed to run software, including libraries, binaries, and configuration files, into a standalone unit known as a container image. Container images can be executed consistently across different environments. The benefits of container technologies are manifold. Container images enable rapid development and deployment, allowing faster iterations and quicker time-to-market. In addition, containerization provides consistent environments, so applications run the same way across systems (e.g., development and production). Despite these advantages, security concerns surrounding container images persist. To ensure the safe and secure use of container images, security organizations such as the Cloud Security Alliance (CSA) and the National Institute of Standards and Technology (NIST) strongly emphasize adopting automation. Performing security tasks manually (e.g., selecting secured container images, identifying artifacts to secure the container images) does not scale to large container clusters, is error-prone, and requires security knowledge. Importantly, manually configuring the security of container images takes time and effort, which creates a barrier to rapid software development and deployment using containers. In this regard, this thesis provides automation support to ensure the security of containerized software with the following contributions.
1. We conducted a large-scale empirical study of Stack Overflow posts to identify developers’ perceptions of Docker, including the topics developers discuss, the popularity and difficulty of those topics, a comparison of the expertise of those answering Docker-related questions, and the security discussion within the identified Docker topics. This study provided several important insights: (i) developers who are adopting Docker have less expertise than general Stack Overflow members; (ii) Docker is a very popular technology among developers; (iii) the rate and proportion of security-related discussion is significantly lower than in general Stack Overflow discussion. These insights serve as a foundation for an in-depth analysis of the security of containerized software.
2. To conduct an in-depth analysis of the security of containerized software, we performed a Systematic Literature Review (SLR) of peer-reviewed academic literature. This SLR enabled us to identify several key research gaps, namely the need for support for (i) the selection of secured container images by analyzing the effect of security testing attributes; (ii) non-intrusive security assessment for secured container image selection; (iii) the identification of configuration artifacts for securing container images in container orchestrators.
3. To address the selection of secured container images by analyzing the effect of security testing attributes, we performed an empirical study of how developers use secured container images and how security testing attributes (e.g., the total number and the severity of security vulnerabilities) vary in the secured container image selection process when developing containerized software. We studied 64,579 containerized software projects from GitHub and found that nearly 91% of them were using vulnerable container images. Our empirical experiment with five vulnerable containerized software projects showed that the total vulnerability composition can be reduced by 39.5% and high-severity vulnerabilities by 72.7% if the severity of the security vulnerabilities is considered in the secured image selection process.
4. To address non-intrusive security assessment for secured container image selection, we proposed the use of Open Container Initiative (OCI) properties with Machine Learning (ML) models. Our results showed that the Light Gradient Boosting Machine (LGBM) achieves a Matthews Correlation Coefficient (MCC) score of 0.856, whereas Logistic Regression (LR), Naive Bayes (NB), Support Vector Machines (SVM), Decision Tree (DT), and Extreme Gradient Boosting (XGB) achieve MCC scores of 0.442, 0.492, 0.585, 0.774, and 0.800, respectively.
5. To address the need to identify configuration artifacts that support the security of container images when deploying them in container orchestrators, we proposed a novel Knowledge-Graph (KG) based approach using a keyword-based method and Machine Learning models. Our keyword-based method achieved an accuracy of 0.98 in estimating the relevancy of security documents when identifying artifacts for securing container images in container orchestrators. We found that the combination of character-level and word-level features performs better (LR and XGB achieved an MCC score of 0.89) than word-level or character-level features alone when classifying concepts from the documentation.
Advisor: Falkner, Nickolas
Szabo, Claudia
Dissertation Note: Thesis (Ph.D.) -- University of Adelaide, School of Computer and Mathematical Sciences, 2024
Keywords: Container image
Docker
Kubernetes
security vulnerabilities
machine learning
natural language processing
knowledge graph
Provenance: This thesis is currently under embargo and not available.
Appears in Collections: Research Theses

Files in This Item:
File: Haque2024_PhD.pdf
Description: Restricted Access. Library staff access only.
Size: 4.16 MB
Format: Adobe PDF