04 - Kubernetes Threat Model

Rohit Pagote

Kubernetes Trust Boundaries and Data Flow

Application Architecture Overview

  • Consider a multi-tier application consisting of a frontend, a backend, and a database.

  • First, the frontend: an Nginx web server that serves static content such as HTML, CSS, and JavaScript files. It also acts as a reverse proxy, forwarding client requests to the appropriate backend services. In our Kubernetes cluster, this Nginx server runs in its own pod within the frontend namespace.

  • Then the backend API: a set of Node.js microservices that handle the business logic of the application, process requests, and interact with the database. These microservices run in separate pods within the backend namespace.

  • Finally, the database: a MySQL database that stores user data and application state. It runs in a pod within the database namespace.

Threat Modeling

  • Threat Modeling is a process that helps to find potential threats, understand their impact, and put measures in place to stop them.

Trust Boundaries

  • The Need for Boundaries in Our Example: To effectively secure our multi-tier application, we need to isolate different parts of the system and enforce specific security measures for each part.

  • This isolation helps us manage and reduce security risks by ensuring that a breach in one part of the system doesn’t compromise the entire application.

  • We call these isolated areas "trust boundaries".

Defining Trust Boundaries - Cluster Boundary

  • The entire Kubernetes setup, including nodes and control-plane components, forms the cluster boundary.

  • By using separate clusters for different environments (development, staging, and production in this case), we ensure that issues in one environment don’t affect others.

  • Most importantly, this provides top-level network isolation, meaning that network traffic is isolated at the cluster level, preventing any potential cross-environment contamination.

Defining Trust Boundaries - Node Boundary

  • Each node, whether virtual or physical, acts as a trust boundary in the cluster.

  • Nodes host multiple pods and system components such as the kubelet and should only access the resources required to perform their tasks.

  • For example, if an attacker compromises a node running frontend services, they cannot directly access backend services, reducing the risk of the attack spreading.

Defining Trust Boundaries - Namespace Boundary

  • Within our cluster, we create namespaces to group related resources.

  • We have namespaces for the frontend, backend, and database. This helps manage resource access and isolation. They serve as the basic unit for authorization in Kubernetes.

  • Example: By isolating the frontend, backend, and database components into separate namespaces, we can control access specifically for each component, preventing unauthorized access.
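As a sketch, the three namespaces from this example can be declared with plain manifests (the names mirror the example; adjust them to your own layout):

```yaml
# Namespaces acting as trust boundaries for each application tier
apiVersion: v1
kind: Namespace
metadata:
  name: frontend
---
apiVersion: v1
kind: Namespace
metadata:
  name: backend
---
apiVersion: v1
kind: Namespace
metadata:
  name: database
```

Applying this file once (e.g. with `kubectl apply -f namespaces.yaml`) creates all three boundaries, and every later resource is then scoped to one of them.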

Defining Trust Boundaries - Pod Boundary

  • Each application component runs in its own pod.

  • For instance, the Nginx server runs in a frontend pod, backend API services run in backend pods, and the MySQL database runs in a database pod.

  • This ensures that each component is isolated within its own runtime environment; security contexts and network-level isolation can be defined at the pod level.

  • Example: If a backend pod is compromised, the attacker cannot directly interact with the frontend or database pods without additional security breaches.
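The pod-level security context mentioned above can be sketched for the frontend pod as follows (the image tag and names are illustrative, not taken from the original setup):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: frontend
spec:
  containers:
  - name: nginx
    image: nginx:1.27                  # illustrative tag
    securityContext:
      runAsNonRoot: true               # refuse to start the process as UID 0
      allowPrivilegeEscalation: false  # block setuid-style privilege gains
      readOnlyRootFilesystem: true     # an attacker cannot modify the container filesystem
```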

Defining Trust Boundaries - Container Boundary

  • Inside each pod, containers run the actual services.

  • For example, an Nginx container within the frontend pod, Node.js containers within the backend pods, and a MySQL container within the database pod.

  • Containers are the smallest unit of our trust boundaries, providing application-level isolation.

  • Example: This means that even if one container within a pod is compromised, the damage is limited to that container, providing another layer of isolation.

Exploring Data Flow in a Multi-Tier Application

  • Now that we've defined our trust boundaries, it's important to understand how data flows through our application.

  • Data flow is crucial because it helps us understand how information moves through our system. By mapping out this flow, we can identify potential security risks at each stage and apply appropriate controls.

  • Let’s look at our multi-tier application to see how data flows through it. Here we have an auth service, an invoice service, an order service, an inventory service, and a log service.

  • First, users access the application via the Nginx web server in the frontend. This entry point must be secured with measures like HTTPS and authentication to protect user data right from the start.

  • Next, the Nginx server forwards the user requests to the backend API services. It’s essential to have secure communication channels and proper API authentication here to ensure that only legitimate requests are processed.

  • The backend services then need to access the database to fetch or store user data. This interaction with the MySQL database must be tightly controlled and encrypted to prevent unauthorized access and protect sensitive information.

  • Finally, backend services often need to communicate with each other to fulfill the user’s request. Network Policies in Kubernetes can help restrict which pods can communicate with each other, enhancing security by preventing unauthorized inter-pod communication.
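As an example of the last point, a NetworkPolicy in the backend namespace can admit traffic only from the frontend namespace. This is a sketch; it relies on the standard `kubernetes.io/metadata.name` label that Kubernetes sets on namespaces:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
  namespace: backend
spec:
  podSelector: {}              # applies to every pod in the backend namespace
  policyTypes: ["Ingress"]
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend   # only frontend pods may connect
```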

Identifying and Mitigating Threats

  • Having defined the trust boundaries and analyzed the data flow in our Kubernetes environment, the next step is to understand the various types of threats our application might face and how to mitigate them.

  • Threat actors are entities that pose threats to our system, such as external attackers, compromised containers, or malicious users.

  • By identifying these threat actors, we can implement security measures to protect our application.

Summary

  1. Use trust boundaries to isolate and secure application components

  2. Analyze data flow to identify potential security risks

  3. Implement Network Policies and RBAC for access control

  4. Apply encryption and authentication to protect data in transit

  5. Conduct threat modeling to identify and mitigate security threats


Persistence

  • Persistence refers to the ability of an attacker to maintain access to a compromised system even after reboots, updates, or other interruptions.

  • In a Kubernetes cluster, this can be achieved through various methods, such as exploiting misconfigurations, reading secrets, or leveraging container vulnerabilities.

Attack Vectors for Persistence

  • The aim of this attack tree is to capture the various ways an attacker can attempt to gain persistence in the cluster, with differing periods of longevity. There are two major branches of the tree.

  • The first branch focuses on the more obvious approach of reading secrets from within the cluster in order to exploit other vulnerable areas of the cluster, providing a persistent foothold.

    [Attack tree diagram: reading secrets branch]

  • The second branch focuses on threats where an attacker has gained container access and leverages misconfigurations to establish persistence that survives container, pod, and node restarts.

    [Attack tree diagram: leveraging misconfigurations branch]

Mitigating Persistence Risks

  • To protect our Kubernetes cluster from persistence threats, we need to implement several security controls.

Role Based Access Controls

  • First, restricting RBAC permissions.

  • In our application, we review and tighten the RBAC policies for service accounts used by our Node.js backend pods. This ensures they cannot read secrets or perform administrative actions, reducing the risk of attackers exploiting excessive permissions.
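A tightened Role for the backend service accounts might look like the following sketch — note that secrets are deliberately absent from the resource list (the role, binding, and service account names are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: backend-minimal
  namespace: backend
rules:
- apiGroups: [""]
  resources: ["configmaps"]   # only what the app actually needs
  verbs: ["get", "list"]      # no create/delete, and no "secrets" at all
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: backend-minimal
  namespace: backend
subjects:
- kind: ServiceAccount
  name: backend-sa            # hypothetical service account for the Node.js pods
  namespace: backend
roleRef:
  kind: Role
  name: backend-minimal
  apiGroup: rbac.authorization.k8s.io
```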

Restrict Access to Secrets

  • Next, securing secrets management.

  • We store database credentials and API keys in Kubernetes secrets and restrict access to these secrets to only the pods that require them, such as the MySQL database pod. This prevents unauthorized access to sensitive information.
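In practice this means only the pod that needs a credential references it. A sketch of the MySQL container pulling its password from a Kubernetes secret (the secret and key names are illustrative):

```yaml
# Fragment of the MySQL pod spec in the database namespace
containers:
- name: mysql
  image: mysql:8.4            # illustrative tag
  env:
  - name: MYSQL_ROOT_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-credentials  # the secret is referenced only by this pod
        key: password
```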

Hardening Pod Security

  • Another important control is Hardening Pod Security.

  • We configure pod security enforcement (Pod Security Admission on current Kubernetes versions, which replaced the removed Pod Security Policies) to prevent the use of privileged containers and enforce read-only root filesystems for our application pods.

  • This means that even if an attacker gains access to a pod, they are limited in what they can do, reducing the impact of the attack.
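On current Kubernetes versions this enforcement is done by labelling the namespace for Pod Security Admission, for example:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: backend
  labels:
    # reject privileged containers and require restricted securityContext settings
    pod-security.kubernetes.io/enforce: restricted
```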

Regular Updates and Patching

  • It is also important to have Regular Updates and Patching.

  • We set up a regular update schedule for all container images used by our application, including Nginx, Node.js, and MySQL, and monitor for any new vulnerabilities.

  • Keeping systems up to date with the latest security patches helps prevent attackers from exploiting known vulnerabilities.

Monitoring and Auditing

  • Finally, Monitoring and Auditing. We implement logging and monitoring to detect suspicious activities and audit Kubernetes events regularly.

  • We set up monitoring for our Kubernetes cluster to track access to secrets, changes in RBAC policies, and the creation of new pods. Alerting on any suspicious activities helps us quickly respond to potential threats and mitigate the risk of persistence.
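Tracking access to secrets can be done with the API server audit log. A minimal audit policy sketch that records every request touching secrets (metadata only, so secret values never land in the log):

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata        # record who touched which secret and when, but not the payload
  resources:
  - group: ""
    resources: ["secrets"]
- level: None            # ignore everything else in this sketch
```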

Summary

  1. Persistence allows attackers to maintain access in clusters

  2. Attackers read secrets and leverage misconfigurations for persistence

  3. Restrict RBAC permissions to prevent unauthorized access

  4. Secure secrets management to protect sensitive information

  5. Harden pod security to limit attack impact

  6. Regularly update and patch container images

  7. Monitor and audit Kubernetes events for suspicious activities


Denial of Service

  • A Denial of Service (DoS) attack is one in which attackers overwhelm the system with illegitimate requests or exhaust its resources, making it unavailable to legitimate users.

Attack Vectors for DoS

  • The CNCF has outlined two main approaches for DoS attacks in Kubernetes.

  • The first approach is a container compromise scenario, where the attacker attempts to DoS the cluster from within.

    [Attack tree diagram: DoS from a compromised container]

  • The second approach focuses on an attacker with network access to the cluster control plane.

    [Attack tree diagram: DoS via network access to the control plane]

  • Many of the attacks against control plane endpoints can be mitigated via firewalls; however, the attack tree has been developed to highlight all requirements for mitigation.

Mitigating DoS Attacks

  • To protect our Kubernetes cluster from DoS attacks, we need to implement several security controls.

Resource Quotas and Limits

  • First, we need to set resource quotas and limits for each namespace to ensure no single container or group of containers can consume excessive resources.

  • In our application, we set quotas on the number of pods, CPU, and memory usage for the frontend (Nginx), backend (Node.js microservices), and database (MySQL) namespaces.

  • This ensures that even if an attacker compromises one part of the application, they cannot exhaust the cluster’s resources.
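A ResourceQuota for the backend namespace might be sketched as follows (the numbers are illustrative and should be sized to your actual workload):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: backend-quota
  namespace: backend
spec:
  hard:
    pods: "20"              # cap on the number of pods in the namespace
    requests.cpu: "4"       # total CPU the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"         # total CPU it may burst to
    limits.memory: 16Gi
```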

Secure Service Accounts

  • Next, we must secure service accounts by restricting their permissions to the minimum necessary.

  • In our application, we ensure that service accounts used by the Node.js backend pods cannot create new pods or perform administrative actions.

  • This limits the attacker’s ability to use compromised service accounts to launch additional containers.

Network Policies

  • We also need to use Network Policies to control traffic flow within the cluster and configure firewalls to protect the control plane endpoints.

  • For instance, we can limit access to the API server to only trusted IP addresses and internal components.

  • By doing this, we prevent unauthorized network access that could be used to flood the API server with requests.

Monitoring and Alerts

  • Finally, regular monitoring and alerts are crucial. We implement monitoring to detect unusual activities and set up alerts for potential DoS attacks.

  • Tools like Prometheus and Grafana help us monitor the cluster's resource usage and alert us to any spikes that may indicate a DoS attempt.

  • For example, an alert can fire if the CPU usage across containers in a namespace exceeds 80% for 5 minutes.
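Such a Prometheus alerting rule might be sketched as follows. The metric name assumes cAdvisor metrics scraped by Prometheus; the expression and threshold are illustrative (here 0.8 CPU cores summed over the namespace, not 80% of a specific quota):

```yaml
groups:
- name: dos-detection
  rules:
  - alert: HighNamespaceCPU
    # sustained CPU consumption across all containers in the backend namespace
    expr: sum(rate(container_cpu_usage_seconds_total{namespace="backend"}[5m])) > 0.8
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Backend namespace CPU usage high for 5 minutes - possible DoS"
```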

Summary

  1. DoS attacks overwhelm system resources, causing unresponsiveness

  2. Set resource quotas and limits to prevent excessive resource usage

  3. Restrict service account permissions to limit potential attacks

  4. Use network policies and firewalls to control access

  5. Monitor and alert on unusual activity for quick response


Malicious Code Execution

  • Imagine an attacker finds a vulnerability in our application and uses it to gain access to one of the containers. The attacker’s goal is to execute malicious code within our Kubernetes cluster, potentially leading to further exploitation and control over the environment.

Attack Vectors for Malicious Code

  • The initial foothold for these threats is primarily a compromised application providing access to the container. This is perhaps the most likely initial foothold into the Kubernetes environment and one to mitigate carefully.

  • Note that the vulnerable container may be accessed either from a NodePort on a node or, more likely, from the external network directly.

  • To the right of the application vulnerability branch is a compilation of threats that focus on abusing the API server to allow an attacker to exec into a running container.

  • This part of the attack tree is heavily re-used throughout the Attack Tree models as it is often a key point of entry for a range of attacker goals and scenarios.

    [Attack tree diagram: abusing the API server to exec into a container]

  • Once an attacker gains access to a container, the next step in this attack tree is to move towards loading additional malicious code into the environment. There are multiple attack vectors here, depending upon the privilege level of the container, as noted below.

    [Attack tree diagram: loading malicious code after container access]

  • Alternatively, if the image pull secret can be obtained, the attacker could potentially poison the repository and distribute the malicious code from there.

    [Attack tree diagram: poisoning the image repository]

Compromised Application

  • The first attack vector is a compromised application.

  • Consider an attacker who exploits a vulnerability in our application’s frontend server. By gaining access to the container running this server, the attacker can then attempt to load and execute additional malicious code.

  • For example, they might download and run scripts that further compromise backend services or the database.

  • If they obtain sufficient privileges, they might even use this initial access to exploit the Kubernetes API server, gaining control over more containers.

Abusing the API Server

  • Once inside our Kubernetes cluster, an attacker might target the API server to execute commands within running containers.

  • With access to the API server, they can perform actions such as executing code directly in containers.

  • For example, the attacker could use the API server to execute commands in a backend container, installing malware that compromises data or steals sensitive information.

Poisoning the Image Repository

  • If the attacker manages to obtain the image pull secret, they could poison the container image repository by uploading malicious images.

  • Other parts of the cluster might then pull and run these compromised images, spreading malicious code throughout the environment.

  • For instance, if a backend service pulls a new image containing a backdoor, the attacker can exploit this backdoor whenever the container is redeployed.

Mitigating Risks

  • To protect our Kubernetes cluster from malicious code execution, we need to implement several security controls for our application.

Scan Vulnerabilities & Apply Patches

  • Ensuring our applications are secure by regularly scanning for vulnerabilities and applying patches could be a first step.

  • This includes updating our web server and backend services with the latest security patches to reduce the risk of exploitation.

Role Based Access Control

  • Next, we should restrict access to the API server.

  • By ensuring that only authorized users and services can interact with the API server, we can prevent unauthorized command execution.

  • Implementing strong authentication and authorization mechanisms, such as Role-Based Access Control (RBAC), helps limit permissions to only what is necessary.

Protect Image Repository

  • After that, we need to protect our container image repositories.

  • Securing image pull secrets and ensuring that only trusted sources can upload images is essential.

  • We should store image pull secrets securely and restrict access to them to only the pods that need them.

  • Using signed images helps verify the integrity and authenticity of the images we pull from the repository.
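Restricting a pull secret to the workloads that need it can be sketched by referencing it only from those pod specs (the secret name and image are illustrative):

```yaml
# Fragment of a backend Deployment's pod template
spec:
  imagePullSecrets:
  - name: registry-credentials                  # used by the kubelet, never mounted into the container
  containers:
  - name: api
    image: registry.example.com/backend/api:1.0   # illustrative private image
```

Alternatively, the pull secret can be attached to the namespace’s service account so that pods pick it up without listing it explicitly.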

Monitoring and Logging

  • Additionally, setting up robust monitoring and logging helps detect and respond to malicious activities quickly.

  • Monitoring access to the API server, changes to image pull secrets, and the execution of commands within containers can alert us to suspicious activities.

  • Tools like Prometheus and Grafana are useful for this purpose.

Auditing and Reviewing

  • Finally, regularly auditing and reviewing our security configurations and practices ensures they remain effective.

  • We should periodically check the permissions granted to service accounts, the security of our image repositories, and the access controls for the API server.

Summary

  1. Attackers exploit vulnerabilities in containers to execute malicious code

  2. Restrict API Server access to authorized users and services only

  3. Secure image repositories and use signed images for verification

  4. Monitor and log activities to detect and respond to threats

  5. Regularly update and patch applications to prevent security exploits


Compromised Applications in Containers

  • Suppose an attacker exploits a vulnerability in our Node.js backend service and gains access to the container running it. With access to the container, the attacker can explore further avenues of attack within the Kubernetes cluster.

Compromised Container Attack Tree

  • This attack tree starts from a compromised container and details the possible paths of exploitation.

  • One of the most commonly exploited is the attack path focusing on a mounted, insecure service account token, as detailed below.

  • Threats associated with a compromised user token result in a similar attack path.

    [Attack tree diagram: compromised container exploitation paths]

  • Using this view, it is possible to anticipate many attacker approaches, validate the required mitigations, and test SIEM controls to ensure any potential attacks are discovered.
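The mounted-token attack path can be cut off by disabling token automounting for workloads that never talk to the API server. A minimal sketch (the service account name is illustrative):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backend-sa                      # hypothetical service account
  namespace: backend
automountServiceAccountToken: false     # no token file appears inside the pods
```

The same flag can also be set per pod via `spec.automountServiceAccountToken` when only some workloads need API access.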

Summary

  1. Attackers exploit vulnerabilities to access and control containers

  2. Avoid mounting service account tokens unless absolutely necessary

  3. Limit service account permissions to minimize potential damage

  4. Implement network policies to restrict container communication paths

  5. Use RBAC to enforce strict access controls in Kubernetes

  6. Monitor and log activities to detect and respond to threats


Attacker on the Network

  • This Network Attack Tree shows us different ways an attacker could cause chaos in a Kubernetes cluster by launching a Denial of Service (DoS) attack on various network components and control plane services.

Mitigating Risks

Configuring Firewalls

  • We configure firewalls to limit network access to the control plane and nodes.

  • By setting up firewall rules to restrict access to the API server and allowing only trusted IP addresses to communicate with it, we prevent unauthorized access and reduce the risk of denial-of-service attacks.

Securing Nodes

  • Keeping the operating systems and Kubernetes components on our nodes up to date with the latest security patches is crucial.

  • Regularly updating and patching the systems helps mitigate vulnerabilities that attackers might exploit.

  • Ensuring that the nodes running our Node.js backend service are regularly patched and monitored for vulnerabilities is essential.

Implementing Network Policies

  • With network policies in place, we can restrict network communication between the frontend, backend, and database components, limiting the attacker’s ability to move from one compromised node to another.
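A common starting point is a default-deny policy per namespace, on top of which specific flows are allowed. A sketch for the database namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: database
spec:
  podSelector: {}                      # selects every pod in the namespace
  policyTypes: ["Ingress", "Egress"]   # no rules listed, so all traffic is denied
```

Separate allow policies (for example, admitting only the backend pods on the MySQL port) are then layered on top of this baseline.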

Strong Authentication and Authorization

  • Enforcing strong authentication and authorization mechanisms is vital to protect access to the API server and other critical components.

  • Using strong passwords, multi-factor authentication, and Role-Based Access Control (RBAC) limits access to only necessary operations, ensuring that only authorized users can perform sensitive actions.

Monitoring and Logging

  • Monitoring and logging are essential for detecting unusual network activities and responding quickly to potential attacks.

  • Tools like Prometheus and Grafana help monitor network traffic and alert us to any suspicious activities.

Summary

  1. Attackers can target Kubernetes control plane and nodes for breaches

  2. Configure firewalls to limit network access to trusted IP addresses

  3. Keep node operating systems and components updated and patched

  4. Implement network policies to control traffic and prevent lateral movement

  5. Use strong authentication, multi-factor authentication, and RBAC for secure access

  6. Monitor and log activities to detect and respond to threats


Access to Sensitive Data

  • This is all about the different tricks an attacker might use to get their hands on sensitive information in a Kubernetes setup.

Exploiting Misconfigured RBAC Permissions

  • In our application, let's say we have a Node.js backend pod.

  • If this pod has excessive permissions due to misconfigured Role-Based Access Control (RBAC), an attacker could exploit these permissions to read secrets stored in the cluster.

Mitigating Risks

RBAC

  • To protect our Kubernetes cluster from unauthorized access to sensitive data, we must ensure RBAC permissions are correctly configured.

  • In our application, we review and tighten the RBAC policies for service accounts used by our backend Node.js pods.

  • By ensuring these service accounts only have the permissions they need, we reduce the risk of attackers exploiting excessive permissions to access sensitive data.

Viewing Sensitive Data in Logs

  • Our application generates logs that might inadvertently contain sensitive information.

  • For instance, if our logs include database queries or API request details, an attacker who gains access to these logs could extract valuable data.

  • We must secure our logs to ensure they do not contain sensitive information. This involves configuring our logging practices to avoid logging sensitive data like database credentials or API keys.

  • We should also restrict access to logs so that only authorized personnel can view them.

  • In our application, centralized logging solutions that provide fine-grained access controls and monitor log access for suspicious activities are essential.

Eavesdropping on Network Traffic

  • If the communication between the frontend Nginx server and the backend Node.js services is not encrypted, an attacker with network access could intercept this traffic.

  • Encrypting network traffic is crucial to prevent eavesdropping.

  • In our application, we use TLS (Transport Layer Security) to encrypt communication between the frontend Nginx server and the backend Node.js services, as well as between all other components.

  • This ensures that even if an attacker intercepts the traffic, they cannot read the sensitive information being transmitted.
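Edge TLS termination for the Nginx frontend can be sketched with an Ingress that references a certificate stored as a secret (the host and secret names are illustrative). Encrypting pod-to-pod traffic inside the cluster is typically handled separately, e.g. by a service mesh such as Istio or Linkerd providing mutual TLS:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend
  namespace: frontend
spec:
  tls:
  - hosts: ["app.example.com"]
    secretName: app-tls-cert       # a kubernetes.io/tls secret holding cert and key
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx            # the frontend Service
            port:
              number: 80
```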

Summary

  1. Ensure RBAC permissions are correctly configured to avoid excessive access

  2. Secure logs to prevent storing and exposing sensitive information

  3. Encrypt network traffic using TLS to prevent eavesdropping attacks


Privilege Escalation

  • Using root to perform all the operations is a bad idea.

  • However, there are times when we need to run a command with root privileges, such as when installing new software.

  • The preferred way to run commands as a root user or a super user is to make use of sudo.

  • The sudo command offers another approach to giving users administrative access.

  • When trusted users run an administrative command with sudo, they are prompted for their own password.

  • The default configuration for sudo is defined in the /etc/sudoers file.

  • This file defines the policies applied by the sudo command and can be updated using the visudo command.

  • Only users listed in the /etc/sudoers file can use the sudo command for privilege escalation.

  • The administrator can grant granular levels of access through sudo.

  • System commands executed using sudo run in the user’s own shell, not a root shell. As a result, we can eliminate the need to ever log in as the root user directly.
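A sketch of what an /etc/sudoers entry can look like — always edited through visudo, which syntax-checks the file before saving (the username and command here are illustrative):

```
# Grant user "deploy" the right to run the package manager as root,
# authenticating with their own password.
deploy  ALL=(root)  /usr/bin/apt-get
```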


Written by

Rohit Pagote

I am an aspiring DevOps Engineer proficient with containers and container orchestration tools like Docker, Kubernetes along with experienced in Infrastructure as code tools and Configuration as code tools, Terraform, Ansible. Well-versed in CICD tool - Jenkins. Have hands-on experience with various AWS and Azure services. I really enjoy learning new things and connecting with people across a range of industries, so don't hesitate to reach out if you'd like to get in touch.