Topic 1: Kubernetes resource requests and limits

Kev
5 min read

The main purpose of Kubernetes is to orchestrate workloads across a cluster of servers. By default, Kubernetes simply schedules workloads: it does not know how much CPU, memory, or other resources each workload needs, nor does it try to control how many resources each workload can consume. The problem arises when more pods (different pods for different workloads, or replicas of the same one) are pushed onto a node than it can comfortably handle, and the node comes under resource pressure.
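
For reference, this is roughly what requests and limits look like on a container spec. It is a minimal sketch, not a recommendation; the pod name, container name, image, and numbers are all placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod                # placeholder name
spec:
  containers:
    - name: app                 # placeholder container name
      image: nginx:1.25         # placeholder image
      resources:
        requests:               # what the scheduler reserves for this container
          cpu: "250m"
          memory: "128Mi"
        limits:                 # the ceiling the container may actually consume
          cpu: "500m"
          memory: "256Mi"
```

The scheduler only looks at requests when deciding whether a pod fits on a node; limits are enforced at runtime. If requests and limits are set equal, the pod lands in the Guaranteed QoS class and is among the last to be evicted under node pressure.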

Common issues caused by requests and limits when they are not handled well

  1. Problem: Pods fail to schedule due to requests that are too high

    Cause: Insufficient resources are available to meet requests.

    Fix:

    • Lower requests.
    • Add more nodes.
    • Start by setting CPU and memory requests first, then add limits once you understand actual usage.

  2. Problem: Pods evicted or OOMKilled.

    Cause: High resource consumption leaves too little capacity on the node, so Pods are killed or rescheduled. Eviction usually happens when a pod is unresponsive for a certain period of time or the node is under resource pressure. Faulty images deployed over time can also cause this to happen.

    Fix:

    • Set limits to prevent excess resource consumption by individual containers.

    • Set limit ranges to manage resource consumption within a namespace (see the LimitRange sketch after this list), and add more nodes if the cluster is genuinely short on capacity.

    • Ensure all images are up to date and tested

  3. Problem: Nodes overloaded or unresponsive.

    Cause: Node CPU or memory resources are maxed out because containers are using too many resources

    Fix:

    • Set limits to restrict container resource consumption

    • Use node selectors or node affinity to place resource-hungry containers on nodes that have more resources (see the node-placement sketch after this list).

  4. Problem: Applications experiencing performance degradation or latency issues.

    Cause: Lack of sufficient resources causes application performance problems

    Fix:

    • Set requests that guarantee adequate resources.

    • Set limits to prevent other applications from depriving their neighbors of sufficient resources.

    • To test this out, it is best to starve the pod with minimal CPU and memory requests to understand its minimum requirement, then increase the requests incrementally.

  5. Problem: Unexpected resource consumption spikes (that can't be explained by an increase in workload traffic).

    Cause: Application bugs (such as memory leaks) trigger a surge in resource consumption.

    Fix:

    • Set limits to restrict how many resources applications can use.

    • Identify and fix the underlying application issue that causes unanticipated resource consumption.

    • This is best done through a stress test on a single pod or a small set of working pods (a stress-test sketch follows this list).
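
For the limit-range fix under problem 2, here is a minimal LimitRange sketch. The namespace, name, and values are placeholders; the defaults apply only to containers that do not declare their own requests or limits:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resource-bounds   # placeholder name
  namespace: team-a               # placeholder namespace
spec:
  limits:
    - type: Container
      defaultRequest:             # request applied when a container sets none
        cpu: "250m"
        memory: "128Mi"
      default:                    # limit applied when a container sets none
        cpu: "500m"
        memory: "256Mi"
      max:                        # hard per-container ceiling in this namespace
        cpu: "2"
        memory: "1Gi"
```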
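
For the node-placement fix under problem 3, the simplest mechanism is a nodeSelector (node affinity offers finer control). The label `node-size: large` is hypothetical; you would first label your larger nodes yourself, for example with `kubectl label node <node-name> node-size=large`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: heavy-worker              # placeholder name
spec:
  nodeSelector:
    node-size: large              # hypothetical label attached to the larger nodes
  containers:
    - name: worker                # placeholder container name
      image: busybox:1.36         # placeholder image
      command: ["sleep", "3600"]
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "2"
          memory: "4Gi"
```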
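
For the stress testing suggested under problems 4 and 5, a throwaway pod that deliberately allocates more memory than its limit is a quick way to observe eviction and OOMKill behavior. This sketch assumes the `polinux/stress` image used in the Kubernetes documentation examples; any equivalent stress tool works:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-stress-demo        # placeholder name
spec:
  containers:
    - name: stress
      image: polinux/stress       # stress image from the Kubernetes docs examples
      resources:
        requests:
          memory: "100Mi"
        limits:
          memory: "200Mi"         # allocating past this gets the container OOMKilled
      command: ["stress"]
      args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]   # tries to allocate 250M
```

Running `kubectl describe pod memory-stress-demo` afterwards shows the container terminated with reason OOMKilled, which is exactly the failure mode described in problem 2.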

Best practices for setting Kubernetes requests and limits

  1. Align settings with workload priority levels: Engage the software development team to classify workloads by priority and allocate resources accordingly. For the most part, increasing replicas is one way to give important workloads headroom. Perform a stress test to find the optimal level (a PriorityClass sketch follows this list).

  2. Review and update requests and limits: Applications evolve over time in response to changes like fluctuations in traffic. For example, for batch jobs in a bank, the files to be processed grow larger over time, and so does the need for CPU and memory. The requests and limits you initially set for an app may not be appropriate in the future. Periodically review actual resource consumption data and update the settings as needed; monitoring or AI-assisted tools can help keep watch over them.

  3. Manage limits at both the container and namespace level: You can use memory and CPU limit ranges to manage resource consumption within a namespace. Using both features is a best practice because it establishes multiple layers of protection against excess resource consumption (a namespace-level ResourceQuota sketch follows this list).

  4. Plan the number of pods per node: Always plan how many pods will run on each node, and include Kubernetes' own system pods in the count. If you work in a highly secure environment, the cybersecurity team will usually also want their scanning pods running there (a kubelet maxPods sketch follows this list).

  5. Use auto-scaling strategically: Autoscaling is a tricky topic. Different cloud providers implement it differently: some emphasize horizontal scaling but only for specific node families, and some require identical specs across the group. Talk to your cloud provider about the options and the best cost savings. As for the point at which to scale up, I usually set the target at 65%-70%, because by 85% the node is already reaching critical-to-unresponsive territory (a HorizontalPodAutoscaler sketch with this target follows this list).
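
To make the priority levels from practice 1 explicit to the scheduler, Kubernetes offers PriorityClass objects. A minimal sketch; the name, value, and description are placeholders to agree with the development team:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical         # placeholder name
value: 100000                     # higher value = scheduled first, evicted last
globalDefault: false              # do not apply to pods that name no priority class
description: "Workloads the team has agreed must keep running under node pressure."
```

Pods opt in by setting `priorityClassName: business-critical` in their spec; lower-priority pods are preempted first when a node runs out of room.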
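
For the namespace-level protection in practice 3, a ResourceQuota caps the totals across everything in a namespace and complements the per-container LimitRange shown earlier. The namespace and numbers are placeholders:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota              # placeholder name
  namespace: team-a               # placeholder namespace
spec:
  hard:
    requests.cpu: "4"             # total CPU all pods in the namespace may request
    requests.memory: "8Gi"
    limits.cpu: "8"               # total CPU limit across the namespace
    limits.memory: "16Gi"
```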
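
For the pods-per-node planning in practice 4, self-managed nodes can cap the pod count through the kubelet's `maxPods` setting; managed providers usually expose this as a node-pool option instead. A minimal sketch of the kubelet configuration file:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 110                      # ceiling on pods per node; 110 is the upstream default
```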
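
And for the 65%-70% threshold in practice 5, this is roughly what a HorizontalPodAutoscaler targeting average CPU utilization looks like. The Deployment name `web` and the replica bounds are placeholders:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                   # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # placeholder Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65  # scale out at ~65% average CPU, per the threshold above
```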

Let's look at a few commands that help.

  1. Let's get the node list. If you have many nodes and wish to work with a specific one, choose it from the list displayed by the command and export it as NODE:

     echo $(kubectl get node -o name | cut -d / -f 2)
     export NODE=<your selection>
    
  2. The next command displays what the node status looks like for capacity and allocatable CPU and memory. Capacity is the raw measure of the node's resources; allocatable is the portion of it that Kubernetes considers available for running pods.

    ```bash
    kubectl get node "$NODE" -o json \
      | jq '.status | {capacity, allocatable} | [ to_entries[] | .value |= {cpu, memory} ] | from_entries'
    ```

    ```
    {
      "capacity": {
        "cpu": "2",
        "memory": "2929240Ki"
      },
      "allocatable": {
        "cpu": "2",
        "memory": "2826840Ki"
      }
    }
    ```


    This command collects the requests of a given resource (here, CPU requests) from every pod on the node and sums them:

    ```bash
    kubectl get pods --all-namespaces -o json --field-selector \
      status.phase!=Terminated,status.phase!=Succeeded,status.phase!=Failed,spec.nodeName="$NODE" \
      | jq '[ .items[].spec.containers[].resources.requests.cpu // "0"
              | if endswith("m")
                then (rtrimstr("m") | tonumber / 1000)
                else (tonumber) end
            ] | add * 1000 | round | "\(.)m"'
    ```

    ```
    "260m"
    ```

This command demonstrates how much CPU is still available on the node, and thus how large a CPU request a pod could make (in theory) and still be scheduled to run there:

```bash
{ kubectl get node "$NODE" -o json; \
    kubectl get pods --all-namespaces -o json --field-selector \
      status.phase!=Terminated,status.phase!=Succeeded,status.phase!=Failed,spec.nodeName="$NODE"; } \
  | jq -s '( .[0].status.allocatable.cpu
             | if endswith("m")
               then (rtrimstr("m") | tonumber / 1000)
               else (tonumber) end
           ) as $allocatable
           | ( [ .[1].items[].spec.containers[].resources.requests.cpu // "0"
                 | if endswith("m")
                   then (rtrimstr("m") | tonumber / 1000)
                   else (tonumber) end
               ] | add
             ) as $allocated
           | ($allocatable - $allocated) * 1000 | round
           | "\(.)m is available"'
```

```
"1740m is available"
```

So let's do the calculation above by hand.

2 CPUs is 2000m.

  1. 2000m - 260m = 1740m

    Based on this calculation you can work out how much CPU is still available on any node.
