This document explains how Kubernetes monitors the health of containerized applications using probes and container states. The reader will learn about probes and how they help keep a service available.

Container's Health

After resources are allocated to a container, the application starts executing. First comes the boot stage, in which the application configures itself, checks whether a database is online, and downloads and initializes libraries. As soon as the boot process completes, the application is ready to receive and process requests.

To determine the health of a container, one can configure Container Probes for the kubelet to run. The kubelet executes the configured check mechanism, and the result of the check determines the action it takes on the container. For example, if a container running a REST application stops responding to requests, its Liveness Probe fails and the kubelet terminates the container and starts a new one (subject to the Pod's restart policy).

Container's Probes and Check mechanisms

Kubernetes (K8s) is only concerned with how the application behaves at runtime, so it needs to track whether the application is booting, is ready to receive requests, or is not responding at all. To keep track of these conditions, K8s offers three probes, which run periodically and apply to different phases of the application's life (booting, ready for processing):

Liveness Probe
Startup Probe
Readiness Probe

These probes will help K8s to determine what is called a Container State.

A container can have one of 3 states:

Waiting, when a container is neither Running nor Terminated, for example while it is still pulling its image;
Running, when a container is executing without issues;
Terminated, when a container started execution and either ran to completion or failed for some reason.
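
The state the kubelet records for each container is visible in the Pod's status. The excerpt below is an illustrative sketch of how that section might look for a container that was restarted once. The container name, timestamps, exit code and counts are made-up values; only the field names come from the Pod API.

```yaml
# Illustrative excerpt of `kubectl get pod <pod-name> -o yaml` (all values are hypothetical)
status:
  containerStatuses:
    - name: demo
      ready: true
      restartCount: 1
      state:                 # current state: exactly one of waiting, running, terminated
        running:
          startedAt: "2023-12-08T10:15:00Z"
      lastState:             # state of the previous container instance, if there was one
        terminated:
          exitCode: 137
          reason: Error
          startedAt: "2023-12-08T10:00:00Z"
          finishedAt: "2023-12-08T10:14:58Z"
```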

Probe Mechanisms

A probe is a diagnostic performed periodically by the kubelet on a container. The probe is performed by executing code within a container or by making a network request.

There are four check mechanisms: exec, grpc, httpGet and tcpSocket.

exec

Executes a specific command inside a container. The diagnostic is considered successful if the command exits with a status code of 0.

apiVersion: v1
kind: Pod
metadata:
  name: example-exec
  namespace: default
spec:
  containers:
    - name: demo
      image: nginx
      livenessProbe:
        exec:        # probe mechanism
          command:
            - cat      # command to perform
            - /usr/share/nginx/html/index.html # 1st argument for cat command

Code Block 1 - perform cat command on index.html file for Liveness Probe
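
Because an exec probe passes whenever the command exits with status code 0, any shell test can serve as a health check. The fragment below is a minimal sketch of that idea, assuming the image ships a sh binary; the path /tmp/healthy is a hypothetical marker file that the application would have to maintain itself.

```yaml
livenessProbe:
  exec:
    command:
      - sh
      - -c
      # exits 0 (healthy) only if the file exists and is not empty
      - test -s /tmp/healthy
```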

grpc

Performs a remote procedure call using gRPC. The target should implement gRPC health checks. The diagnostic is considered successful if the status of the response is SERVING.[2]

livenessProbe:
  grpc:         # probe mechanism
    port: 8000  # port on which to perform the gRPC call

Code Block 2 - defining grpc call on container's port 8000 for Liveness Probe
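
A gRPC probe can also carry a service name, which is placed in the health check request so that one server can report different statuses for different checks. The following fragment sketches this; the name "liveness" is a hypothetical value that the application's health service would need to recognize.

```yaml
livenessProbe:
  grpc:
    port: 8000
    service: liveness   # hypothetical service name sent in the health check request
```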

httpGet

Performs an HTTP GET request against the Pod's IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.

livenessProbe:
  httpGet:         # probe mechanism
    path: /health  # endpoint where to perform the Get
    port: 8080     # port to use for the test
    httpHeaders:   # optional headers
      - name: Custom-Header # header's name
        value: ItsAlive     # header's value

Code Block 3 - defining an HTTP GET call on the container's port 8080 with an optional header for Liveness Probe
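
The port of an httpGet probe may also refer to a named container port, which keeps the probe in sync if the port number changes later. The fragment below is a sketch under that assumption; the port name "http" and the /health endpoint are illustrative choices.

```yaml
containers:
  - name: demo
    image: nginx
    ports:
      - name: http          # hypothetical port name
        containerPort: 8080
    livenessProbe:
      httpGet:
        path: /health
        port: http          # refers to the named containerPort above
        scheme: HTTP        # HTTP is the default; HTTPS is the other accepted value
```

When scheme is set to HTTPS, the kubelet sends the request without verifying the certificate.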

tcpSocket

Performs a TCP check against the Pod's IP address on a specified port. The diagnostic is considered successful if the port is open. If the remote system (the container) closes the connection immediately after it opens, this counts as healthy.

livenessProbe:
  tcpSocket:     # probe mechanism
    port: 8080   # port to use for the test

Code Block 4 - defining tcpSocket connection on container's port 8080 for Liveness Probe
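
A TCP check is a reasonable fit for services that do not speak HTTP or gRPC. The sketch below uses the same tcpSocket check for two different probes on a hypothetical cache container; the container name, image and timing values are assumptions chosen for illustration.

```yaml
containers:
  - name: cache             # hypothetical container
    image: redis
    ports:
      - containerPort: 6379
    readinessProbe:         # decides whether the Pod receives traffic
      tcpSocket:
        port: 6379
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:          # decides whether the container is restarted
      tcpSocket:
        port: 6379
      initialDelaySeconds: 15
      periodSeconds: 20
```

An open port only shows that the process accepts connections, so this check cannot tell whether the service answers requests correctly.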

Container's Probes

Probes are used by the kubelet to assess the health of a container and to decide on the appropriate action. For example, once the application has completed its boot process and is ready to do its work, a Liveness Probe runs periodically. If this probe fails failureThreshold times in a row, the kubelet kills the container and starts a new one.

All probes share five parameters that are important to configure:

initialDelaySeconds: Time, in seconds, to wait after the container starts before the first probe runs (default: 0)
periodSeconds: How often, in seconds, the probe runs (default: 10)
timeoutSeconds: Time, in seconds, to wait for a reply before the probe counts as failed (default: 1)
successThreshold: Number of consecutive successful probe executions, after a failure, needed to mark the container healthy (default: 1)
failureThreshold: Number of consecutive failed probe executions needed to mark the container unhealthy (default: 3)
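
To see how these parameters combine, the following sketch annotates a Liveness Probe with the timing each value implies. The /health endpoint and the chosen numbers are assumptions for illustration, and the resulting figure is a rough estimate rather than a guarantee.

```yaml
livenessProbe:
  httpGet:
    path: /health          # hypothetical endpoint
    port: 8080
  initialDelaySeconds: 10  # first probe about 10 s after the container starts
  periodSeconds: 5         # then one probe every 5 s
  timeoutSeconds: 2        # a reply slower than 2 s counts as a failure
  failureThreshold: 3      # 3 consecutive failures => container is restarted
  successThreshold: 1      # must be 1 for liveness and startup probes
# Rough time from the application hanging to a restart being triggered:
#   periodSeconds * failureThreshold = 5 * 3 = 15 s (plus up to timeoutSeconds per attempt)
```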

livenessProbe

Indicates whether the container is running.
When an app has been running for a long time, it may stop working normally, and the only way to recover is to restart it. To determine whether the app is running properly, the kubelet periodically executes a Liveness Probe.

    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          livenessProbe:
            exec:
              command:
                - cat
                - /tmp/healthy
            initialDelaySeconds: 5
            periodSeconds: 5

Code Block 5 - Liveness Probe with exec mechanism and time parameters

Code Block 5 shows a Liveness Probe that runs the cat command on the file /tmp/healthy. The probe runs for the first time 5 seconds after the container starts and then every 5 seconds.
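
To watch a Liveness Probe fail, the probed file can be deleted on purpose. The manifest below is a sketch of such an experiment, assuming the busybox image; the Pod name is hypothetical. The container creates /tmp/healthy, removes it after 30 seconds, and from then on the cat command fails.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo        # hypothetical name
spec:
  containers:
    - name: demo
      image: busybox
      args:
        - /bin/sh
        - -c
        # healthy for 30 s, then the file is removed and the probe starts failing
        - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
      livenessProbe:
        exec:
          command:
            - cat
            - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5
```

With the default failureThreshold of 3, the kubelet should restart the container after three consecutive failed probes. The events listed by kubectl describe pod liveness-demo would show the failures and the restart.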

startupProbe

Indicates whether the application within the container has started.
A Startup Probe is used to determine whether an application has finished its startup phase. This is especially useful for legacy applications whose startup is lengthy and where it is hard to tell when this phase is over. When a Startup Probe is configured, the Liveness and Readiness Probes do not run until it succeeds.

    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          startupProbe:
            initialDelaySeconds: 1
            periodSeconds: 2
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 1
            exec:
              command:
                - cat
                - /etc/nginx/nginx.conf

Code Block 6 - Example of a Startup Probe
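
For a slow-starting application, the time allowed for startup is roughly failureThreshold multiplied by periodSeconds. The fragment below is a sketch of that calculation; the container, image and /health endpoint are hypothetical, and the numbers are assumptions to be tuned to the application's real startup time.

```yaml
containers:
  - name: legacy-app           # hypothetical container
    image: example/legacy-app  # hypothetical image
    ports:
      - name: http
        containerPort: 8080
    startupProbe:
      httpGet:
        path: /health          # hypothetical endpoint
        port: http
      periodSeconds: 10
      failureThreshold: 30     # up to 30 * 10 = 300 s to finish starting
    livenessProbe:             # only begins once the startup probe has succeeded
      httpGet:
        path: /health
        port: http
      periodSeconds: 10
      failureThreshold: 1      # after startup, react quickly to a hang
```

Splitting the checks this way lets the Liveness Probe stay strict without killing the container while it is still booting.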

readinessProbe

Indicates whether the container is ready to respond to requests.
Sometimes an application temporarily cannot serve traffic because it is busy with a heavy task, such as loading a large file or running a long SQL transaction. This is not a reason to kill the container and start a new one. Instead, a Readiness Probe checks whether the Pod is able to receive traffic; while it fails, the Pod is removed from the endpoints of the Services that match it. The probe runs throughout the container’s lifetime.

        readinessProbe:
          initialDelaySeconds: 1
          periodSeconds: 2
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 1
          httpGet:
            scheme: HTTP
            path: /
            port: 80
            httpHeaders:          # sets the Host header instead of using the host field
            - name: Host
              value: fake-app.com

Code Block 7 - Example of a Readiness Probe with an httpGet mechanism
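
Readiness and Liveness Probes are often configured together on the same container, since they trigger different actions. The following fragment is a sketch of that pairing; the image and the /ready and /health endpoints are hypothetical, and the thresholds are illustrative values.

```yaml
containers:
  - name: demo
    image: example/app         # hypothetical image
    ports:
      - containerPort: 80
    readinessProbe:            # failing: Pod stops receiving traffic, container keeps running
      httpGet:
        path: /ready           # hypothetical endpoint
        port: 80
      periodSeconds: 5
      failureThreshold: 3
    livenessProbe:             # failing: container is restarted
      httpGet:
        path: /health          # hypothetical endpoint
        port: 80
      periodSeconds: 10
      failureThreshold: 6      # more tolerant than readiness
```

Giving the Liveness Probe the larger failure budget means a busy application is taken out of rotation well before it is restarted.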

References

[1] Kubernetes.io, “Pod Lifecycle,” Kubernetes.io, 2019. Available: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/. [Accessed: Dec. 08, 2023]

[2] Kubernetes.io, “Health checking gRPC servers on Kubernetes,” Kubernetes.io, Oct. 01, 2018. Available: https://kubernetes.io/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/. [Accessed: Dec. 09, 2023]
