🛑 Kubernetes Graceful Shutdown: PreStop Hooks, Probes, and Termination Lifecycle Explained

Iresh Ekanayaka

When deploying applications in Kubernetes, it's not just about running a container and calling it a day. Kubernetes manages the entire pod lifecycle - from startup to health checks to shutdown - and that includes graceful termination.

Whether it's due to a rolling update, node scaling, or a manual kubectl delete, Kubernetes gives us the tools to shut down pods gracefully - without losing traffic or corrupting data. Hard failures such as an OOM kill are the exception: the process is stopped immediately, with no grace period.

In this post, we’ll break down:

  • What happens during a pod termination

  • How lifecycle hooks like preStop work

  • The role of probes in shutdown

  • How to avoid dropped traffic or forced shutdowns

📌 When Does Kubernetes Terminate Pods?

Pod termination can happen for several reasons:

  • 🚀 Rolling updates

  • ⚖️ Node scaling or maintenance

  • 💥 Crashes or Out-Of-Memory (OOM) kills (these end the container abruptly, with no graceful shutdown)

  • 🔧 Manual deletion (e.g., kubectl delete pod)

  • ❌ Failing livenessProbe (the kubelet restarts the container; the pod itself is not deleted)

If your app is still handling requests or doing important work, an instant kill could cause problems. Instead, Kubernetes provides a graceful shutdown mechanism.


🧠 Quick Glossary

Concept | Meaning
preStop hook | A lifecycle hook that runs just before the container is stopped
terminationGracePeriodSeconds | How long Kubernetes waits, from the start of termination, before forcefully killing the containers
readinessProbe | Determines whether the pod is ready to receive traffic
livenessProbe | Checks whether the container is healthy/alive; a failure restarts it

🛑 What Happens When You Delete a Pod?

Let's say you run:

kubectl delete pod my-app

Here’s what Kubernetes does step-by-step:

  1. Marks pod as "Terminating"

  2. Removes the pod from Service endpoints, so no new traffic is sent to it. This happens because the pod is terminating, not because a readiness probe failed, and it runs in parallel with the steps below.

  3. Runs the preStop hook (if defined)

  4. Sends SIGTERM to the container.

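  5. Waits for the container to exit, up to whatever remains of terminationGracePeriodSeconds (the countdown began in step 1 and includes preStop time)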

  6. If the container is still running after the grace period, sends SIGKILL to force kill.

  7. Deletes the pod

Because the grace period covers everything from preStop to exit, budget it as a sum. For example:

25 (terminationGracePeriodSeconds) = 5 (preStop) + 15 (app shutdown) + 5 (buffer)
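
As a rough illustration, the 25-second budget above might map onto a pod spec like the sketch below. The pod name my-app comes from the kubectl example; the image name and the 5-second sleep are assumptions for illustration, and the sleep only works if the image ships a sleep binary.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  # Total budget: 5s preStop + 15s app shutdown + 5s buffer
  terminationGracePeriodSeconds: 25
  containers:
    - name: my-app
      image: example.com/my-app:1.0   # hypothetical image
      lifecycle:
        preStop:
          exec:
            # Assumed 5s pause before SIGTERM is sent to the container
            command: ["sleep", "5"]
```

If the app routinely needs more than 15 seconds to finish in-flight work, raise terminationGracePeriodSeconds rather than cutting into the buffer.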

🔄 Kubernetes Pod Termination Flow (Step-by-Step)

🔹 Step 1: Termination is Triggered

Termination is initiated by:

  • kubectl delete pod

  • Rolling update (Deployment, StatefulSet)

  • Node scaling or eviction

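  • Failing livenessProbe (here the kubelet restarts the container in place; no delete request is sent and the pod is not removed)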

📍 The API server receives the delete request and sets deletionTimestamp on the pod.


🔹 Step 2: Pod is Marked as Terminating

The pod is not deleted immediately.

  • It's marked as Terminating

  • The controller (e.g., ReplicaSet) may spin up a replacement pod


🔹 Step 3: Kubelet Detects Termination

  • The Kubelet on the node watches the API server.

  • It sees the pod is terminating and starts graceful shutdown.


🔹 Step 4: preStop Hook Executes

If you’ve defined a preStop hook in your pod spec:

lifecycle:
  preStop:
    exec:
      command: ["/usr/bin/save-state.sh"]

Kubelet executes it using:

  • exec (run a command in the container)

  • httpGet (make HTTP call to internal endpoint)

  • tcpSocket (deprecated and not supported for lifecycle hooks; a hook that specifies it fails at runtime)

🕒 SIGTERM is sent only after the hook completes. Time spent in the hook counts against terminationGracePeriodSeconds.
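
For comparison with the exec example, an httpGet hook might look like the sketch below. The /shutdown path and port 8080 are assumptions for illustration; your app would need to expose such an endpoint itself.

```yaml
lifecycle:
  preStop:
    httpGet:
      path: /shutdown   # hypothetical endpoint exposed by your app
      port: 8080        # assumed container port
```

The kubelet issues the request against the container, so the endpoint does not need to be reachable from outside the pod.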


🔹 Step 5: SIGTERM Is Sent

After preStop finishes, Kubelet sends a SIGTERM signal to the container.

This gives your app a chance to shut down politely - like:

  • Closing DB connections

  • Draining message queues

  • Finishing current request

⚠️ If your app doesn’t handle SIGTERM, it may be killed before completing shutdown.
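
One common reason an app never sees SIGTERM is that it runs as a child of a shell wrapper, so the signal may stop at the shell (PID 1 in the container) instead of reaching the app. A minimal sketch of one way around that, assuming a hypothetical binary at /app/server and a hypothetical image name:

```yaml
containers:
  - name: my-app
    image: example.com/my-app:1.0   # hypothetical image
    # `exec` replaces the shell with the app process, so the app
    # itself becomes PID 1 and receives SIGTERM directly.
    command: ["/bin/sh", "-c", "exec /app/server"]
```

The app still has to install its own SIGTERM handler; this only makes sure the signal is delivered to it.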


🔹 Step 6: Termination Grace Period Countdown

The countdown is controlled by:

spec:
  terminationGracePeriodSeconds: 30

  • The clock starts when termination begins, not when SIGTERM is sent, so the total includes preStop time plus app shutdown time

  • Default is 30 seconds


🔹 Step 7: SIGKILL If Timeout Expires

If the container is still running after the grace period, Kubernetes sends:

SIGKILL

At this point, the container is forcefully stopped - even if it's still working.


🔹 Step 8: Pod Is Deleted

Once the container stops:

  • The Kubelet deletes the pod from the node

  • The API server removes the pod from the cluster state

📣 Wrapping Up

Kubernetes gives you the tools to shut down pods cleanly, but it's up to you to use them right.

By defining a preStop hook, setting a realistic terminationGracePeriodSeconds, and properly using probes, you can:

  • Avoid dropped connections

  • Prevent data corruption

  • Ensure smoother rolling updates


🚀 Understanding Kubernetes Pod Lifecycle with Restaurant Analogy 🍽️

From Startup to Shutdown - Explained Visually with Liveness, Readiness, PreStop & Termination Grace

🏁 1. Pod Starts = Restaurant Opening

🍽️ Restaurant Analogy:

  • Staff arrives.

  • Kitchen is being prepped.

  • The "Open" sign is still OFF.

  • Customers are NOT allowed in yet.

🔍 Readiness Probe returns ❌

A restaurant with a "Closed" sign, staff inside cooking/prepping.


⚙️ Kubernetes Explanation:

  • Pod is created.

  • Containers inside start.

  • Kubernetes starts checking the readinessProbe.

  • If readinessProbe fails → Pod is NOT added to the Service's endpoints, so it receives no traffic.
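
A readiness probe for this stage might be declared like the sketch below. The /ready path, port 8080, image name, and timing values are assumptions for illustration, not recommendations.

```yaml
containers:
  - name: my-app
    image: example.com/my-app:1.0   # hypothetical image
    readinessProbe:
      httpGet:
        path: /ready          # hypothetical readiness endpoint
        port: 8080            # assumed container port
      initialDelaySeconds: 5  # wait 5s after start before the first check
      periodSeconds: 10       # check every 10s
      failureThreshold: 3     # mark Not Ready after 3 consecutive failures
```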


✅ 2. Pod is Ready = Open to Customers

🍽️ Restaurant Analogy:

  • Kitchen is ready.

  • Staff says: "We’re good to go!"

  • "Open" sign is ON.

  • Google Maps starts showing your restaurant.

  • Customers (traffic) start coming in.

✅ Readiness Probe passes

A restaurant with customers entering, kitchen in action.

⚙️ Kubernetes Explanation:

  • readinessProbe starts returning success.

  • Pod is added to Service endpoints.

  • Kubernetes sends traffic to the pod.

  • The pod is now Ready.


❤️ 3. Staying Healthy = Passing Health Inspections

🍽️ Restaurant Analogy:

  • Health inspector comes in every 10 minutes.

  • Checks kitchen, staff, environment.

  • If staff fainted, kitchen on fire - 🚫 you fail.

The liveness probe runs at a fixed interval (periodSeconds)

Health inspector checking kitchen hygiene.

⚙️ Kubernetes Explanation:

  • livenessProbe runs periodically.

  • If the liveness probe fails:

    • Kubernetes kills and restarts the container.

    • Useful when your app hangs but doesn’t crash.
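
A liveness probe follows the same shape as a readiness probe. In the sketch below, the /healthz path, port 8080, and timing values are illustrative assumptions.

```yaml
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical health endpoint
    port: 8080            # assumed container port
  periodSeconds: 10       # the "inspector" visits every 10s
  timeoutSeconds: 2       # a check that takes longer than 2s counts as failed
  failureThreshold: 3     # restart the container after 3 consecutive failures
```

Keeping the liveness endpoint cheap and independent of downstream services helps avoid restarts caused by someone else's outage.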


🛑 4. Shutdown Begins = Landlord Gives Notice

🍽️ Restaurant Analogy:

Landlord (Kubernetes) says:
"You're shutting down in 60 seconds."

  • You lock the front door → 🛑 No more new customers.

  • Waiters finish serving ongoing orders.

  • Kitchen finishes cooking.

  • Staff exits gracefully.

This 60s is your terminationGracePeriodSeconds

Restaurant putting up “Closing soon” sign, waiters finishing orders.

⚙️ Kubernetes Explanation:

  • Kubernetes initiates pod shutdown (e.g., due to kubectl delete pod).

  • The pod is marked Terminating and removed from Service endpoints, whatever the readiness probe reports.

  • The container gets up to terminationGracePeriodSeconds (default: 30s) to exit.

  • Meanwhile:

    • Executes preStop hook.

    • New traffic stops arriving as the endpoint removal propagates.

    • Allows app to clean up (e.g., finish jobs, close DB).


🔒 5. PreStop Hook = Locking the Door

🍽️ Restaurant Analogy:

  • You run a command: "Lock the front door."

  • Sign flips to “Closed.”

  • Waiters: “No new customers allowed.”

preStop is a lifecycle hook that runs BEFORE SIGTERM

Staff removing restaurant from food delivery app before closing.

⚙️ Kubernetes Explanation:

  • preStop runs BEFORE SIGTERM.

  • Often used to:

    • Unregister from a service discovery system.

    • Drain ongoing traffic.

    • Notify other systems.


🔚 6. Graceful Exit = Wrap-up

🍽️ Restaurant Analogy:

  • No new customers.

  • Kitchen finishes pending dishes.

  • Staff exits.

  • Everyone goes home. No force needed.

All done within terminationGracePeriodSeconds ✅

Restaurant empty, lights off, sign = "Closed".

⚙️ Kubernetes Explanation:

  • Your app finishes cleanup before the grace period ends.

  • Container exits.

  • Pod gets removed cleanly.

  • No data loss. No corruption.


💀 7. Forced Shutdown = Bouncer Kicks You Out

🍽️ Restaurant Analogy:

  • You took too long to close.

  • Landlord sends security (SIGKILL).

  • Everyone kicked out.

  • Food wasted, customers angry.

❗ SIGKILL is sent when terminationGracePeriodSeconds is exceeded

Angry landlord dragging staff out, customers confused.

⚙️ Kubernetes Explanation:

  • If your app doesn’t terminate within the grace period:

    • Kubernetes sends SIGKILL.

    • Immediate stop.

    • The app gets no chance to clean up.

    • This may cause data loss (e.g., half-written files).
