Debugging disk pressure on a k8s node


A while back, I ran into disk pressure on a node. It was a bit like opening your fridge and finding it packed, but having no idea what was taking up all the room. Our system started complaining about low disk space. I needed to figure out what was hogging all that storage before things got out of hand.
Investigation: Tracking Down What Filled the Disk
I started with a simple method to see which directories were using the most space. The tool of choice was du paired with sort. Here’s the command I used:
sudo du -ahx --max-depth=1 / | sort -k1 -rh
This breaks down as follows:
sudo gives me the access needed to scan every directory.
du means disk usage. It checks files and folders to see how much space they take.
-a includes both files and directories.
-h shows sizes in human-readable format, like MB or GB.
-x keeps the scan on the filesystem it started on, so other mounts are skipped.
--max-depth=1 only shows the top level in the directory tree, making results easier to digest.
/ starts the scan from the root.
The pipe | takes the output of du and passes it to sort.
sort -k1 -rh sorts on the first column (the size), treating values like 2.1G and 300M as sizes (-h) and putting the biggest results at the top (-r).
My approach was straightforward: run the command at root, check which folder is biggest, then repeat inside that folder. This narrows down the possibilities fast.
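The repeat-inside-the-biggest-folder loop can also be scripted. The following is a minimal sketch, assuming GNU du and bash; largest_child and drill_down are hypothetical helper names I made up for illustration, not standard tools. It uses -k instead of -h so that a plain numeric sort is enough.

```shell
#!/usr/bin/env bash
# Sketch: automate "find the biggest child, then descend into it".
# largest_child and drill_down are hypothetical names, not standard tools.

largest_child() {
  # Print the largest immediate child (file or directory) of "$1".
  # -a: include files, -k: sizes in KiB, -x: stay on one filesystem.
  local dir="${1%/}"
  [ -n "$dir" ] || dir="/"
  du -akx --max-depth=1 "$dir" 2>/dev/null \
    | sort -rn -k1,1 \
    | awk -F'\t' -v self="$dir" '$2 != self { print $2; exit }'
}

drill_down() {
  # Follow the largest child from "$1" until reaching a file
  # or an empty directory, printing "size<TAB>path" at each step.
  local current="$1" next
  while [ -d "$current" ]; do
    next="$(largest_child "$current")"
    [ -n "$next" ] || break
    printf '%s\t%s\n' "$(du -shx "$next" 2>/dev/null | cut -f1)" "$next"
    current="$next"
  done
}
```

Running drill_down / as root prints one line per level, ending at the single largest file or leaf directory. It only ever follows the biggest entry, so it is a shortcut for the first pass rather than a replacement for reading the full du output.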
The Main Offender
After just a few rounds, I found the culprit. It was this directory:
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
If you use containers a lot, you might have seen this before. This is where containerd's overlayfs snapshotter keeps the unpacked layers of every image pulled to the node, along with the writable layer of each container. Over time, especially with lots of deployments or heavy usage, it can quietly eat away at disk space.
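To see what sits behind that directory, it helps to ask the runtime itself. Here is a read-only sketch, assuming crictl (from cri-tools) is installed on the node and the script runs as root; report_runtime_usage is a hypothetical name, and your node may need a different CLI or endpoint configuration.

```shell
#!/usr/bin/env bash
# Sketch: read-only look at what the container runtime holds on this node.
# Assumes crictl is installed and this runs as root on the node itself.
# report_runtime_usage is a hypothetical helper name.

SNAPSHOT_DIR="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"

report_runtime_usage() {
  if ! command -v crictl >/dev/null 2>&1; then
    echo "crictl not found; use your runtime's own CLI instead" >&2
    return 1
  fi
  echo "== Total size of the snapshot directory =="
  du -shx "$SNAPSHOT_DIR" 2>/dev/null || echo "cannot read $SNAPSHOT_DIR"
  echo "== Images known to the runtime =="
  crictl images
  echo "== Containers, including exited ones =="
  crictl ps -a
}
```

If the image list is full of images that no container uses any more, crictl rmi --prune removes them. That command deletes data, so I would run the read-only report first.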
What to Do Next
Once the problem is identified:
Check for unused images and containers.
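Check whether the kubelet's imageGCHighThresholdPercent and imageGCLowThresholdPercent settings need tuning. They control when garbage collection of unused container images starts and how far it goes. See the Kubernetes garbage collection documentation for details.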
Consider setting up monitoring or alerts if this happens often.
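For the monitoring point, even a tiny check run from cron can give early warning. This is a minimal sketch, assuming bash and a POSIX df; check_disk_usage and usage_percent are hypothetical helper names, and the default path is an assumption about where containerd keeps its data. The default threshold of 85 mirrors the kubelet's default imageGCHighThresholdPercent (the default low threshold is 80).

```shell
#!/usr/bin/env bash
# Sketch: warn when a filesystem's usage crosses a threshold.
# check_disk_usage and usage_percent are hypothetical helper names.
# 85 mirrors the kubelet default for imageGCHighThresholdPercent.

usage_percent() {
  # Print the used percentage (integer, no % sign) of the
  # filesystem that holds "$1", taken from POSIX-format df output.
  df -P "$1" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

check_disk_usage() {
  # Usage: check_disk_usage [path] [threshold_percent]
  # Returns 0 when below the threshold, 1 when at or above it,
  # and 2 when the usage could not be read.
  local path="${1:-/var/lib/containerd}" threshold="${2:-85}" used
  used="$(usage_percent "$path")"
  if [ -z "$used" ]; then
    echo "could not read usage for $path" >&2
    return 2
  fi
  if [ "$used" -ge "$threshold" ]; then
    echo "WARN: $path is at ${used}% (threshold ${threshold}%)"
    return 1
  fi
  echo "OK: $path is at ${used}% (threshold ${threshold}%)"
}
```

Calling check_disk_usage /var/lib/containerd 80 from a cron job or a node health script is one way to hear about the problem before the kubelet reports disk pressure. A real setup would more likely use your existing monitoring stack; this only shows the idea.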
Wrapping Up
Disk pressure feels frustrating, but a methodical approach using simple tools like du can make troubleshooting much less painful. Running this command, descending into each large directory, and staying logical pointed me straight to the biggest space hog.
When working with containers, keep an eye on overlayfs snapshots; they build up quicker than expected. Regular checks can prevent bigger headaches down the line.
Written by

Muskan Agrawal
Cloud and DevOps professional with a passion for automation, containers, and cloud-native practices, committed to sharing lessons from the trenches while always seeking new challenges. Combining hands-on expertise with an open mind, I write to demystify the complexities of DevOps and grow alongside the tech community.