A Practical Guide to GCP’s Metadata Server with a Real Use Case


The 3am problem
It was Friday evening, and I was on-call.
One of our central software components had a nasty bug. At random times, especially in the middle of the night, it would just stop consuming Pub/Sub messages. There was no fix ready, and the only way to keep things running was to manually restart the Kubernetes deployment.
Restarting it wasn’t difficult. But being paged at 3AM repeatedly? Not fun.
Weekend peace, no pages please
I had two goals:
Keep the service running.
Avoid getting paged all weekend.
So I spun up a tiny GCP VM with a service account, threw a Python script at it in a `tmux` session, and forgot about it until Monday.
The script used Google Cloud’s metadata server to fetch credentials, monitored message consumption using Cloud Monitoring (PromQL), and restarted the deployment if messages piled up. It even sent a reassuring Slack message when it did.
The orchestration was simple: a `while True` loop with a few `sleep()`s in between.
Every few minutes, the script:
Queried the metric using PromQL
Checked if it crossed a defined threshold
Triggered a restart if needed
Sent a Slack notification
Then backed off for a while before checking again
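That loop can be sketched roughly like this. The helper functions are stand-ins for the real calls shown later in this post, and the threshold and sleep values are invented for illustration:

```python
import time

THRESHOLD = 1000             # unacked messages considered "stuck" (assumed value)
CHECK_INTERVAL = 300         # seconds between normal checks
BACKOFF_AFTER_RESTART = 900  # give the deployment time to recover

def check_once(query_metric, restart, notify, threshold=THRESHOLD):
    """One watchdog iteration; returns True if a restart was triggered."""
    if query_metric() > threshold:
        restart()
        notify("🤖 I automatically restarted the deployment because "
               "it's not consuming messages. Sleep well on-call person 🌜😴")
        return True
    return False

def watchdog_loop(query_metric, restart, notify):
    while True:
        restarted = check_once(query_metric, restart, notify)
        # Back off longer after a restart so the recovery isn't misread as stuck.
        time.sleep(BACKOFF_AFTER_RESTART if restarted else CHECK_INTERVAL)
```

Passing the query/restart/notify functions in as arguments keeps the loop trivial to test without touching any real infrastructure.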
Ugly? Yes.
Effective? 100%.
What Is the GCP Metadata Server?
The GCP metadata server is a behind-the-scenes service that lives at `http://169.254.169.254` (also reachable as `http://metadata.google.internal`) inside every GCE VM and other cloud resources (Cloud Run, GKE, etc.). It lets your instance fetch metadata about itself and, more importantly, it gives your instance access tokens for its service account.
This means:
You don’t need to hardcode secrets (like service account keys).
You don’t need to run `gcloud auth login`.
You can call any GCP API just by hitting the metadata server first.
Google provides libraries for many programming languages that wrap calls to the metadata server, making it easier to interact with. However, you can absolutely access it directly yourself using plain HTTP requests, no libraries required.
Here’s how it works in Python:
```python
import requests

def get_oauth_token():
    metadata_url = ('http://169.254.169.254/computeMetadata/v1/'
                    'instance/service-accounts/default/token')
    headers = {'Metadata-Flavor': 'Google'}
    response = requests.get(metadata_url, headers=headers)
    response.raise_for_status()
    return response.json()['access_token']
```
So, from a simple HTTP request, you can retrieve an access token tied to the VM’s assigned service account. That token reflects whatever permissions the service account has.
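These tokens expire: the metadata response also carries an `expires_in` field (in seconds, typically around an hour). A small cache avoids hitting the metadata server on every call. This is a sketch, with the fetch function injectable so it can be exercised without a real VM:

```python
import time
import requests

METADATA_TOKEN_URL = ('http://metadata.google.internal/computeMetadata/v1/'
                      'instance/service-accounts/default/token')

_token_cache = {'token': None, 'expires_at': 0.0}

def _fetch_token_from_metadata():
    resp = requests.get(METADATA_TOKEN_URL, headers={'Metadata-Flavor': 'Google'})
    resp.raise_for_status()
    return resp.json()  # {'access_token': ..., 'expires_in': seconds, ...}

def get_cached_token(fetch=None, now=time.time):
    """Return a cached token, refreshing it a minute before expiry."""
    if now() >= _token_cache['expires_at']:
        payload = (fetch or _fetch_token_from_metadata)()
        _token_cache['token'] = payload['access_token']
        _token_cache['expires_at'] = now() + payload['expires_in'] - 60
    return _token_cache['token']
```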
And here’s the cool part: In GCP, almost everything is exposed as an API.
Once you have that token, you can:
Restart GKE deployments
Query Cloud Monitoring
Trigger Cloud Functions
Or do basically anything your service account is allowed to do
No extra SDKs or login flows, just `curl` and the right headers.
Monitoring with PromQL
Google Cloud Monitoring supports PromQL, so you can write powerful queries against built-in metrics.
In my case, I wanted to monitor how many Pub/Sub messages were building up unacknowledged, meaning the service was likely stuck.
Here’s the metric I used:
```python
promql_query = '''
min_over_time(pubsub_googleapis_com:subscription_num_unacked_messages_by_region{
  monitored_resource="pubsub_subscription",
  subscription_id="my-subscription",
  project_id=~".*prod.*"
}[15m])
'''
```
This checks the minimum number of unacked messages in the last 15 minutes. If it’s consistently high, that’s a strong signal the service isn’t consuming messages.
Making the API Call with Python
Here’s the Python snippet that queries this metric using the Monitoring API and a metadata-sourced access token:
```python
import requests
import json

project_id = "my-gcp-prod-project"
prometheus_endpoint = (
    f"https://monitoring.googleapis.com/v1/projects/{project_id}"
    "/location/global/prometheus/api/v1/query"
)
payload = {"query": promql_query}

access_token = get_oauth_token()
headers = {
    'Authorization': f'Bearer {access_token}',
    'Content-Type': 'application/json',
}

response = requests.post(prometheus_endpoint, headers=headers, data=json.dumps(payload))
if response.status_code == 200:
    result = response.json()
    print(json.dumps(result, indent=2))
else:
    print(f"Error querying PromQL: {response.status_code} - {response.text}")
```
This snippet shows how simple it is to go from metadata token → PromQL metric in just a few lines of code.
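The response follows the standard Prometheus HTTP API format: each sample arrives as a `[timestamp, "value"]` pair with the value encoded as a string. A small helper (a sketch, not part of the original script) turns that into the threshold check:

```python
def extract_scalar(prom_response):
    """Pull the first sample value out of a Prometheus instant-query response.

    Samples arrive as [timestamp, "value-as-string"] pairs; returns None
    when the result set is empty.
    """
    results = prom_response.get('data', {}).get('result', [])
    if not results:
        return None
    _, value = results[0]['value']
    return float(value)

def is_stuck(prom_response, threshold):
    """True when the metric exists and sits at or above the threshold."""
    value = extract_scalar(prom_response)
    return value is not None and value >= threshold
```

Treating an empty result as "not stuck" is a deliberate choice here; you could just as well treat a missing metric as an alert condition.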
Restart + Notify
If the metric exceeded the threshold, the script:
Triggered a rollout restart of the Kubernetes deployment using the GKE API.
Sent a Slack message to the on-call channel. “🤖 I automatically restarted the deployment because it's not consuming messages. Sleep well on-call person 🌜😴”
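The Slack side can be as simple as posting to an incoming webhook, which accepts a JSON body with a `text` field. A sketch; the webhook URL is a placeholder you would create in your own Slack workspace:

```python
import requests

# Placeholder: create a real incoming-webhook URL in your Slack workspace.
SLACK_WEBHOOK_URL = 'https://hooks.slack.com/services/T000/B000/XXXX'

def build_slack_payload(text):
    """Incoming webhooks accept a simple JSON body with a 'text' field."""
    return {'text': text}

def notify_slack(text, webhook_url=SLACK_WEBHOOK_URL):
    resp = requests.post(webhook_url, json=build_slack_payload(text))
    return resp.status_code == 200
```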
Here’s how to restart a Kubernetes deployment directly using GCP APIs:
```python
from datetime import datetime, timezone
import requests

def rollout_restart_deployment(namespace, deployment_name, oauth_token, cluster_endpoint):
    # Patch the pod template metadata, triggering a rollout
    patch_data = {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "kubectl.kubernetes.io/restartedAt": datetime.now(timezone.utc).isoformat()
                    }
                }
            }
        }
    }
    url = f"{cluster_endpoint}/apis/apps/v1/namespaces/{namespace}/deployments/{deployment_name}"
    headers = {
        'Authorization': f'Bearer {oauth_token}',
        'Content-Type': 'application/strategic-merge-patch+json',
    }
    response = requests.patch(url, headers=headers, json=patch_data)
    if response.status_code == 200:
        print(f"✅ Successfully triggered rollout restart for deployment "
              f"'{deployment_name}' in namespace '{namespace}'")
    else:
        print(f"❌ Failed to restart deployment: {response.status_code} - {response.text}")
```
This works because GKE exposes the Kubernetes API server at a public or private endpoint. Once you have a valid token and the right permissions (`container.deployments.update`), you can patch the deployment just like `kubectl` does behind the scenes.
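If you don't have the cluster endpoint handy, the Container API can resolve it for you. A sketch, with project, location, and cluster names as placeholders; note that in practice you'd also want the cluster's CA certificate so TLS verification against that endpoint succeeds:

```python
import requests

def cluster_resource_url(project_id, location, cluster_name):
    """REST resource name for a GKE cluster in the Container API."""
    return (f'https://container.googleapis.com/v1/projects/{project_id}'
            f'/locations/{location}/clusters/{cluster_name}')

def get_cluster_endpoint(project_id, location, cluster_name, oauth_token):
    """Resolve the cluster's API-server address from its Container API record."""
    resp = requests.get(cluster_resource_url(project_id, location, cluster_name),
                        headers={'Authorization': f'Bearer {oauth_token}'})
    resp.raise_for_status()
    # 'endpoint' is the control-plane address; real code should also pull
    # masterAuth.clusterCaCertificate and verify TLS against it.
    return f"https://{resp.json()['endpoint']}"
```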
Why This Matters
This story is a reminder that:
The GCP metadata server is a built-in resource that unlocks a lot of cloud-native power, especially when you're not using (or can’t use) Google's client libraries.
Everything (almost) in GCP is an API.
Not every solution needs to be elegant. Sometimes, duct tape beats downtime.
When you’re on call, a quick hack that lets you sleep is sometimes the most valuable code you’ll write.
Takeaways
Use the metadata server to get OAuth tokens dynamically.
You can call any GCP API with `curl` + a token.
Monitor metrics with PromQL and GCP Monitoring.
Kubernetes rollouts can be triggered via the REST API.
A `tmux` session, a single VM, and a `while True` loop: ugly but effective.
Written by Thibaut Tauveron
👋 Hi, I’m a cloud engineer and cybersecurity enthusiast based in Zürich. I’ve worn many hats over the years—developer, DevOps consultant, SRE, cloud architect—and what ties it all together is a passion for building secure, scalable systems that just work. I write about cloud infrastructure, DevSecOps, and anything that helps teams move faster without breaking things. I believe in automation, simplicity, and sharing knowledge—whether through blog posts, open source, or mentoring.