A Practical Guide to GCP’s Metadata Server with a Real Use Case

The 3am problem

It was Friday evening, and I was on call.

One of our central software components had a nasty bug. At random times, especially in the middle of the night, it would just stop consuming Pub/Sub messages. There was no fix ready, and the only way to keep things running was to manually restart the Kubernetes deployment.

Restarting it wasn’t difficult. But being paged at 3AM repeatedly? Not fun.

Weekend peace, no pages please

I had two goals:

  1. Keep the service running.

  2. Avoid getting paged all weekend.

So I spun up a tiny GCP VM with a service account, threw a Python script at it in tmux, and forgot about it until Monday.

The script used Google Cloud’s metadata server to fetch credentials, monitored message consumption using Cloud Monitoring (PromQL), and restarted the deployment if messages piled up. It even sent a reassuring Slack message when it did.

The orchestration was simple: a while True loop with a few sleep()s in between (there's a sketch just after the list below).
Every few minutes, the script:

  • Queried the metric using PromQL

  • Checked if it crossed a defined threshold

  • Triggered a restart if needed

  • Sent a Slack notification

  • Then backed off for a while before checking again
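
Stitched together, the loop looked roughly like the sketch below. The helper names stand in for the building blocks shown later in this post, and the thresholds and names are illustrative, not the exact values I used:

import time

UNACKED_THRESHOLD = 1000   # illustrative; tune to your traffic
CHECK_INTERVAL = 300       # seconds between routine checks
RESTART_BACKOFF = 1800     # give the deployment time to recover after a restart

while True:
    # Metadata-server tokens expire after about an hour,
    # so fetch a fresh one on every cycle
    token = get_oauth_token()
    result = query_promql(token)  # the PromQL call shown later in this post

    if is_stuck(result, UNACKED_THRESHOLD):
        rollout_restart_deployment("my-namespace", "my-deployment", token, cluster_endpoint)
        notify_slack("🤖 I automatically restarted the deployment. Sleep well 🌜😴")
        time.sleep(RESTART_BACKOFF)
    else:
        time.sleep(CHECK_INTERVAL)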

Ugly? Yes.
Effective? 100%.

What Is the GCP Metadata Server?

The GCP metadata server is a behind-the-scenes service that lives at http://169.254.169.254 (also reachable as http://metadata.google.internal/) inside every GCE VM and other compute environments (Cloud Run, GKE, etc.). It lets your instance fetch metadata about itself, and, more importantly, it hands your instance access tokens for its service account.

This means:

  • You don’t need to hardcode secrets (like service account keys).

  • You don’t need to run gcloud auth login.

  • You can call any GCP API just by hitting the metadata server first.

Google provides libraries for many programming languages that wrap calls to the metadata server, making it easier to interact with. However, you can absolutely access it directly yourself using plain HTTP requests, no libraries required.

Here’s how it works in Python:

import requests

def get_oauth_token():
    # The metadata server only answers requests carrying this header,
    # which blocks accidental or SSRF-style access without it
    metadata_url = 'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token'
    headers = {'Metadata-Flavor': 'Google'}
    response = requests.get(metadata_url, headers=headers, timeout=5)
    response.raise_for_status()
    return response.json()['access_token']

So, from a simple HTTP request, you can retrieve an access token tied to the VM’s assigned service account. That token reflects whatever permissions the service account has.
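
The same endpoint also serves plain instance metadata. For example, instance/name and instance/zone are real paths in the v1 metadata API:

def get_instance_metadata(path):
    # e.g. path = 'instance/name' or 'instance/zone'
    url = f'http://metadata.google.internal/computeMetadata/v1/{path}'
    headers = {'Metadata-Flavor': 'Google'}
    return requests.get(url, headers=headers, timeout=5).text

print(get_instance_metadata('instance/name'))
print(get_instance_metadata('instance/zone'))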

And here’s the cool part: In GCP, almost everything is exposed as an API.

Once you have that token, you can:

  • Restart GKE deployments

  • Query Cloud Monitoring

  • Trigger Cloud Functions

  • Or do basically anything your service account is allowed to do

No extra SDKs or login flows, just curl and the right headers.

Monitoring with PromQL

Google Cloud Monitoring supports PromQL, so you can write powerful queries against built-in metrics.

In my case, I wanted to monitor how many Pub/Sub messages were building up unacknowledged - meaning the service was likely stuck.

Here’s the query I used:

promql_query = '''
min_over_time(pubsub_googleapis_com:subscription_num_unacked_messages_by_region{
  monitored_resource="pubsub_subscription",
  subscription_id="my-subscription",
  project_id=~".*prod.*"
}[15m])
'''

This checks the minimum number of unacked messages in the last 15 minutes. If it’s consistently high, that’s a strong signal the service isn’t consuming messages.

Making the API Call with Python

Here’s the Python snippet that queries this metric using the Monitoring API and a metadata-sourced access token:

import requests
import json

project_id = "my-gcp-prod-project"
# Cloud Monitoring exposes a Prometheus-compatible query endpoint per project
prometheus_endpoint = f"https://monitoring.googleapis.com/v1/projects/{project_id}/location/global/prometheus/api/v1/query"

payload = { "query": promql_query }  # promql_query as defined above

access_token = get_oauth_token()
headers = {
    'Authorization': f'Bearer {access_token}',
    'Content-Type': 'application/json'
}

response = requests.post(prometheus_endpoint, headers=headers, data=json.dumps(payload), timeout=10)

if response.status_code == 200:
    result = response.json()
    print(json.dumps(result, indent=2))
else:
    print(f"Error querying PromQL: {response.status_code} - {response.text}")

This snippet shows how simple it is to go from metadata token → PromQL metric in just a few lines of code.
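
The result comes back in the standard Prometheus HTTP API format, so deciding whether the service is stuck takes only a few more lines. A minimal sketch (the threshold is illustrative, and is_stuck is my own helper name):

def is_stuck(result, threshold):
    # An instant query returns matches under data.result; each entry's
    # 'value' field is a [timestamp, "value-as-a-string"] pair
    series = result.get("data", {}).get("result", [])
    if not series:
        return False  # no data points; you might prefer to treat this as an alert
    return float(series[0]["value"][1]) > threshold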

Restart + Notify

If the metric exceeded the threshold, the script:

  1. Triggered a rollout restart of the Kubernetes deployment via the cluster’s Kubernetes API.

  2. Sent a Slack message to the on-call channel: “🤖 I automatically restarted the deployment because it's not consuming messages. Sleep well on-call person 🌜😴”
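
The notification is the easy half: a single POST to a Slack incoming webhook. A minimal sketch (the webhook URL is a placeholder):

import requests

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def notify_slack(text):
    # Incoming webhooks accept a simple JSON payload with a 'text' field
    requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)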

Here’s how to restart a Kubernetes deployment by calling the Kubernetes API directly, authenticated with the metadata-sourced token:

from datetime import datetime, timezone
import requests

def rollout_restart_deployment(namespace, deployment_name, oauth_token, cluster_endpoint, ca_cert_path=None):
    # Patch the pod template metadata, triggering a rollout; this is the same
    # annotation kubectl sets when you run `kubectl rollout restart`
    patch_data = {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "kubectl.kubernetes.io/restartedAt": datetime.now(timezone.utc).isoformat()
                    }
                }
            }
        }
    }

    url = f"{cluster_endpoint}/apis/apps/v1/namespaces/{namespace}/deployments/{deployment_name}"
    headers = {
        'Authorization': f'Bearer {oauth_token}',
        'Content-Type': 'application/strategic-merge-patch+json',
    }

    # The GKE API server presents a certificate signed by the cluster CA, which the
    # default trust store won't know; pass that CA bundle to verify TLS
    # (see the snippet below for fetching it)
    response = requests.patch(url, headers=headers, json=patch_data, timeout=10,
                              verify=ca_cert_path or True)

    if response.status_code == 200:
        print(f"✅ Successfully triggered rollout restart for deployment '{deployment_name}' in namespace '{namespace}'")
    else:
        print(f"❌ Failed to restart deployment: {response.status_code} - {response.text}")

This works because GKE exposes the Kubernetes API server at a public or private endpoint. Once you have a valid token and the right permissions (container.deployments.update), you can patch the deployment just like kubectl does behind the scenes.
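
If you don’t have the cluster endpoint and CA certificate handy, the GKE API itself will hand them to you. Here’s a sketch (project, location, and cluster names are placeholders):

import base64
import requests

def get_cluster_connection(project_id, location, cluster_name, oauth_token):
    # clusters.get returns the API server endpoint plus the cluster CA certificate
    url = (f"https://container.googleapis.com/v1/projects/{project_id}"
           f"/locations/{location}/clusters/{cluster_name}")
    headers = {'Authorization': f'Bearer {oauth_token}'}
    cluster = requests.get(url, headers=headers, timeout=10).json()

    # Persist the CA bundle so requests can verify the API server's TLS certificate
    ca_cert_path = "/tmp/gke-ca.crt"
    with open(ca_cert_path, "wb") as f:
        f.write(base64.b64decode(cluster["masterAuth"]["clusterCaCertificate"]))

    return f"https://{cluster['endpoint']}", ca_cert_path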

Why This Matters

This story is a reminder that:

  • The GCP metadata server is a built-in resource that unlocks a lot of cloud-native power, especially when you're not using (or can’t use) Google's client libraries.

  • Everything (almost) in GCP is an API.

  • Not every solution needs to be elegant. Sometimes, duct tape beats downtime.

When you’re on call, a quick hack that lets you sleep is sometimes the most valuable code you’ll write.

Takeaways

  • Use the metadata server to get OAuth tokens dynamically.

  • You can call any GCP API with curl + a token.

  • Monitor metrics with PromQL and GCP Monitoring.

  • Kubernetes rollouts can be triggered via the REST API.

  • A tmux session, a single VM, and a while True loop: ugly but effective.
