Conducting My First Cloud Symphony


I studied Music in high school. I learned about harmony, rhythm, and how individual instruments come together to create a single piece. I never thought those concepts would appear in my tech career, but as I began learning Google Kubernetes Engine (GKE), I realised they were surprisingly relevant.
This post is a detailed log of my first time deploying a real application on GKE. I'm starting from scratch, explaining the core concepts, the commands I used, and, most importantly, the errors I hit and how I fixed them. If you're new to this, I hope my journey helps clarify yours a bit.
Before We Start: The Core Concepts
Before we even touch the cloud, we need to get a few key terms straight. I had to wrap my head around these before anything else made sense.
1. What is a Container? (And Docker?)
A container is a standard package that holds everything your application needs to run: the code, its libraries, and its settings. The best analogy is a musician's custom-built flight case. It has the instrument, the sheet music, the stand—everything. With this case, the musician can show up to any venue in the world, open it up, and perform perfectly without needing anything from the venue itself.
Docker is the tool you use to build these "flight cases." It’s a platform that lets you package your application into a portable container image.
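To make the "flight case" idea concrete, here is a minimal Dockerfile sketch for a hypothetical Python web app (app.py and requirements.txt are placeholder names for this illustration; the sample app we deploy later comes prebuilt, so you won't need this step yourself):

# Start from a small Python base image
FROM python:3.12-slim
# Copy the app and its dependency list into the image
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# The command the container runs when it starts
CMD ["python", "app.py"]

Running docker build -t my-app:1.0 . in the same folder would pack everything into one portable image: the digital flight case.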
2. What is Kubernetes? (often called K8s)
If one container is one musician, what happens when your application gets big and you need a whole orchestra of them? This is where Kubernetes comes in. Kubernetes is the conductor. It's an open-source system that manages all of your containers (the orchestra) for you. Its job is to:
Schedule: Decide which musician plays on which part of the stage.
Scale: If a song needs more violins, it brings on more violinists (adds more containers).
Self-heal: If a trumpeter suddenly gets sick (a container crashes), it immediately brings in a replacement so the show goes on.
3. What is Google Kubernetes Engine (GKE)?
While Kubernetes is the conductor, it still needs a place to perform. GKE is a world-class concert hall managed by Google. You could build your own concert hall (run Kubernetes on your own servers), but that's a massive amount of work. GKE provides a ready-to-go, reliable environment. Google's team handles the difficult parts, like the building's electricity, security, and maintenance, so you can focus purely on being the conductor and making sure your application (the music) sounds great.
With that, let's get started.
Step 1: Setting Up the Concert Hall (The GKE Cluster)
The first step was to create a cluster. A cluster is the foundation of GKE—it's a set of computer resources (called "nodes") that will run our containers. This is the digital venue where our performance will take place.
I chose to create my cluster in Autopilot mode. In GKE, you have two choices:
Standard Mode: You manually choose the size and number of machines (nodes) for your cluster. You have more control, but you're also responsible for managing them.
Autopilot Mode: You just tell GKE to run your applications, and it figures out how much computing power is needed. It automatically adds or removes nodes as required.
For a beginner, Autopilot is perfect. You don't have to worry about managing the underlying servers, and you only pay for the resources your apps consume. After a few minutes of setup in the Google Cloud Console, my cluster was live.
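I clicked through the Console, but the same thing can be done from the command line. Assuming the cluster name and region I use later in this post, the equivalent command looks like this (Autopilot clusters are regional, hence the --region flag):

gcloud container clusters create-auto musical-cluster --region us-central1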
Step 2: Getting the First Musician on Stage (Deploying an App)
Now that the venue was ready, it was time to deploy my first application. All interactions with a Kubernetes cluster happen through a command-line tool called kubectl. Think of it as the conductor's baton; it’s the tool we use to give instructions to our orchestra.
I started with a command to create a Deployment.
kubectl create deployment hello-server --image=us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
A Deployment is a Kubernetes object that acts as a manager for your app. You tell it, "I want one copy of my 'hello-server' app running at all times." The Deployment's job is to make that happen. If your app crashes, the Deployment will automatically start a new one. It’s like hiring a dedicated manager for your string section, whose only job is to ensure the right number of musicians are always on stage and ready to play.
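If you're curious what that one-liner builds behind the scenes, you can ask kubectl to print the manifest without creating anything by appending --dry-run=client -o yaml to the command. Trimmed of defaults, the result looks roughly like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-server
spec:
  replicas: 1              # desired number of copies; we scale this later
  selector:
    matchLabels:
      app: hello-server
  template:
    metadata:
      labels:
        app: hello-server
    spec:
      containers:
      - name: hello-app    # derived from the image name
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0

Note that the container is named hello-app, after the image; that name matters again when we update the image later.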
I hit enter on the create command and immediately got an error.
error: failed to create deployment... connect: connection refused
This was my first real-world debugging moment. The error meant my kubectl tool wasn't authenticated to talk to my new GKE cluster. The fix was to run a gcloud command to fetch the credentials and configure kubectl for me. This command essentially connected my conductor's podium to the concert hall's sound system, so my instructions could finally be heard.
gcloud container clusters get-credentials musical-cluster --region us-central1
I ran the deployment command again, and this time it worked. My app was running.
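To double-check, listing the Pods shows one hello-server Pod with a STATUS of Running (the random suffix in the Pod's name will differ on your cluster):

kubectl get pods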
However, it was only running inside the cluster. No one from the outside world could access it. To make it publicly available, I had to create a Service.
kubectl expose deployment hello-server --type=LoadBalancer --port 80 --target-port 8080
A Service provides a stable network endpoint (like an IP address) for your application. By setting type=LoadBalancer, I told GKE to provision a real, external cloud load balancer from Google's network, assign it a public IP, and automatically send any traffic to my running app. In our analogy, this is like setting up the main entrance and hiring ushers. The ushers (the Load Balancer) guide the audience to the musician who is ready to play.
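Provisioning the load balancer takes a minute or two. You can watch the Service until the EXTERNAL-IP column changes from <pending> to a real address (press Ctrl+C to stop watching):

kubectl get service hello-server --watch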
After a minute, the external IP was ready. I pasted it into my browser, and there it was.
Step 3: Adding More Musicians (Scaling the Application)
My single app was running fine, but a real application needs to handle lots of users. This means I needed to scale out by adding more copies, or replicas.
This is where the power of the Deployment object shines. With a single wave of the conductor's baton, I told the orchestra manager to expand the section.
kubectl scale deployment hello-server --replicas=3
This command updated the Deployment object, changing its desired state from 1 replica to 3. Kubernetes immediately saw the difference and started two new, identical copies of my app to match the new state.
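You can watch this happen: the READY column of the command below climbs from 1/3 to 3/3 as the new Pods start.

kubectl get deployment hello-server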
The Service I created in the previous step automatically detected these new copies and started distributing traffic between all three of them. I could prove this by going back to the browser and refreshing the page. The Hostname value at the bottom changed every few refreshes, showing my request was being handled by a different replica each time. It was a simple but powerful demonstration of scaling and load balancing.
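Refreshing a browser works, but a small shell loop makes the round-robin easier to see. Replace <EXTERNAL-IP> with the address from your Service; I'm relying here on the sample app printing a Hostname line in its response:

for i in 1 2 3 4 5; do curl -s http://<EXTERNAL-IP> | grep Hostname; done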
Step 4: The Update That Almost Failed
The final test was to update the application to a new version, 2.0.0, without any downtime. Kubernetes is designed to do this with a rolling update strategy. It replaces old app containers with new ones gradually, one by one, ensuring the application is always available. The goal was to give all the musicians a new sheet of music without interrupting the concert for the audience.
I ran the command to tell my Deployment to use the new image:
kubectl set image deployment/hello-server hello-app=us-docker.pkg.dev/google-samples/containers/gke/hello-app:2.0
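You can follow a rollout's progress from the terminal with the command below; it blocks and prints updates until the rollout finishes (or, as in my case, sits there because it never does):

kubectl rollout status deployment/hello-server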
I watched the status and saw that GKE was trying to start a new container with the 2.0 image, but it got stuck in a Pending state. This meant the new musician, holding the new sheet music, was waiting backstage, but there was no empty chair for them on the stage. The good news was that my old 1.0 replicas were still running, so my website was still live. But the update was failing.
To find out why, I used the most important command for Kubernetes troubleshooting: kubectl describe pod. This gives you a detailed event log for a specific pod.
kubectl describe pod <name-of-the-pending-pod>
The answer was in the Events section at the bottom: the new pod couldn't be scheduled because adding the node it needed would push my project past a resource quota. The analogy clicked instantly. My stage crew (Autopilot) went to the lumberyard (Google's infrastructure) to get wood for a new chair, but the yard owner said, 'Sorry, your account has hit its safety limit for today. No more materials for you.'
This was a huge learning moment. All Google Cloud projects have quotas, which are safety limits on resource usage. A rolling update temporarily requires more resources than normal because, for a moment, both old and new app replicas need to exist. My project's quota was too low to allow for this temporary increase.
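You can inspect these limits yourself. The output of the command below includes a quotas section listing each metric (CPUs, IP addresses, and so on) alongside its limit and current usage for the region:

gcloud compute regions describe us-central1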
Here’s the workaround I followed:
First, I undid the failed update to get back to a stable state:
kubectl rollout undo deployment/hello-server
Next, I made some room by scaling the old application down from 3 replicas to 2:
kubectl scale deployment hello-server --replicas=2
With the resources freed up, I ran the set image update command again. This time it worked, replacing the two old replicas with two new ones.
Finally, I scaled the now-updated application back up to 3 replicas:
kubectl scale deployment hello-server --replicas=3
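To confirm the Deployment was really on the new image, without even opening a browser, you can read the image field straight out of its spec:

kubectl get deployment hello-server -o jsonpath='{.spec.template.spec.containers[0].image}'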
I refreshed my browser and saw "Version: 2.0.0". The update was complete, and it was a valuable lesson in how cloud resource limits affect application management.
Final Thoughts
My first dive into GKE was a success. The most important takeaway for me was that the errors were the best part of the experience. Hitting the authentication and quota issues forced me to learn how to debug, which is a far more useful skill than just following a tutorial that works perfectly. I was also genuinely impressed with how GKE automates complex tasks like scaling and load balancing with just a few commands.
The concert is just getting started.