Week 7: Debugging A Blue-Green Deployment Pipeline + Monitoring Environments
In this post, I'll be sharing a recap of my notes from Days 41 to 45 of learning in the Udacity Cloud DevOps Engineer Nanodegree classroom (and it was a lot 😢).
These are compiled from my daily IG posts, as part of my #3MonthsOfCDE series and only slightly edited for platform suitability.
Day 41 - Working on an Entire Blue-Green Deployment Pipeline
I spent all of yesterday working on the exercises in Lesson 4 and once again, I realised just how hard it is to learn things as an absolute beginner.
First, I tested the “Build Ansible inventory file” step automation (from the previous article) with my personal account, and it ran as expected. Afterwards, I did some research on how to run the workflow with the Udacity federated user. It turns out I just had to create an IAM user, like the one I used with my personal account.
I could then use that user's access key credentials to set up environment variables in CircleCI (instead of the Udacity temporary credentials), and use those variables in the jobs that needed aws-cli.
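A quick way to catch a missing credential before a job fails halfway is to check the environment first. The snippet below is only a minimal sketch, not part of the course material: it assumes the variables use the standard names that aws-cli reads (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION), and the helper name missing_aws_vars is made up for illustration.

```python
import os

# Standard variable names that aws-cli reads from the environment.
# Assumption: the pipeline sets the region this way too, rather than in a config file.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(env=None):
    """Return the names of required AWS variables that are unset or empty.

    `env` defaults to the real process environment; passing a plain dict
    makes the check easy to try out locally.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]
```

Calling missing_aws_vars() at the start of a job and failing fast on a non-empty result gives a clearer message than an aws-cli authentication error several steps later. Note that temporary credentials (like the federated ones) would also need AWS_SESSION_TOKEN, which is one reason long-lived IAM user keys are simpler to store in CircleCI.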
Afterwards, I continued working on the exercises. How did that go?
Let's just say my CircleCI account with the endless failed pipelines speaks for me because I ran into a lot of different errors and I was quite frustrated. Some of the errors took a long time to spot and fix while others were silly and spelling-related (thanks to my frustration).
By the end of the day, I had worked on three exercises, and only two of them finished successfully.
Day 42 - Troubleshooting Blue-Green Deployment Pipelines
This day turned out much better than Day 41 because:
a) I figured out the issues with the third exercise,
b) I was able to complete the rest of the exercises (with lots of failed attempts and pipelines, of course), and
c) I now have a fully functional running pipeline, plus some extra steps I added to some cleanup jobs for better functionality! 🎉
- It did not come without frustration but I am glad I fully went through Lesson 4 and all that it had to teach me. It almost always makes carrying out the project a less bumpy ride.
Spoiler alert: I faced entirely different issues with the main course project.
- I have also started Lesson 5 on Monitoring Environments and I hope that ends well, so I can get started on my project (and hopefully get it done and approved by the weekend). Sayonara!
Another Spoiler: I did not. It is currently still ongoing.
Day 43 - Monitoring Environments With Prometheus
What happens when you get 2 days off a work week and you resume?
Well, I did not spend much time with the last Lesson I started on "Monitoring tools", because there was a backlog of tasks that needed to be completed and pushed to production.
Still, I went through the introductory sections and I am now working on the Prometheus setup exercise (Prometheus is the tool of choice for the course).
I also took a look at the Project contents to get an idea of what I would be working on. Things look familiar now, so I am almost ready!
Day 44 + 45 - Working with Prometheus
It’s safe to say I went through IT with this last lesson!
While I understand that there are different ways of doing things, the materials in the lesson were quite confusing. The instructor videos used a setup for the Prometheus server (+ node_exporter and alert managers) that was entirely different from the “tutorial links” provided to replicate.
At first, I chose the setup in the videos because it was faster and required less configuration (plus the tutorial links seemed quite old). However, for some reason, I switched to the tutorial setups, and that wasted my time.
Of course, it could be that I did not set things up correctly (though I am certain I did), but the issues were most likely from the outdated approach. I eventually finished the lessons by going back to the first setup from the instructor videos. As always, I encountered errors whilst setting things up, but everything worked after the fixes.
One particular situation took more time, though: setting up alert managers with different receivers (email and Slack). Something kept failing - the server was reporting the error and, with the rules I defined, I was supposed to get an email alert for that particular error. However, the emails were not getting sent.
I am still not completely sure what actually fixed things; I had to debug my rules.yml, alertmanager.yml files AND remember to start up the alert manager 🤦🏾‍♀️, etc. It was most likely a combination of everything I did.
The exercises from the lessons are now completed and I started working on the main project.
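One thing worth ruling out when alerts never arrive is a route that points at a receiver that was never declared. The snippet below is a minimal sketch of that single check, not a record of what actually fixed my setup. It assumes the config has already been parsed into a Python dict with the usual alertmanager.yml layout (a top-level route with optional nested routes, plus a receivers list); the receiver names, the email address and the helper name undefined_receivers are all made up for illustration.

```python
def undefined_receivers(config):
    """Return receiver names used in the routing tree but not declared under `receivers`."""
    declared = {receiver["name"] for receiver in config.get("receivers", [])}

    def walk(route):
        # Each route may name a receiver and may hold nested child routes.
        used = set()
        if "receiver" in route:
            used.add(route["receiver"])
        for child in route.get("routes", []):
            used |= walk(child)
        return used

    return sorted(walk(config.get("route", {})) - declared)


# Hypothetical config, written as the dict that parsing alertmanager.yml would produce.
example_config = {
    "route": {
        "receiver": "email-team",
        "routes": [
            {"receiver": "slack-team", "matchers": ['severity="critical"']},
        ],
    },
    "receivers": [
        {"name": "email-team", "email_configs": [{"to": "me@example.com"}]},
        # "slack-team" is deliberately left out to show what the check reports.
    ],
}

print(undefined_receivers(example_config))  # prints ['slack-team']
```

A check like this only covers the config file. It would not catch the other things on my list, such as a rule that never fires or an alert manager process that was never started.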
P.S. Dear Udacity, maybe update the links in the content more often? Or add disclaimers to correct things said in the videos that are now outdated?
Written by
Airat Yusuff
Software Engineer learning about Cloud/DevOps. Computing (Software Engineering) MSc. Computer Engineering BSc. Honours.