Week 4 - Troubleshooting AWS CloudFormation Scripts
In this post, I'll be sharing a recap of my notes from Days 22 to 26 of learning in the Udacity Cloud DevOps Engineer Nanodegree classroom.
These are compiled from my daily IG posts, as part of my #3MonthsOfCDE series and only slightly edited for platform suitability.
Day 22 - Troubleshooting Day 1
After several days of frustration, troubleshooting, and a million and one trials, I FINALLY figured out why my attempts at two exercises weren’t running like they did for the teacher.
In the hopes of finding a fix, I spent endless hours scouring the internet, filling the “gaps” in my networking knowledge, running multiple scenarios AND creating and deleting several stacks.
a) Exercise 1
If you recall from my post on day 19/20, I was retrying Challenge 1. Instead of using an existing VPC and one of its subnets from the management console as used in the exercise, I wanted to try provisioning all the resources from scratch. That was the beginning of all my troubles.
I kept troubleshooting but I still could not figure out why I was getting connection timeout errors and why the “web server did not deploy”. I was so sure I was doing everything right because I could recreate the same configuration on the AWS console and that worked fine, so I knew it had to be something in my script that was causing the problem.
Turns out I was partially right (not about the web server though, that was working fine). Something in my script was indeed causing the issue; more accurately, things were NOT in it.
I TOTALLY FORGOT to include a route table and route for my public subnets!!!
- There was no route between the subnets where my instances were and the internet gateway. A subnet is only "public" when its route table sends internet-bound traffic (0.0.0.0/0) to the gateway, so of course I could not reach the server!!
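For anyone hitting the same wall, here is a minimal sketch of the three kinds of resources that were missing, written in Python that builds the JSON form of a CloudFormation template. It is not the template from the exercise: the logical names (VPC, InternetGateway, InternetGatewayAttachment, PublicSubnet1, PublicSubnet2) are hypothetical and would need to match whatever the rest of the template declares.

```python
import json


def public_route_resources(
    vpc="VPC",                               # hypothetical logical ID of the VPC
    igw="InternetGateway",                   # hypothetical logical ID of the gateway
    attachment="InternetGatewayAttachment",  # hypothetical VPCGatewayAttachment ID
    subnets=("PublicSubnet1", "PublicSubnet2"),
):
    """Build the resources that connect public subnets to an internet gateway."""
    resources = {
        # 1. A route table that lives in the VPC.
        "PublicRouteTable": {
            "Type": "AWS::EC2::RouteTable",
            "Properties": {"VpcId": {"Ref": vpc}},
        },
        # 2. A default route that sends internet-bound traffic to the gateway.
        "DefaultPublicRoute": {
            "Type": "AWS::EC2::Route",
            # The gateway must be attached to the VPC before the route is created.
            "DependsOn": attachment,
            "Properties": {
                "RouteTableId": {"Ref": "PublicRouteTable"},
                "DestinationCidrBlock": "0.0.0.0/0",
                "GatewayId": {"Ref": igw},
            },
        },
    }
    # 3. One association per public subnet, otherwise the subnet keeps using
    #    the VPC's main route table, which has no route to the gateway.
    for name in subnets:
        resources[f"{name}RouteTableAssociation"] = {
            "Type": "AWS::EC2::SubnetRouteTableAssociation",
            "Properties": {
                "RouteTableId": {"Ref": "PublicRouteTable"},
                "SubnetId": {"Ref": name},
            },
        }
    return resources


if __name__ == "__main__":
    print(json.dumps({"Resources": public_route_resources()}, indent=2))
```

Running the file prints a JSON fragment that can be merged into the Resources section of a template. The same three resource types apply if the template is written in YAML.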
To the author of that one article that made me realise that colossal error, thank you for writing it. This is another reason why it’s important to document your experiences as a developer/engineer (or anyone!).
Day 23 - Troubleshooting Day 2
For the second exercise, the fresh perspective from resolving the first one made it easier to troubleshoot.
It turned out to be a similar scenario; this time, I was not correctly configuring ports on the security groups, load balancer and its listeners. With some additional configurations and more troubleshooting, I resolved that exercise as well. I was more than happy to conclude things and finally delete the stacks.
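A port mismatch like this is easy to miss by eye, so a small check can help. The sketch below is only an illustration, assuming plain HTTP on port 80 and ingress rules shaped like CloudFormation's SecurityGroupIngress entries; the function names and the sample rules are made up for the example and are not from the exercise.

```python
def port_allowed(ingress_rules, port, protocol="tcp"):
    """Return True if any ingress rule lets traffic in on the given port."""
    for rule in ingress_rules:
        if rule["IpProtocol"] == "-1":  # "-1" means all protocols and all ports
            return True
        if rule["IpProtocol"] == protocol and rule["FromPort"] <= port <= rule["ToPort"]:
            return True
    return False


def find_port_mismatches(listener_port, target_port, lb_ingress, instance_ingress):
    """List the places where a port in use is not opened by a security group."""
    problems = []
    if not port_allowed(lb_ingress, listener_port):
        problems.append(
            f"load balancer security group blocks listener port {listener_port}"
        )
    if not port_allowed(instance_ingress, target_port):
        problems.append(
            f"instance security group blocks target port {target_port}"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical rules: the load balancer accepts HTTP, but the instances
    # only allow SSH, so forwarded traffic never reaches the web server.
    lb_rules = [{"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80, "CidrIp": "0.0.0.0/0"}]
    instance_rules = [{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "CidrIp": "0.0.0.0/0"}]
    for problem in find_port_mismatches(80, 80, lb_rules, instance_rules):
        print(problem)
```

The idea is simply that the listener port must be open on the load balancer's security group, and the port the target group forwards to must be open on the instances' security group.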
It was indeed a lot. I’d put some blame on Udacity though; there were discrepancies in the ‘guide’ provided for the course vs. what the teacher actually used to make his own script run👀. This whole brouhaha set me back several hours on my study schedule because I could not let it go. It kept bugging me and I needed to figure things out so I could be confident in what I had learnt.
I’m glad I went through it though because sometimes, things learnt from lived experiences stick better. I was able to learn a whole lot more about IP addresses and CIDR blocks (+ reserved classes). And, with all the time spent on AWS documentation and related articles, I had a solid run.
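To make the CIDR part concrete, here is a small sketch using Python's standard ipaddress module. The 10.0.0.0/16 range and the /24 subnet size are assumptions picked for the example, not values from the course. The one AWS-specific detail is that every subnet loses five addresses: AWS reserves the first four and the last one.

```python
import ipaddress
from itertools import islice

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def carve_subnets(vpc_cidr, new_prefix, count):
    """Split a VPC range into subnets and report the usable addresses in each."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = islice(vpc.subnets(new_prefix=new_prefix), count)
    return [(str(s), s.num_addresses - AWS_RESERVED_PER_SUBNET) for s in subnets]


if __name__ == "__main__":
    # Hypothetical VPC range, carved into four /24 subnets.
    for cidr, usable in carve_subnets("10.0.0.0/16", 24, 4):
        print(cidr, usable)
    # 10.0.0.0/8 is one of the private ranges, so this VPC range is private.
    print(ipaddress.ip_network("10.0.0.0/16").is_private)
```

A /24 holds 256 addresses, so each subnet here ends up with 251 usable ones.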
One thing I could have done better was reach out to my session lead. Yes, I wanted to make sure I had done enough troubleshooting, but it could have saved me some time. I guess I also did not want a situation where the problem turned out to be a trivial issue and it’d seem like I didn’t troubleshoot at all.
I need to work on such thoughts. It’s okay to not know things; we learn every day.
Day 24 + 25 - Working on the Second Project
Day 24 was another connect session and for the first time, we got a project as a take-home assignment. I'm still not sure what I would put together or when I’d be able to do so yet - I have quite a list of commitments for the week. We’ll see how it goes anyway.
I also started working on the second project - "Deploy a high-availability web app using CloudFormation". By the evening of Day 25, I managed to submit it. I hope it “gives what it’s supposed to give” for them. Fingers crossed.
I requested detailed feedback and also sent in some questions I needed answers to. Hopefully the review will be out by tomorrow, I’ll have passed and I’ll also have my answers.
Day 26 - Project Review
My review came back on Sunday night, but with requested changes. For some reason, I must have thought they were going to run the script themselves, because it did not occur to me to leave the stack running so they could actually test a deployment URL. There was also a comment on copying files from an S3 bucket to an EC2 instance with a running web server, and this was where I spent quite some time.
You see, I am not exactly familiar with scripting even though I’m able to understand some commands. I needed to write the script in the UserData section of the launch configuration (or launch template) that the Auto Scaling group uses to spin up the instances, but I also had to ensure the AWS CLI was installed first (the AMI was Ubuntu).
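The order of the steps is what matters here, so this is a rough sketch of it in Python, not my actual project script. It renders a Bash user-data script and wraps it the way CloudFormation's JSON syntax expects. The bucket name, the file name, Apache as the web server and /var/www/html as the web root are all assumptions for illustration, and installing the CLI with apt assumes an Ubuntu release that still ships the awscli package.

```python
def render_user_data(bucket, key, web_root="/var/www/html"):
    """Render a Bash user-data script that installs tools before using them."""
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",
        "apt-get update -y",
        # Assumption: Apache as the web server, and an Ubuntu release whose
        # repositories still provide the awscli package.
        "apt-get install -y apache2 awscli",
        # The AWS CLI must already be installed by the time this line runs.
        f"aws s3 cp s3://{bucket}/{key} {web_root}/",
        "systemctl restart apache2",
    ]
    return "\n".join(lines) + "\n"


def user_data_property(script):
    """Wrap the script so CloudFormation base64-encodes it for the instance."""
    return {"Fn::Base64": script}


if __name__ == "__main__":
    # Hypothetical bucket and file names.
    print(render_user_data("my-example-bucket", "index.html"))
```

One more thing to watch for: the instances need an IAM instance profile that allows s3:GetObject on the bucket, otherwise the copy step fails even when the CLI is installed.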
It took some time but I eventually sorted that out as well. Finally, I put together a README for the repo I was using and resubmitted.
You can find my project files in this repository.
I passed the second time; I still think they use a few too many emojis in the reviews and the wording is a bit exaggerated. Now, I’m moving on to the next lessons (I still have one more lesson under Course 2, so I’ll complete that first before going to Course 3).
I am also saving some of my practice exercise solutions in another repository. You can check them out here.
P.S. If you come across this article, kindly recommend some scripting courses (both Bash and PowerShell).
Day 22-26: June 3, 2022 - June 7, 2022.
Written by
Airat Yusuff
Software Engineer learning about Cloud/DevOps. Computing (Software Engineering) MSc. Computer Engineering BSc. Honours.