Have to Start Somewhere

Sometime between January and March of 2025, I saw a video of a guy who manipulated music output with his hand movements. I thought that was intriguing, and I knew I could replicate something similar (and better, duh). But I had no idea where to even start.
Months later, I wanted to do a Computer Vision (CV) project, but I had never done a proper tutorial. In late May 2025, on the way to Indiana for the Indy 500, I was determined to get some knowledge under my belt, so I watched and followed along with a 2-hour tutorial.
I was getting bored with the tutorial and was ready to just start the hands-on process. But over the next couple of months, I got caught up with work, socializing, and laziness, so I forgot about the tutorial and the framework I had already built out.
This was the tutorial. I only watched about 2 hours of it, enough to understand the hand movements:
Key facts to know going forward would be:
The MediaPipe library places 21 landmarks on each hand it detects.
Palm: 0 (MediaPipe itself labels this point the wrist)
Thumb: 1-4
Index: 5-8
Middle: 9-12
Ring: 13-16
Pinky: 17-20
It would look something like this:
4   8   12  16  20
3   7   11  15  19
2   6   10  14  18
1   5   9   13  17
0
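To make the numbering easier to work with, here is a small Python sketch of the same layout. The names `HAND_LANDMARKS`, `fingertip`, and `finger_of` are made up for illustration; they are not part of MediaPipe or of the project code.

```python
# Landmark index ranges, copied from the list above.
# HAND_LANDMARKS, fingertip, and finger_of are hypothetical helper names.
HAND_LANDMARKS = {
    "palm": [0],
    "thumb": [1, 2, 3, 4],
    "index": [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring": [13, 14, 15, 16],
    "pinky": [17, 18, 19, 20],
}


def fingertip(finger):
    """Return the highest-numbered landmark of a finger (its tip)."""
    return HAND_LANDMARKS[finger][-1]


def finger_of(index):
    """Return the name of the group a landmark index belongs to."""
    for name, indices in HAND_LANDMARKS.items():
        if index in indices:
            return name
    raise ValueError(f"landmark index out of range: {index}")


print(fingertip("index"))  # tip of the index finger
print(finger_of(13))       # which finger landmark 13 sits on
```

A lookup like this means the rest of the code can say `fingertip("thumb")` instead of remembering that the thumb tip is 4.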
On July 23rd, I decided to finally cut the crap and start building this out. So I reread the code I had already written and started by drawing a line connecting the left hand and the right hand at the palm (0).
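A rough sketch of how the two endpoints of that line could be worked out, not the actual project code. It assumes the palm positions arrive as normalized (x, y) values between 0 and 1, which is how MediaPipe reports hand landmarks, and that the frame size is known. The helper names `to_pixel` and `palm_line` are hypothetical.

```python
def to_pixel(norm_x, norm_y, frame_width, frame_height):
    """Convert normalized (0..1) coordinates to integer pixel coordinates."""
    return int(norm_x * frame_width), int(norm_y * frame_height)


def palm_line(left_palm, right_palm, frame_width, frame_height):
    """Return the pixel endpoints of the line joining the two palm points (0)."""
    start = to_pixel(left_palm[0], left_palm[1], frame_width, frame_height)
    end = to_pixel(right_palm[0], right_palm[1], frame_width, frame_height)
    return start, end


# Made-up palm positions on a 640x480 frame, for illustration only.
start, end = palm_line((0.25, 0.5), (0.75, 0.5), 640, 480)
print(start, end)

# With OpenCV, the segment could then be drawn on a frame with something like:
# cv2.line(frame, start, end, (0, 255, 0), 2)
```

Converting to pixels first keeps the drawing step simple, since OpenCV drawing functions expect integer pixel coordinates.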
I might start referring to each point by just the number, so keep up.
With the line drawn, I wanted to find its length, which I did with the basic distance formula, dist = sqrt((x₂ − x₁)² + (y₂ − y₁)²), where (x₁, y₁) and (x₂, y₂) are the two palm points. (Thank you, Mrs. Kaplan and Brown.)
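In Python, the distance between the two palm points could look something like this. It is a minimal sketch, and `palm_distance` is a made-up name for illustration. The x and y in the formula are the differences between the two points' coordinates.

```python
import math


def palm_distance(p1, p2):
    """Euclidean distance between two (x, y) points."""
    dx = p2[0] - p1[0]  # horizontal difference
    dy = p2[1] - p1[1]  # vertical difference
    return math.hypot(dx, dy)  # same as sqrt(dx**2 + dy**2)


# Made-up pixel coordinates for the left and right palm.
print(palm_distance((160, 240), (480, 240)))
```

`math.hypot` does the square root of the sum of squares in one call, so there is no need to write the formula out by hand.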
I’ll make another blog page sometime to give the other updates.
Written by Kevin Mohammed
