Learning Deep Learning

It's been about two weeks since I decided to actively work on my long-term goal of becoming a deep learning engineer. A handful of catalysts spurred my motivation:

My transition into my remote MS degree in data science in the Fall
My transition into a data engineering position in the Summer
My "need" to become an expert on all things machine learning

And if I learned anything from my career as a student, it's that I am by no means gifted, so the earlier I can REPL an ML model, the better.

Who am I?

My name is Joel Montano and I am finally (I say finally because it took me 8 years to get a 4-year degree) graduating with a BS in Computer Science in May. And as one life goal closes, another opens: becoming a world-class machine learning engineer.

Even with a degree, I still consider myself a novice engineer, so I thought it'd be interesting to chronicle my journey to becoming an expert (which will also give me a chance to work on another lifetime goal of mine, to become a writer).

https://media.giphy.com/media/XIqCQx02E1U9W/giphy.gif

My Learning Strategy

Let's propose a scenario where I try to learn convolutional neural networks (CNNs). There are two ways I could approach this. The first is top-down: I would look around for some images, an implementation of a popular CNN model such as ResNet-50, and some already-implemented data preprocessing methods. After a couple of hours, I would have a deep learning model that was considered world-class about 10 years ago.

https://media.giphy.com/media/fhAwk4DnqNgw8/giphy.gif

Meanwhile, if I were to use a bottom-up approach, I could start by looking up some popular CNN papers, specifically the ResNet paper, and read post after post on topics like skip connections, 1x1 convolutional filters, max pooling versus average pooling, feature maps, strides, and so on until the dawn of math. Finally, after a handful of weeks of learning, I would aggregate some data and build a model (only after implementing it from scratch).

https://media.giphy.com/media/c7PcKQlOqZ8Ws/giphy.gif
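To make a few of those bottom-up terms a bit more concrete, here is a minimal pure-Python sketch of strides, feature maps, max pooling, a 1x1 filter, and a skip connection. It is only an illustration under simplifying assumptions (one channel, square inputs, no padding inside `conv2d`), and every function name in it is my own invention rather than part of PyTorch, FastAI, or the ResNet paper.

```python
from typing import List

Matrix = List[List[float]]


def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial size of a feature map after a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


def conv2d(image: Matrix, kernel: Matrix, stride: int = 1) -> Matrix:
    """Slide a square kernel over a square image (no padding) to build a feature map.

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped.
    """
    k = len(kernel)
    out = conv_output_size(len(image), k, stride)
    return [
        [
            sum(
                image[r * stride + i][c * stride + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out)
        ]
        for r in range(out)
    ]


def max_pool2d(fmap: Matrix, size: int = 2) -> Matrix:
    """Keep the largest value in each non-overlapping size x size window."""
    out = conv_output_size(len(fmap), size, stride=size)
    return [
        [
            max(fmap[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(out)
        ]
        for r in range(out)
    ]


def skip_connection(x: Matrix, fx: Matrix) -> Matrix:
    """Residual-style shortcut: add the block's input back onto its output."""
    return [[a + b for a, b in zip(row_x, row_fx)] for row_x, row_fx in zip(x, fx)]


# A made-up 4x4 single-channel "image" used purely for illustration.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
feature_map = conv2d(image, [[1, 0], [0, 1]])  # 2x2 kernel, stride 1: 4x4 -> 3x3
pooled = max_pool2d(image)                     # 2x2 windows: 4x4 -> 2x2
# With one channel a 1x1 filter only rescales each pixel; in a real CNN it mixes channels.
scaled = conv2d(image, [[2]])
residual = skip_connection(image, scaled)      # x + F(x)
```

The same size formula explains why a 224-pixel-wide input shrinks to 112 after a 7x7 convolution with stride 2 and padding 3: `conv_output_size(224, 7, stride=2, padding=3)` returns 112.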

Top-Down Learning

Pros	Cons
I can start building practical things quickly	Limited by APIs
I can showcase to the world a lot quicker	Seeing technical terms can confuse me

Since my focus leans more towards development and I already have a solid computational foundation, due to my degree, I'll probably make this about 80% of my approach.

Bottom-Up Learning

Pros	Cons
Easier to get into technical articles	It might be weeks before I build something useful
Can debug models quicker	I might get discouraged
Enables building things from scratch

The other 20% of my time will be committed to understanding the theory. My split isn't meant to put down the bottom-up approach, but rather to tailor my learning to my strengths and weaknesses. Realistically, at least for me, abstract concepts tend to marinate more deeply when informed by hours and hours of trial and error.

WEEKLY CHALLENGE ALERT

So, I am currently trying to understand the basics of CNNs and there are a handful of goals I'd like to accomplish this upcoming week:

Go over FastAI's lessons 3 and 4
Learn some more nbdev and try to build a portfolio highlighting my deployed dog/cat classifier on my GitHub page
Review FastAI's documentation, specifically the data portion
Look to see if there are some Kaggle competitions I can join
Write a blog post on the material covered in lessons 1 and 2
Understand the history of CNNs and some basic concepts surrounding them
Review the "ImageNet Classification with Deep Convolutional Neural Networks" paper by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton
Try to implement a basic CNN using PyTorch

I don't expect to get through all of these goals in a week, but I'll list them here regardless to help stretch myself as the week goes by.

Additional Resources

To give some credit where credit is due, I've linked some books, articles, and videos that helped inspire this journey:

Rachel Thomas's wonderful article on blogging and its benefits
Jeremy Howard's lesson 0 for one of the editions of part 1 of FastAI's course
Metalearning from Radek, a deep learning researcher at Nvidia and alumnus of FastAI

1 to N Journey to Deep Learning Engineer