I know this stack will work well for my purposes because I’ve learned what developers use in this space.
Step 1. Explore one project that is similar to what you want to create
Now that you know exactly what you want to do and have a rough idea of how you’ll do it, it’s time to refine the details.
The best way to learn how something is done is to watch a real master at work. You can think of it as asynchronous apprenticeship.
Being able to see what can be achieved and what the end result might look like will give you more context to study than any theory.
For this purpose, it is best to go to GitHub or Kaggle and look for publicly available projects. Browse through a few of them until you find the one you like.
It can be a full-fledged library, a simple analysis, or a ready-made AI. Whatever the case, find a few different projects and then pick the one that interests you the most.
Once you’ve found the right project, spend some time familiarizing yourself with its documentation, codebase structure, and code. You will most likely get lost, especially if you are new to programming. But you will learn a lot along the way, and learning new things is both useful and enjoyable!
Make notes about recurring patterns you see, interesting bits of code you understand, and topics you don’t understand at all. Bookmark this project and come back to it as you progress along the learning path.
A good place to start looking is this list on GitHub. But you can just use the search on Kaggle or GitHub. Search for keywords related to your ML interests.
A good simple project by Thomas Kipf will do for my curriculum. It’s simple enough that I can walk through it and understand what’s going on in each section, learning the basics of the structure along the way.
Step 2. Learn a programming language
Now that you have a clear idea of where you need to go next and what to learn to get there, it’s time to learn how to understand the code.
The code will most likely be in Python. However, it could be Julia, C++, or Java; it all depends on what you want to learn and which project you bookmarked.
Whatever language it is, you should spend some time learning the basics to understand how to write scripts.
A very good course to learn Python, enough to get you started with the language, is Scientific Computing with Python by freeCodeCamp. You can also try Kaggle’s very short Python course.
You don’t need to understand 100% of how the language works. As you go through the learning cycle, simply set aside a little time on a regular basis to improve your knowledge of the chosen language. That way, learning the language becomes iterative too.
For the purposes of my curriculum, the freeCodeCamp course is fine.
Step 3. Explore libraries top down
In machine learning curricula, I often notice that after covering the basics, students are asked to move on to implementing algorithms from scratch.
I think this is a great project to do on your own. But I don’t think that this should be the main focus in the early stages of learning ML.
The fact is that almost no one implements algorithms from scratch, with the exception of people who create packages that developers use. Even so, they often rely on other packages created by linear algebraists to do most of the low-level work.
My point is that understanding what’s going on under the hood is extremely helpful, but I don’t think that should be the goal of a newbie.
At this point, I suggest learning the highest-level library for your chosen programming language that will allow you to achieve the results you want. To create something working, it will be enough for you to study how this library works.
Of course, at this stage you will not have an understanding of why something works or does not work, but this is not very important.
It is much more important to be able to work with the tools that ML specialists use in their daily work. Once you understand what a high-level library does, move on to a slightly lower level library.
When doing this, be careful not to get too deep into the library (if you got to LAPACK by reading about Fortran, you’ve gone too far!).
For my project, the main library I need to learn is PyTorch or a higher-level wrapper around it, so the fast.ai practical course is a good fit.
Step 4. Create one project you are passionate about in a maximum of one month
Let’s move on to the stage where most of the learning takes place. At this point, you should already have the minimum knowledge to create at least a slightly useful project.
For reference, if you feel absolutely confident about taking on a project, you spent too long on steps 0 through 3.
Think about what interests you, what you really would like to create. Don’t get too carried away: you have a maximum of a month for this project.
Mark the deadline on your calendar. When working on a project, the time limit is motivating and stressful enough to keep you going.
The idea here is to run into difficulties, discover your main knowledge gaps through them, and experience what a real machine learning developer experiences.
By working on your own, without leaning on a course or a book, you will have to work through the difficult parts of a project that tutorials usually let you skip:
- planning, scoping, tracking the progress of your ML project
- reading online documentation of libraries
- reading threads on Stack Overflow and GitHub, posts on some developer's randomly found blog, and posts on a cryptic help forum, all to deal with one bug
- creating a project in a non-optimal way with subsequent improvement
- fixing problems with overfitting, underfitting, and generalization.
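The last item in the list above can be made concrete with a minimal sketch of how overfitting shows up in practice: hold out part of the data and compare the error on the training set with the error on the held-out set. Everything here is an assumption made for illustration: the synthetic data, the 20% label noise, the 150/50 split, and the deliberately overfitting "memorizing" model (a 1-nearest-neighbor lookup) are not taken from any particular project.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_point():
    """Synthetic example: label is 1 when x > 0.5, with 20% of labels flipped."""
    x = rng.random()
    label = int(x > 0.5)
    if rng.random() < 0.2:
        label = 1 - label  # inject label noise
    return x, label


data = [make_point() for _ in range(200)]
train, valid = data[:150], data[150:]  # hold out the last 50 points


def predict(x):
    """Memorizing model: return the label of the closest training point."""
    _, label = min(train, key=lambda point: abs(point[0] - x))
    return label


def error_rate(points):
    """Fraction of points whose predicted label differs from the true label."""
    wrong = sum(predict(x) != label for x, label in points)
    return wrong / len(points)


train_error = error_rate(train)
valid_error = error_rate(valid)
print(f"train error: {train_error:.2f}, validation error: {valid_error:.2f}")
```

The model reproduces every training label, noise included, so its training error says nothing about how it handles new data. A validation error that is clearly higher than the training error is the usual symptom of overfitting; when both errors are high, the model is more likely underfitting.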
For a fun project, try these three little exercises:
- think carefully about what interests you now
- look through a list of project ideas
- pay attention to open datasets.
All this together will give you an understanding of what can be created at all. And by combining this with your interests, you can create something truly yours.
This GitHub list can be a great place to find inspiration when creating a mini-project. You can use Google Dataset Search to find the right data for your project.
Don’t underestimate the importance of data!
Even if you have very good ideas, the lack of data will seriously hinder your progress.
For my purposes, I found this neat dataset of a mining company’s global supply chain. My project will involve using graph neural networks to determine excavator sales prices, which is the central theme of this dataset.
Step 5. Identify one gap in your knowledge and fix it
At this point, you’ve already spent some time developing your project and are very impressed with how far you’ve come. You probably haven’t gotten close to what you envisioned yet, but you’ve already encountered countless challenges along the way.
Now you realize how little you really know and that there are gaps in your knowledge that need to be filled.
This is great! Make a list of all gaps you find and rank them in order of perceived priority. This may be difficult for you, as everything will look equally important at this stage. But learning to make informed decisions about what to study next is almost as valuable as learning itself.
Now for the weirdest part: remove everything from your list except the most important topic.
When I say delete, that’s what I mean. Delete everything except item #1. By the next iteration of the loop, your current estimate of what to learn will be mostly wrong, because you will have discovered other, more important gaps that you are not aware of now.
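The rank-then-delete step can be sketched in a few lines. The gap names and priority scores below are hypothetical placeholders; your own list will look different.

```python
# Hypothetical knowledge gaps with a self-assessed priority (higher = more urgent).
gaps = {
    "how backpropagation works": 3,
    "reading tensor shapes in error messages": 5,
    "probability basics": 2,
}

# Rank the gaps by perceived priority, most urgent first.
ranked = sorted(gaps, key=gaps.get, reverse=True)

# Keep only the top item; everything else is deliberately thrown away.
focus = ranked[0]
print(focus)
```

Discarding the rest of the list on purpose is the point: it will be rebuilt from scratch, with better information, on the next pass through the loop.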
Now that you have only one topic left to study, give yourself between one day and one week to do it. It may seem like a very short time, but you do not need to study the topic thoroughly, only well enough to use the knowledge in the next round of learning.
In practice, it may happen that you dive deep enough into this topic to notice how it is related to other important topics (such as probability, statistics, or even dull linear algebra).
Pay close attention to these connections, revisit related topics if you wish, and strengthen your machine learning mental model to make it more accurate.
Step 6. Repeat steps 0 to 5
Your first run through this pipeline is likely to be so-so at best. But in a very short period of time, you will learn much more than you could learn by going “from the bottom up”.
On each new pass of the cycle, the payoff will grow rapidly. Each new round will be easier, and the overall picture will become clearer.
This approach is based on the lean methodology that I learned to apply with great success in my startup. Doing multiple iterations is the fastest way to reach your goal.
In the course of a year, you may be able to go all the way 12 times, which means 12 machine learning projects and a very practical understanding of the field.
Thanks to this, you will become an attractive candidate in the labor market, and in addition, you will understand what you need for further development.
Results
So, if you want to learn machine learning in practice, you should:
Understand what the field of machine learning is and mentally map it.
Find a cool project similar to what you yourself would like to do and study it.
Learn the required programming language.
Learn enough libraries to do something useful.
Create a project in a month at most.
Identify the single biggest gap in your knowledge and fill it.
Repeat!
I hope this article is helpful to you. If you want to get acquainted with individual topics of machine learning, pay attention to my YouTube channel. Good luck!