Machine learning for beginners: basic concepts, tasks and scope

Machine learning allows computers to do tasks that, until recently, only humans could do: drive cars, recognize and translate speech, play chess, and more. But what exactly do the words “machine learning” mean, and what made the current boom in machine learning possible?
What is machine learning?
For each problem, a model is created that is theoretically capable of approaching human-level performance on that problem given the correct values of its parameters. Training the model means continually adjusting those parameters so that it produces better and better results.

Of course, this is just a general description. As a rule, you do not come up with a model from scratch, but build on many years of research in the field, since creating a new model that surpasses existing ones on even one type of problem is a real scientific achievement. Methods for defining the objective function, which measures how good the model’s results are (the loss function), also fill entire volumes of research. The same applies to methods for updating model parameters, accelerating training, and much else. Even the initial values of the parameters can make a big difference!

In the learning process, the model learns features that may be important for solving the problem. For example, a model that distinguishes between images of cats and dogs may learn the feature “hair on the ears”, which is more likely to be present in dogs than in cats. Most of these features cannot be described in words: you can’t explain how you tell a cat from a dog, right? Extracting such features is often no less valuable than solving the main problem, and sometimes much more so.

How is machine learning different from artificial intelligence?

The term “artificial intelligence” was coined in the 1950s. It refers to any machine or program that performs tasks “usually requiring human intelligence”. Over time, computers coped with more and more tasks that previously required human intelligence, so what was once considered “artificial intelligence” gradually ceased to be thought of as such.

Machine learning is one way of implementing artificial intelligence, and with its help artificial intelligence has advanced significantly. But although this method is indeed very important, it is far from the first significant step in the history of artificial intelligence: expert systems, logical inference, and much more once seemed just as important. Undoubtedly, new artificial intelligence technologies unrelated to machine learning will appear in the future.
Models, types and parameters of machine learning

The simplest model has only two parameters. If you need to predict a result that depends linearly on an input feature, it is enough to find the parameters a and b in the straight-line equation y = ax + b. Such a model is built using linear regression. The following figure shows a model that predicts a person’s self-assessed “happiness level” as a function of their income level (red line):

A model that predicts a person’s level of happiness based on their income level

Unfortunately, simple linear relationships are extremely rare in real life. Even this graph shows that high income levels break out of the linear relationship: money alone is still not enough for happiness. Even polynomial models, which have one more parameter than the degree of the polynomial, are suitable only for very simple problems.
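As a sketch (with made-up numbers, not the data from the figure), the two parameters of such a line can be found in closed form with ordinary least squares:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope a = covariance(x, y) / variance(x); intercept b follows from the means
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Toy data lying exactly on the line y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(a, b)  # 2.0 1.0
```

On noisy real-world data the recovered a and b would only approximate the underlying relationship, which is exactly what the red line in the figure does.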

The modern revolution in machine learning is associated with neural networks, usually with thousands or even millions of parameters. Such networks can learn very complex features needed to solve complex problems. The following figure shows an example of a neural network architecture with two hidden layers.

Neural network architecture

Although the backpropagation algorithm was invented quite a long time ago, until recently there was no practical way to implement deep neural networks containing a large number of layers. The rapid development of microelectronics has led to high-performance GPUs and TPUs capable of training deep neural networks without supercomputers. It is the widespread adoption of deep learning that is behind the artificial intelligence boom you hear about everywhere.
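To make the mechanics concrete, here is a minimal sketch of backpropagation for a one-hidden-layer network; the task (fitting y = x² on a handful of points), the layer size, and the learning rate are all made-up illustrative choices:

```python
import math
import random

random.seed(0)

H = 8                                            # hidden units (arbitrary)
w1 = [random.uniform(-1, 1) for _ in range(H)]   # input -> hidden weights
b1 = [0.0] * H                                   # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(H)]   # hidden -> output weights
b2 = 0.0                                         # output bias

# Toy training set: points of y = x^2 on [-1, 1]
data = [(x / 10, (x / 10) ** 2) for x in range(-10, 11)]

def forward(x):
    h = [math.tanh(w1[i] * x + b1[i]) for i in range(H)]
    y = sum(w2[i] * h[i] for i in range(H)) + b2
    return h, y

def loss():
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

lr = 0.05
loss_before = loss()
for _ in range(200):                  # epochs of per-sample gradient descent
    for x, t in data:
        h, y = forward(x)
        dy = 2 * (y - t)              # dL/dy for squared error
        for i in range(H):
            # gradient flowing back through tanh: (1 - h^2)
            dh = dy * w2[i] * (1 - h[i] ** 2)
            w2[i] -= lr * dy * h[i]
            w1[i] -= lr * dh * x
            b1[i] -= lr * dh
        b2 -= lr * dy
loss_after = loss()
print(loss_before, "->", loss_after)
```

Real deep networks have many more layers and parameters, but the core idea is the same: compute the error, propagate its gradient backwards, and nudge every parameter downhill.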

Types of machine learning

Machine learning requires a lot of data. Ideally, the training data should describe all possible situations so that the model can prepare for everything. Of course, this is impossible to achieve in practice, but you need to try to make the training set sufficiently diverse.

The training strategy is chosen depending on the task and the data available for training. The main strategies are supervised learning, unsupervised learning, and reinforcement learning.


Supervised learning

This is learning by example, in which a “teacher” provides the correct answer the model should ideally produce for each case. These answers are called labels (the name comes from classification problems, whose models are almost always trained this way; there the answers are class labels), and data accompanied by them is called labeled data.

Imagine that a student is shown several methods for solving problems and is then made to solve a huge number of similar problems, with the correct answer provided for each. If the student solves all of them and gets the correct answer every time, we can assume they have mastered the methods for solving such problems.
Unfortunately, things are not so simple with machine learning models, because we ourselves do not know the “correct” answer for every case: after all, it is to obtain those answers that we need the model in the first place. Almost always, we need the model to learn the underlying dependence of the result on the input features, rather than exactly reproduce the results of the training set, which in real life may contain erroneous results (noise). If a model produces correct results on the entire training set but often fails on new data, it is said to have overfitted to that set.

An overfitted classification model (green line) produces correct results on the entire training set, but a correctly trained model (black line) is likely to make fewer mistakes on new data
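Overfitting can be shown with a toy comparison on made-up data: a 1-nearest-neighbour “memorizer” achieves zero error on a noisy training set, yet loses to a plain linear fit on fresh, clean points:

```python
import random

random.seed(1)

# Made-up data: y = 2x + 1, with label noise on the training set only
train = [(x, 2 * x + 1 + random.uniform(-0.5, 0.5)) for x in range(11)]
test = [(x + 0.3, 2 * (x + 0.3) + 1) for x in range(10)]  # clean labels

def fit_line(pts):
    """Ordinary least squares for y = a*x + b."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    a = sum((x - mx) * (y - my) for x, y in pts) \
        / sum((x - mx) ** 2 for x, _ in pts)
    return a, my - a * mx

def nearest(pts, x):
    """1-nearest-neighbour 'model': just recall the closest training label."""
    return min(pts, key=lambda p: abs(p[0] - x))[1]

def mse(pairs):
    return sum((pred - true) ** 2 for pred, true in pairs) / len(pairs)

a, b = fit_line(train)
train_nn = mse([(nearest(train, x), y) for x, y in train])  # memorizes noise
test_nn = mse([(nearest(train, x), y) for x, y in test])
test_lin = mse([(a * x + b, y) for x, y in test])
print(train_nn, test_nn, test_lin)
```

The memorizer’s training error is exactly zero (it has “learned” every noisy label by heart), but on new points the simple line, which captured the real dependence instead of the noise, is far more accurate.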
Unsupervised learning
Some problems, such as clustering, can be solved without labeled training data. The model itself decides how to group the data into clusters so that similar data instances end up in the same cluster and dissimilar ones end up in different clusters.

Such a learning strategy is used, for example, by Airbnb, which groups similar homes together, and by Google News, which groups news stories by topic.
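As an illustration of clustering, here is a minimal k-means sketch on made-up one-dimensional data; real systems work on many features at once, but the assign-then-update loop is the same:

```python
# Made-up data: two obvious groups, one near 0 and one near 10
points = [0.1, 0.4, 0.2, 0.3, 9.8, 10.1, 9.9, 10.3]
centroids = [points[0], points[4]]  # naive init: one point from each group

for _ in range(10):  # alternate assignment and update steps
    clusters = [[], []]
    for p in points:
        # assign each point to its nearest centroid
        idx = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[idx].append(p)
    # move each centroid to the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # [0.25, 10.025]
```

Note that nobody told the algorithm which point belongs to which group; the grouping emerges from the data alone, which is exactly what “unsupervised” means.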

Semi-supervised learning

As the name suggests, semi-supervised learning is a mixture of supervised and unsupervised learning. This method uses a small amount of labeled data and a large amount of unlabeled data. First, the model is trained on the labeled data; then this partially trained model is used to label the rest of the data (pseudo-labeling). Finally, the whole model is trained on the mixture of labeled and pseudo-labeled data.
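The three steps above can be sketched with a deliberately trivial threshold “classifier” on made-up one-dimensional data; the point is the train / pseudo-label / retrain loop, not the model itself:

```python
# Made-up data: class 0 lives near small x, class 1 near large x
labeled = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
unlabeled = [0.5, 1.5, 2.5, 7.5, 8.5, 9.5]

def train(data):
    """Toy 'classifier': a threshold halfway between the two class means."""
    m0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
    m1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
    return (m0 + m1) / 2

t = train(labeled)                              # step 1: train on labeled data
pseudo = [(x, int(x > t)) for x in unlabeled]   # step 2: pseudo-label the rest
t2 = train(labeled + pseudo)                    # step 3: retrain on the mixture
print(t, t2)  # 5.0 5.0
```

Here the pseudo-labels happen to be all correct, so retraining confirms the original threshold; with a real model the extra pseudo-labeled data would refine the decision boundary.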

This approach has exploded in popularity recently due to the widespread use of generative adversarial networks (GANs), which use labeled data to generate completely new data on which training can continue. If semi-supervised learning ever becomes as effective as supervised learning, then massive computing power will become more important than large amounts of labeled data.

Reinforcement learning

This is learning by trial and error. Each time the model achieves the goal, it receives a “reward”; if it does not, it receives a “punishment”. This strategy is usually used to train models that interact directly with their environment: self-driving cars, game-playing agents, and so on.

A great example of a reinforcement-learning model is Google DeepMind’s Deep Q-Network (DQN), which has beaten humans at many classic video games. After long training, the model learns a strategy of behavior that leads to victory.
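The trial-and-error loop can be sketched with tabular Q-learning, the simple ancestor of DQN; the environment here is a made-up five-state corridor with a reward only at the right end:

```python
import random

random.seed(0)

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left / move right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.2        # learning rate, discount, exploration

for _ in range(500):                     # training episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit, sometimes explore
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0   # "reward" only on reaching the goal
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        # Q-learning update: nudge Q(s, a) toward reward + discounted future
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned strategy in every non-goal state
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)
```

After training, the greedy policy moves right in every state, i.e. straight toward the reward. DQN replaces this lookup table with a deep neural network so the same idea scales to the raw pixels of a video game.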