What is Machine Learning? Basic concepts
Machine learning has become a widely used term in the software development world. Because the expression is so popular, we tend to assume that everyone knows what it means, which is not always true.
In this blog post we will try to explain in a simple way what machine learning is, and we will talk about the general concepts that it involves.
A key concept to start with is that machines have intelligence, although it is a non-conscious intelligence. That is, they can make decisions in seconds, based on mathematical calculations that would take humans decades of work on paper, but they do so without being aware of it. Still, the ability of machines to learn is remarkable: they learn by reinforcement, by imitation, and through deep learning. They can even carry what they have learned from one environment to another.
However, machine learning is limited to very specific tasks. For example, a machine that learns to play chess, even if it beats every human opponent in the world, will not know how to drive a car. Any human driver will beat it at that task.
Another important thing to keep in mind is that machines don’t learn like we humans do. For example, imagine that a child sees a dog for the first time in his life and we tell him that it is a dog. No matter what breed, size or color it is, the next time he sees a dog he will know that it is a dog. On the other hand, a machine does not do it in the same way. It will need to analyze a large number of images of dogs of different breeds, colors, sizes, in different positions or views (sitting, lying down, side view, front view, etc.) to be able to tell if a new image that it has not previously seen corresponds to a dog.
So, for a machine to learn, it needs to be trained. It is like the training that humans do to master a new skill. For example, a beginning basketball player will probably miss most of his first shots, but after practicing thousands of times he will improve and score much more frequently.
To summarize, machine learning is a type of algorithm or program that uses large amounts of data to learn on its own. Human intervention is mandatory at the beginning of that learning, but afterwards the program can continue to learn, improve and do its tasks without it.
Types of Machine Learning Algorithms
Now that we have a general idea of what machine learning is, let’s talk about the different ways that machines can learn. The two most widely used methods are supervised learning and unsupervised learning, although there are other types as well, such as reinforcement learning.
The choice of whether to use a supervised or unsupervised machine learning algorithm generally depends on factors related to the structure and volume of your data and the objective of the problem in question. A complex problem will typically use both types of algorithms to build predictive data models that help make decisions about a variety of business challenges.
Supervised learning can be understood as algorithms that “learn” from data provided by a person. In this case, human intervention is required to label, classify, and enter the data into the algorithm.
Supervised learning algorithms base their learning on a set of previously labeled training data. By labeled we mean that for each record in the training data set we know the value of its target attribute. This allows the algorithm to “learn” a function capable of predicting the target attribute for new data.
The case we saw in the introduction to this article, in which a machine is trained with thousands of images of dogs and thanks to this “learns” to identify whether an image corresponds to a dog or not, is an example of supervised learning. Another very common example of supervised learning is classifying incoming mail as spam or not spam.
Below we will briefly mention the most popular supervised learning algorithms.
- k-Nearest Neighbors (K-NN)
This algorithm assigns each new data point to a group based on the groups of its k closest neighbors. That is, it calculates the distance from the new element to each of the existing ones, sorts these distances from smallest to largest, and keeps the k nearest elements. The new element is assigned to the group that appears most frequently among those k neighbors.
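To make the steps concrete, here is a minimal sketch in Python. The function name `knn_classify`, the toy points, and the choice of Euclidean distance are our own assumptions for illustration; other distance measures are possible.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # Distance from the new point to every labeled point, paired with that point's label.
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort from smallest to largest distance and keep the k closest.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The predicted group is the most frequent label among those k neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small groups in a 2-D plane.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 5.0)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_classify(points, labels, (2.0, 2.0), k=3)
print(prediction)
```

Note that nothing is "trained" ahead of time here: all the work happens when a new point arrives, which is why k-NN can become slow on large data sets.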
- Linear Regression
In its simplest version, what this algorithm does is “draw a line” that indicates the trend of a set of continuous data. In statistics, linear regression is an approach to modeling the relationship between a scalar dependent variable “y” and one or more explanatory variables named “X”. The learning consists of finding the best parameters (coefficients) for the data that we have. The best coefficients are those that minimize some measure of error. For linear regression, this is typically the mean squared error.
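As an illustration, the sketch below fits a line to a handful of points using the closed-form least-squares solution for a single explanatory variable. The data and the function names (`fit_line`, `mean_squared_error`) are made up for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the mean squared error of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one explanatory variable.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(xs, ys, slope, intercept):
    """Average of the squared differences between observed and predicted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data that roughly follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
error = mean_squared_error(xs, ys, slope, intercept)
print(slope, intercept, error)
```

Any other line through the same data, for instance the "true" y = 2x + 1, gives a mean squared error at least as large as the fitted one.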
- Logistic Regression
Like linear regression, logistic regression is a machine learning technique that comes from the field of statistics. Despite its name, it is not an algorithm for regression problems, in which a continuous value is sought, but rather a method for classification problems, in which a binary value (0 or 1) is obtained. For example, a classification problem is to identify whether a given transaction is fraudulent or not, associating the label “fraud” with some records and “not fraud” with others.
Logistic regression models the relationship between the dependent variable (the statement to be predicted) and one or more independent variables (the set of characteristics available to the model). To do this, it uses a logistic function that outputs the probability that the statement is true. As mentioned above, what is sought in these problems is a classification, so the probability has to be translated into a binary value using a threshold: for probabilities above the threshold the statement is considered true, and below it, false.
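The prediction step can be sketched as follows. The coefficients below are hand-picked, hypothetical values rather than learned ones, and the two features (transaction amount in thousands, number of failed login attempts) are invented for the fraud example. The sketch only shows how a probability becomes a label, not how the coefficients are trained.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Probability that the statement (here: 'this transaction is fraud') is true."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def classify(features, weights, bias, threshold=0.5):
    """Translate the probability into a binary answer using a threshold."""
    return predict_probability(features, weights, bias) >= threshold

# Hypothetical coefficients; a real model would learn them from labeled data.
weights = [0.8, 1.5]
bias = -4.0

small_purchase = [0.5, 0.0]   # low amount, no failed logins
large_purchase = [3.0, 2.0]   # high amount, two failed logins

print(predict_probability(small_purchase, weights, bias), classify(small_purchase, weights, bias))
print(predict_probability(large_purchase, weights, bias), classify(large_purchase, weights, bias))
```

Raising the threshold makes the model more conservative about flagging fraud, and lowering it does the opposite. The right value depends on how costly each kind of mistake is.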
- Support Vector Machines (SVM)
These can be used for both regression and classification. Support Vector Machines find the optimal boundary between classes, where optimal means maximizing the separation margin between them. The data points that lie on the edge of this margin are the support vectors. When the classes are not linearly separable, we can use the kernel trick, which implicitly maps the data into a higher-dimensional space where they are.
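The idea of adding a dimension can be illustrated with a small sketch. It is not an SVM implementation and does not search for the maximum margin. It only shows, on invented one-dimensional data, how lifting points into a second dimension makes them separable, and how a kernel function gives the same inner product without building the lifted points. The names `feature_map` and `kernel` are our own.

```python
def feature_map(x):
    """Explicitly lift a 1-D point into 2-D by adding x squared as a new dimension."""
    return (x, x * x)

def kernel(x, y):
    """Inner product of feature_map(x) and feature_map(y), computed without lifting."""
    return x * y + (x * y) ** 2

# Hypothetical 1-D data: class 1 sits between the class 0 points,
# so no single threshold on x can separate the two classes.
inner = [-1.0, -0.5, 0.5, 1.0]    # class 1
outer = [-3.0, -2.5, 2.5, 3.0]    # class 0

# In the lifted space, the horizontal line "x squared = 3.0" separates them.
boundary = 3.0
inner_separated = all(feature_map(x)[1] < boundary for x in inner)
outer_separated = all(feature_map(x)[1] > boundary for x in outer)
print(inner_separated, outer_separated)

# The kernel matches the explicit inner product in the lifted space.
lifted_a, lifted_b = feature_map(2.0), feature_map(3.0)
explicit = lifted_a[0] * lifted_b[0] + lifted_a[1] * lifted_b[1]
print(explicit, kernel(2.0, 3.0))
```

In practice the lifted space can be very large, so working only with kernel values is what keeps the method affordable.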
- Bayesian Classifiers
These algorithms are statistical classification techniques based on Bayes’ theorem. They are called “naive” because they assume that the predictor variables are independent of each other. In other words, the presence of a certain feature in a data set is assumed to be unrelated to the presence of any other feature.
Thanks to their simplicity, they provide an easy way to build models that behave very well in practice. They do so by calculating the posterior (‘later’) probability of a certain event A occurring, given some prior (‘earlier’) probabilities.
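Here is a small sketch of that calculation for a spam filter. All the probabilities are invented for illustration, and the function name `posterior_spam` is our own. A real classifier would estimate these numbers from labeled messages.

```python
def posterior_spam(words, prior_spam, likelihood_spam, likelihood_ham):
    """P(spam | words), assuming the words are independent given the class (the naive part)."""
    score_spam = prior_spam          # prior probability of spam
    score_ham = 1.0 - prior_spam     # prior probability of not spam
    for word in words:
        # Independence lets us simply multiply the per-word probabilities.
        score_spam *= likelihood_spam[word]
        score_ham *= likelihood_ham[word]
    # Bayes' theorem: divide by the total probability of seeing these words.
    return score_spam / (score_spam + score_ham)

# Hypothetical numbers: 40% of mail is spam, and each dictionary holds
# P(word appears | class) for two example words.
prior_spam = 0.4
likelihood_spam = {"offer": 0.5, "meeting": 0.1}
likelihood_ham = {"offer": 0.1, "meeting": 0.4}

p_offer = posterior_spam(["offer"], prior_spam, likelihood_spam, likelihood_ham)
p_meeting = posterior_spam(["meeting"], prior_spam, likelihood_spam, likelihood_ham)
print(p_offer, p_meeting)
```

Seeing the word "offer" raises the probability of spam above the 40% prior, while "meeting" lowers it.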
- Decision Trees
A decision tree is a predictive model that divides the predictor space by grouping observations with similar values for the response or dependent variable.
To divide the sample space into sub-regions, a series of rules or decisions is applied so that each sub-region contains the largest possible proportion of observations from a single class.
If a sub-region contains data from different classes, it is subdivided again, until the space is fragmented into sub-regions that each contain data from a single class.
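A single splitting decision can be sketched like this. We use Gini impurity to measure how mixed a sub-region is. That is one common choice and an assumption on our part, as are the function names and the toy data. A full tree would apply `best_split` again inside each resulting sub-region.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the region belongs to one class."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, impurity) for the best single cut on one numeric feature."""
    best_threshold, best_impurity = None, float("inf")
    ordered = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        left = [label for value, label in zip(values, labels) if value <= threshold]
        right = [label for value, label in zip(values, labels) if value > threshold]
        # Weighted average impurity of the two sub-regions.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity

# Hypothetical feature values with a class label for each observation.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(values, labels)
print(threshold, impurity)
```

Here one cut is enough to produce two pure sub-regions, so the tree would stop after a single rule.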
Unsupervised methods are algorithms that base their training process on a set of data without labels or previously defined classes. In other words, no objective or class value is previously known, either categorical or numerical. Unsupervised learning is dedicated to the tasks of grouping, also called clustering or segmentation, where the goal is to find similar groups in the data set.
Incredible as it may sound, unsupervised learning is the ability to solve complex problems using only input data and logical algorithms, without ever having reference data.
As an example¹, let’s say you have an e-commerce site that sells products to customers. Suppose there are thousands of customers and you want to know if you can categorize or group them into different types. For example, there might be power users who use the more advanced features of the site. There might be quick-browser users who only look for a cheap discount and spend very little time on the site itself. There could be careful-researcher users who spend a lot of time comparing different items. In this case you don’t have any labeled examples. However, it is possible to take the data on how people interact with the site and use unsupervised learning to discover these different groups.
The most popular unsupervised learning algorithms are K-Means Clustering and Principal Component Analysis.
- K-Means Clustering
K-means is a method that aims to partition a set of n observations into k groups. Each group is represented by the average of the points that compose it. The representative of each group is called the centroid. The number of groups to discover, k, is a parameter that must be set a priori. The clustering method starts with k randomly located centroids and assigns each observation to the closest centroid. Each centroid is then moved to the average location of all the observations assigned to it, and the points are reassigned according to the new centroid positions. These two steps are repeated until the assignments stop changing.
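The loop described above can be sketched in a few lines of Python. As an assumption for this example, the initial centroids are placed on k randomly chosen observations (a common way of picking "random" starting locations), and the `seed` parameter only makes the run repeatable. The function name and data are hypothetical.

```python
import math
import random

def k_means(points, k, max_iterations=100, seed=0):
    """Group points into k clusters; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Start with k centroids placed on randomly chosen observations.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each observation goes to its closest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the average of its observations.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a group ends up empty
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignments

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 9.0), (8.5, 7.5)]

centroids, assignments = k_means(points, k=2)
print(sorted(centroids), assignments)
```

On data this clean the result does not depend on the starting centroids. On harder data it can, which is why k-means is often run several times with different initializations.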
- Principal Component Analysis (PCA)
Principal Component Analysis is a feature extraction technique: we combine the input variables in a specific way to create new variables, so that we can drop the “least important” of them while still keeping the most valuable parts of all the original variables. As an added benefit, the new variables produced by PCA are uncorrelated with each other.
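To keep things dependency-free, the sketch below handles only the two-variable case, where the principal directions can be found with a single angle. The function names and the sample data are our own assumptions. For more variables, a linear algebra library would normally be used.

```python
import math

def pca_2d(points):
    """Return (pc1, pc2, scores) for 2-D data; scores are the points in the new variables."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    # Angle of the direction of maximum variance (the first principal component).
    theta = 0.5 * math.atan2(2 * b, a - c)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))  # perpendicular to pc1
    # Each new variable is a combination of both original variables.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1]) for x, y in centered]
    return pc1, pc2, scores

def covariance(us, vs):
    """Sample covariance of two equally long lists of numbers."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    return sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / (n - 1)

# Hypothetical data in which the two variables are strongly correlated.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

pc1, pc2, scores = pca_2d(points)
first = [s[0] for s in scores]
second = [s[1] for s in scores]
print(covariance(first, first), covariance(second, second), covariance(first, second))
```

The first new variable carries most of the variance, so keeping only `first` reduces the data from two dimensions to one while losing relatively little information.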
So this was a brief introduction to the basic concepts of machine learning, a field as interesting as it is extensive and very useful in various applications.