Types of machine learning

Machine learning as the next round in information technology

Machine learning began with a simple yet formidable premise: "teaching" machines how to act and react under specific circumstances with minimal to no human intervention. In short, machine learning is the study of algorithms and statistical models that help a machine perform tasks, identify patterns, predict outcomes, or make decisions without explicit instructions or programming.

In a previous article, we talked about the fundamentals of machine learning. Here, we focus on the different types of machine learning and what each one is about. Before we get into it, let's take baby steps and look at where machine learning fits in the high-functioning wheel that is Information Technology (IT).

Information Technology is one of the fastest-changing industries in the world. With new technologies emerging constantly, it is challenging to keep up with the tech landscape, and this is certainly true of Machine Learning. As it continues to evolve, Machine Learning is still defining its place in the next round of IT. It gives businesses the ability to use algorithms to build models that uncover patterns and opportunities with minimal human intervention. Organizations benefit from combining Machine Learning with IT because they can analyze bigger, more complex data sets and deliver results with a higher degree of accuracy.

In the upcoming section, we’ll cover how machine learning algorithms help solve IT problems.

The role of machine learning algorithms in current IT problems

More data means more questions, but that's a good thing because we also get more answers. With the surge of Big Data, Machine Learning has expanded its capabilities to solve problems in many different areas of IT. Through the use of algorithms, Machine Learning can discover patterns in data to help users generate insights and ultimately make better decisions and predictions.

For critical decisions in medical diagnosis, trading, energy forecasting, and more, machine learning examines millions of data points to help solve complex problems that involve numerous variables.

Types of machine learning: reinforcement, supervised, unsupervised, and semi-supervised machine learning

The IT industry widely recognizes three core types of machine learning, but for this article, we will include a fourth one that is gaining momentum. These four types are: reinforcement machine learning, supervised machine learning, unsupervised machine learning, and, last but not least, semi-supervised machine learning.

Let’s explore each one.

Reinforcement machine learning

Reinforcement machine learning involves algorithms that discover, through trial and error, which actions yield the best results. This machine learning type is built around an agent, an environment, and actions. The agent is the learner or decision maker, which must take actions in the environment to maximize its rewards.

Reinforcement training differs from standard supervised training in that correct input/output pairs are never presented directly, and suboptimal actions are not explicitly corrected.

The environment in reinforcement machine learning is usually formulated as a Markov decision process (MDP), since many reinforcement learning algorithms for this setting use dynamic programming techniques.

The basic reinforcement learning model consists of:

  • a set of environment states
  • a set of actions
  • rules for transitioning between states
  • a scalar immediate reward for each transition
  • rules that describe what the agent observes

Reinforcement learning algorithms rely on a criterion of optimality, typically the expected cumulative reward with a discount factor, and fall into these groups:

  • brute-force search over policies
  • value-function methods (Monte Carlo, temporal difference)
  • direct policy search
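
To make the model components and the temporal-difference approach above more concrete, here is a minimal sketch of tabular Q-learning. It is only an illustration under stated assumptions: the five-state corridor environment, the reward of 1.0 at the rightmost state, and the values of alpha, gamma (the discount factor), and epsilon are all invented for this example.

```python
import random

# Hypothetical toy environment: a corridor of 5 states (0..4).
# The agent starts in state 0; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Transition rule: move within the corridor; reward only at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning, a temporal-difference method (illustrative settings)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore (trial and error).
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            # TD update toward the reward plus the discounted best future value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy: the best-valued action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

Note that the agent is never told that "move right" is correct. It only receives a reward at the goal, and the value of earlier moves is learned by propagating that reward backward through the discount factor.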

Extensions of reinforcement learning include deep reinforcement learning, inverse reinforcement learning, and apprenticeship learning.

Reinforcement machine learning is mainly used in game theory, robotics, control theory, navigation, operations research, simulation-based optimization, swarm intelligence, and genetic algorithms.

Supervised machine learning

Supervised machine learning creates a model that makes predictions based on data. A supervised machine learning algorithm uses a set of input data and the corresponding known outputs to train a model that produces reliable predictions for new data. Supervised machine learning works best when you have data with known outputs for whatever you are trying to predict.

Supervised machine learning uses classification and regression techniques to generate predictive models - we will explore these techniques in an upcoming section.

Discussions of the most popular algorithms for supervised machine learning usually consider the following:

  • support vector machines
  • linear and logistic regression
  • naive Bayes
  • linear discriminant analysis
  • decision trees and ensembles of them, such as random forests
  • k-nearest neighbors algorithm
  • similarity learning
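
Of the algorithms listed, k-nearest neighbors is compact enough to sketch in a few lines. The example below is a simplified illustration, not a production implementation: the training points, the "small" and "large" labels, and the choice of k=3 are assumptions made up for the demonstration.

```python
import math
from collections import Counter

# Hypothetical labeled training set: (feature vector, label) pairs.
TRAINING_DATA = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"),
    ((7.0, 6.0), "large"),
    ((6.5, 7.5), "large"),
]


def knn_predict(training_data, point, k=3):
    """Predict a label by majority vote among the k closest training examples."""
    by_distance = sorted(training_data, key=lambda pair: math.dist(pair[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_predict(TRAINING_DATA, (1.2, 1.4)))  # small
print(knn_predict(TRAINING_DATA, (6.8, 6.9)))  # large
```

The labeled pairs play the role of the known input and output data: the model never sees the two query points during training, yet it can label them from their nearest neighbors.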

A separate group of supervised machine learning algorithms consists of the many methods based on neural networks, such as the multilayer perceptron.

If we need a scoring function for supervised machine learning, then we can select between empirical risk minimization and structural risk minimization. 
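
The difference between the two options can be written down compactly. The notation below is our own assumption for illustration: N training pairs (x_i, y_i), a loss function L, a family of candidate models F, a complexity penalty C(f), and a weight lambda that controls how strongly complexity is penalized.

```latex
% Empirical risk minimization: minimize the average loss on the training set
\hat{f}_{\mathrm{ERM}} = \arg\min_{f \in \mathcal{F}} \; \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr)

% Structural risk minimization: add a complexity penalty C(f) weighted by \lambda \ge 0
\hat{f}_{\mathrm{SRM}} = \arg\min_{f \in \mathcal{F}} \; \left[ \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr) + \lambda \, C(f) \right]
```

With lambda set to zero the second expression reduces to the first, so structural risk minimization can be read as empirical risk minimization plus a guard against overfitting.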

For training methods, this type of machine learning works with discriminative training and generative training.

The choice of algorithm, scoring function, and discriminative or generative training depends on the size and type of the problem.

The distinction between supervised machine learning and unsupervised machine learning is easily demonstrated by the presence or absence of labels - let’s further clarify how in this next section.

Unsupervised machine learning

Very much the opposite of supervised machine learning, unsupervised machine learning is the practice of using data without known labels to discover patterns or intrinsic structures. It draws conclusions from input data that carries no labels. Unsupervised machine learning is data-driven: it learns to cluster, group, and organize data to gain insights.

Clustering and anomaly detection are two important tasks for unsupervised machine learning.

Unsupervised machine learning also works with the following:

  • Principal Component Analysis (PCA)
  • anomaly detection
  • autoencoders
  • Deep Belief Nets
  • Hebbian Learning
  • Generative Adversarial Networks (GANs)
  • self-organizing maps
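
Anomaly detection is a good example of learning without labels. The sketch below flags values that sit far from the rest of the data using a z-score. It is one simple approach among many, and the sensor readings and the threshold of 2.0 are hypothetical values chosen for the illustration.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Flag values whose z-score magnitude exceeds the threshold (no labels used)."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical sensor readings; one value is far from the rest.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 25.0]
print(find_anomalies(readings))  # [25.0]
```

Nobody told the function which reading was "bad"; the structure of the data itself singles out the outlier.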

Applications of unsupervised machine learning include the analysis of buyer behavior and habits, as well as recommendation systems.

Unsupervised machine learning uses clustering techniques to discover hidden patterns - we will explore these techniques in an upcoming section.

Semi-supervised machine learning

Semi-supervised machine learning is used much like supervised learning; the difference is that it works with both labeled and unlabeled data, typically a small labeled set and a much larger unlabeled one. It is used for tasks such as classification, regression, and prediction.

The main challenge for semi-supervised methods is making good use of the unlabeled data, but even a small portion of labeled data can improve results.

The main semi-supervised methods are: 

  • Generative models
  • Low-density separation
  • Graph-based methods
  • Heuristic approaches
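
Self-training is one commonly cited heuristic approach: a model trained on the few labeled examples assigns labels to the unlabeled ones and is then refit on the enlarged set. The sketch below is a deliberately simple version built on a nearest-class-mean classifier for one-dimensional data. The data, the "low" and "high" labels, and the confidence rule are assumptions made for the example.

```python
def class_means(labeled):
    """Mean feature value per class, computed from (value, label) pairs."""
    grouped = {}
    for value, label in labeled:
        grouped.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


def nearest_class(means, value):
    """Label of the class whose mean is closest to the value."""
    return min(means, key=lambda label: abs(value - means[label]))


def self_train(labeled, unlabeled):
    """Self-training: pseudo-label the most confident unlabeled point, refit, repeat."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        means = class_means(labeled)
        # "Most confident" here means closest to any class mean (an assumption).
        value = min(remaining, key=lambda v: min(abs(v - m) for m in means.values()))
        labeled.append((value, nearest_class(means, value)))
        remaining.remove(value)
    return class_means(labeled)


# Hypothetical data: two labeled examples and six unlabeled measurements.
labeled_examples = [(1.0, "low"), (9.0, "high")]
unlabeled_values = [2.0, 3.0, 8.0, 7.0, 1.5, 8.5]
model = self_train(labeled_examples, unlabeled_values)
print(model)  # {'low': 1.875, 'high': 8.125}
print(nearest_class(model, 4.0))  # low
```

With only two labeled points, the class means would sit at 1.0 and 9.0. The unlabeled data pulls them toward the actual centers of the two groups.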

One particularly interesting direction in semi-supervised machine learning builds on statistical learning theory and Vapnik-Chervonenkis theory.

Semi-supervised machine learning is an effective approach because labeled data gives more information about a studied object and allows tasks to be solved more precisely and faster.

Machine learning techniques

Let's review some of the key machine learning techniques used with different types of machine learning approaches.

Machine learning regression

Machine learning regression techniques predict continuous responses, such as fluctuations in temperature. They are used when the output is a real number or falls within a range of values. Some of the most common applications of regression techniques include energy or power forecasting, trading, and temperature forecasting. Regression algorithms include linear models, nonlinear models, boosted and bagged decision trees, neural networks, and adaptive learning, to name a few.

Linear regression is a method for modeling the relationship between a real-valued output y and an input variable X, which in the general case is a vector. If X is a scalar, the model is called simple linear regression.

The parameters of the linear regression model are usually estimated by the method of least squares.
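
For simple regression, the least-squares solution has a well-known closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. Here is a short sketch; the function name and the sample data are our own, picked so that the expected answer is easy to check by hand.

```python
def fit_simple_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x for scalar x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated from y = 1 + 2x, so the fit should recover those values.
hours = [0.0, 1.0, 2.0, 3.0, 4.0]
temperature = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_regression(hours, temperature)
print(intercept, slope)  # 1.0 2.0
```

Real data will not fall exactly on a line, in which case the same formulas return the line that minimizes the sum of squared errors.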

A particularly interesting type of regression is Bayesian linear regression, an approach in which the linear model is analyzed within the framework of Bayesian inference.

Machine learning classification

Machine learning classification techniques categorize, tag, and separate data into groups and are customarily used to predict discrete responses, such as whether an email is genuine or spam. Some of the most common applications of classification techniques include medical imaging, speech recognition, and credit scoring. Classification algorithms include support vector machines (SVM), decision tree ensembles, naive Bayes, discriminant analysis, logistic regression, and neural networks.

In machine learning, classification is a part of supervised learning, where both the inputs and the corresponding outputs are known for the training data. A classification problem can be formulated as binary (two classes) or multiclass (more than two classes).
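
As an illustration of binary classification, the following sketch fits a one-feature logistic regression with plain gradient descent. Treat it as a teaching example rather than a recommended implementation: the feature values, the 0/1 labels (think "genuine" versus "spam"), the learning rate, and the number of epochs are all assumptions.

```python
import math


def sigmoid(z):
    """Map a real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example under the current model.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


def predict(w, b, x):
    """Binary decision: class 1 if the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical one-feature data set: label 0 = genuine, label 1 = spam.
scores = [0.5, 1.0, 1.5, 3.5, 4.0, 4.5]
labels = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(scores, labels)
print(predict(w, b, 0.8), predict(w, b, 4.2))  # 0 1
```

A multiclass problem can be handled in a similar spirit, for example by training one such binary model per class and picking the class with the highest probability.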

Machine learning clustering

The most widely used technique in unsupervised machine learning is clustering. Machine learning clustering uses data analysis to discover hidden patterns or groupings in data sets. Unlike classification, where the groups are defined in advance by labels, clustering has to find the groups on its own. Some of the most common applications of clustering include genetics, market research, object recognition, and more. Clustering algorithms include k-means, hierarchical clustering, Gaussian mixture models, self-organizing maps, subtractive clustering, and more.
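
To show what a clustering algorithm actually does, here is a bare-bones sketch of k-means. It makes simplifying assumptions that a real implementation would not: the first k points serve as the initial centroids, and the six sample points are invented so that the two groups are obvious.

```python
import math


def k_means(points, k, iterations=100):
    """Basic k-means; uses the first k points as initial centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical 2-D points forming two visually separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
cluster_labels, centers = k_means(points, k=2)
print(cluster_labels)  # [0, 1, 0, 1, 0, 1]
```

The cluster numbers themselves carry no meaning; what matters is which points end up sharing a number.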

There are two strategies for hierarchical clustering algorithms: agglomerative and divisive. 

Different distance metrics and linkage criteria produce groupings of different shape and quality, so care should be taken to match them to the specific problem.
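
The effect of the linkage criterion is easy to see on a small example. The sketch below implements the agglomerative (bottom-up) strategy for one-dimensional values and lets the caller choose the linkage; the divisive strategy is not shown. The data set is hypothetical and was picked so that single and complete linkage disagree.

```python
from itertools import combinations


def agglomerative(values, n_clusters, linkage=min):
    """Bottom-up clustering of 1-D values.

    `linkage` reduces the pairwise distances between two clusters to a single
    number: `min` gives single linkage, `max` gives complete linkage.
    """
    clusters = [[v] for v in values]
    while len(clusters) > n_clusters:
        # Find the pair of clusters that is closest under the chosen linkage.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(
                abs(a - b) for a in clusters[pair[0]] for b in clusters[pair[1]]
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return [sorted(c) for c in clusters]


data = [1.0, 2.0, 3.5, 5.5, 7.75]
print(agglomerative(data, 2, linkage=min))  # [[1.0, 2.0, 3.5, 5.5], [7.75]]
print(agglomerative(data, 2, linkage=max))  # [[1.0, 2.0], [3.5, 5.5, 7.75]]
```

Single linkage chains neighboring points together and isolates the last value, while complete linkage prefers compact groups and splits the data closer to the middle.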

Benefits and challenges of machine learning in IT solutions

Here are some of the benefits of machine learning in IT solutions:

  • Discovery of trends and patterns: Examining monumental volumes of data is made easy with machine learning techniques. This examination helps discover new trends and patterns that highlight important relationships between data sets that would, more than likely, be very hard to pinpoint with human involvement alone.
  • Continuous improvement: Within the context of Machine Learning, algorithms can continuously improve the performance of a system by using insights from historical data. With an ongoing influx of information, machine learning systems gain experience - supporting better, more informed decision-making and improving themselves constantly.
  • Responsiveness and adaptive behavior without human intervention: With machine learning technology, systems are continuously gathering knowledge to improve without human intervention. As more data comes into play, machine learning systems develop more strategies to adapt quickly and be responsive enough to act in the face of new challenges.

Here are some of the challenges of machine learning in IT solutions: 

  • Susceptibility to errors: Because the essence of machine learning is to have an autonomous system, the late discovery of errors can be a problem, causing great stress on a system. If there are faulty data sets, it may be hard to identify these corrupted bits of information, and they can render an entire system useless.
  • Time and monetary investment can be high: Because machine learning is data-driven, massive amounts of data are involved, requiring a lot of time and resources. This investment can be costly and time-consuming, similar to a training period.
  • Excessive automation: As incredible as it seems, there is such a thing as too much automation. Machine learning systems are taught to operate on their own, and this has a lot of developers shaking in their boots as they relinquish control.

Machine learning can be implemented with a variety of languages, libraries, and frameworks, for instance, MATLAB, PyBrain, TensorFlow, R, scikit-learn, SciPy, Octave, and NumPy.

Conclusion

With this article, we hope to get you started on the essentials of the different types of machine learning along with the algorithms that solve IT problems.

Machine Learning at its best is achieved through the right combination of tools, techniques, and algorithms. This combination is no stroke of luck; it takes knowledge to implement a machine learning process that is suitable for your needs. It also helps to have a data scientist or analyst on the team.

Choosing the right type of machine learning can seem overwhelming, especially given the numerous algorithms associated with each type and their different approaches to learning. There is no one-size-fits-all solution; the right algorithm and type of machine learning are oftentimes discovered through trial and error.

Experienced data scientists try out algorithms to see what works best under different circumstances before committing to a specific type, taking into consideration the size of the problem and the type of output you want to obtain.

To harness the power of the different types of machine learning, Svitla Systems provides you with experienced personnel who will help you use data to make better decisions through the power of the right combination of machine learning techniques. Our company has experience in different business areas to provide qualified solutions in machine learning, including neural networks. With Svitla Systems as your partner, machine learning endeavors are made easy, fostering the ideal environment to apply machine learning to data analytics and other tasks.