Before we get into the different types of learning, let's see how we solve a practical problem using machine learning. The general purpose of ML is to solve practical problems. There are two main steps: data collection and model building.
Data collection is the process of gathering and measuring information from different sources. Gathering data is an important step in any ML problem. There are four different types of data: text, time-series data, numerical data, and categorical data.
Here we build a statistical model from the collected data set using a machine learning algorithm. The model is not arbitrary: it is built in such a way that it can solve the practical problem at hand.
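As a rough illustration of the two steps, the sketch below "collects" a tiny data set and then fits a straight line to it by least squares. The data (hours studied versus exam score) and the names `fit_line` and `predict` are made up for this example; a real problem would use real measurements and usually a library implementation.

```python
# Step 1: data collection (hypothetical measurements, invented for illustration).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Step 2: model building (the "model" here is just a slope and an intercept).
slope, intercept = fit_line(hours, scores)


def predict(x):
    """Use the fitted model to estimate the score for a new input."""
    return slope * x + intercept
```

Once fitted, `predict(6.0)` gives the model's estimate for an input that was not in the collected data, which is the sense in which the model "solves" the practical problem.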
Types of learning can be distinguished by their outputs, for example regression, classification, and ranking. Here we instead discuss the categories of learning that depend on the input. There are four types of learning.
Dataset \(\rightarrow\) Person
Each example \(x_i\) \(\rightarrow\) represents a particular person
Feature value \(x_i^{(j)}\) \(\rightarrow\) \(x_i^{(1)}\): height, \(x_i^{(2)}\): weight, \(x_i^{(3)}\): blood sugar, ...
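The notation above can be sketched in code as a list of feature vectors. The numbers, the units, and the helper `feature` are assumptions made for illustration only.

```python
# Hypothetical dataset of N = 3 people; each row is one feature vector x_i.
# Assumed column order: height (cm), weight (kg), blood sugar (mg/dL).
FEATURE_NAMES = ["height", "weight", "blood_sugar"]
dataset = [
    [170.0, 65.0, 90.0],   # x_1
    [182.0, 80.0, 105.0],  # x_2
    [158.0, 52.0, 85.0],   # x_3
]


def feature(i, j):
    """Return x_i^(j) using the 1-based indices of the text."""
    return dataset[i - 1][j - 1]
```

With this layout, `feature(1, 1)` and `feature(2, 1)` both read from the first column, so they are the heights of the first and second person.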
Here the same index \(j\) always refers to the same kind of information. For example, \(x_1^{(1)}\) and \(x_2^{(1)}\) represent the heights of the first and second person, respectively. Labels can be classified in three ways.
In unsupervised learning, the computer is trained with unlabelled data. Here the dataset \(\{x_i\}_{i=1}^{N}\) consists only of feature vectors, with no labels attached. The model takes a feature vector as input and transforms it into a value or another vector that helps you solve the problem.
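One common instance of this idea is clustering, where the "value" produced for each feature vector is a cluster index. The sketch below is a minimal one-dimensional k-means, assuming made-up height data and hand-picked initial centers; it is meant to show the shape of the computation, not a production implementation.

```python
def kmeans_1d(points, centers, iterations=10):
    """Cluster 1-D points around the given initial centers (plain k-means)."""
    centers = list(centers)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignments, centers


# Unlabelled, hypothetical heights in cm: no label is attached to any value.
heights = [150.0, 152.0, 155.0, 180.0, 183.0, 185.0]
assignments, centers = kmeans_1d(heights, centers=[150.0, 185.0])
```

Here `assignments` holds one cluster index per input, so the model has turned each unlabelled feature value into a group membership without ever seeing a label.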
Examples
How is data used in semi-supervised learning? Answering this question explains why this approach works. First, we use an unsupervised learning algorithm to cluster similar data; then we use the existing labelled data, in a supervised manner, to label the remaining unlabelled data.
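The cluster-then-label procedure described above can be sketched as follows. The data, the labels `"child"` and `"adult"`, and the helpers `nearest` and `cluster` are all hypothetical, and this is only one simple way to do semi-supervised learning.

```python
from collections import Counter


def nearest(point, centers):
    """Index of the center closest to a 1-D point."""
    return min(range(len(centers)), key=lambda k: abs(point - centers[k]))


def cluster(points, centers, iterations=10):
    """Plain 1-D k-means; returns the final centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Hypothetical data: two labelled heights and several unlabelled ones.
labelled = [(150.0, "child"), (185.0, "adult")]
unlabelled = [148.0, 152.0, 155.0, 178.0, 181.0, 190.0]

# Step 1 (unsupervised): cluster all points, ignoring the labels.
all_points = [x for x, _ in labelled] + unlabelled
centers = cluster(all_points, centers=[min(all_points), max(all_points)])

# Step 2 (supervised signal): each cluster takes the majority label
# of the labelled examples that fall inside it.
votes = {}
for x, y in labelled:
    votes.setdefault(nearest(x, centers), Counter())[y] += 1
cluster_label = {k: c.most_common(1)[0][0] for k, c in votes.items()}

# Step 3: propagate each cluster's label to its unlabelled points.
predicted = {x: cluster_label.get(nearest(x, centers)) for x in unlabelled}
```

A cluster that contains no labelled example gets no entry in `cluster_label`, so its points are mapped to `None`; this is one reason the approach depends on the labelled data covering every cluster.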
Which of these three types of learning is better? We choose the type of model according to the situation. If we have enough labelled data, we use supervised learning; semi-supervised learning is used when we have little labelled data and much more unlabelled data. Supervised learning usually performs better than the other two, and semi-supervised learning usually performs better than unsupervised learning.
Supervised \(>\) Semi-supervised \(>\) Unsupervised
One caveat should be noted. The ordering above holds in general, but if we have, say, 400 labelled examples for supervised learning versus 300 labelled and 5000 unlabelled examples for semi-supervised learning, it is not easy to say which model works better.
Examples - Game playing, Robotics, Chess