Photo by Markus Spiske on Unsplash
Types of Classification in Machine Learning: A Guide for Beginners
Hello machine learning enthusiasts! Today we will dive deep into classification in machine learning. Here are the things we will discuss in this blog post:
What is classification in machine learning?
What are the types of classification?
Common terminology used in classification
Algorithms used for classification
By the end of this blog post you should have a solid understanding of classification in machine learning. So, without further ado, let's start with the very first question that comes to mind.
What is classification in Machine learning?
First of all, before going straight to the general definition of classification in machine learning, let me give you a short recap of the type of machine learning we use to do classification.
Classification is a type of problem that we solve using supervised machine learning, which is the type of machine learning used when the data we have is labeled. Labeled data means that for every feature or group of features there is a particular output, called a label, attached to it. Now let's get straight to the general definition that you could write in your exams: classification is a supervised learning task in which a model learns from labeled examples to assign each new input to one of a fixed set of categories, called classes.
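To make the idea of labeled data concrete, here is a minimal sketch in plain Python. The feature names and all the numbers are made up for illustration; they are assumptions, not real measurements.

```python
# Hypothetical labeled dataset: each row pairs a group of features with one label.
# Features: (cholesterol in mg/dL, systolic blood pressure in mmHg). Values are invented.
labeled_data = [
    ((180, 118), "not prone"),
    ((250, 150), "prone"),
    ((195, 122), "not prone"),
    ((270, 160), "prone"),
]

# Supervised learning conventionally separates the features (X) from the labels (y).
X = [features for features, label in labeled_data]
y = [label for features, label in labeled_data]

# Every group of features has exactly one label attached to it.
print(X[1], "->", y[1])
```

The model learns from the pairs in `X` and `y`, and later predicts a label for features it has never seen.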
Now that you know what classification in machine learning is, let us take a look at the types of classification.
Types of Classification in Machine Learning
Overall there are 4 types of classification in machine learning. The first 3 are the most widely used, but you should also be aware of the 4th.
Binary Classification
Multi-Class Classification
Multi-Label Classification
Imbalanced Classification
Now we will discuss each type of classification, and in the discussion you will learn how these types differ from one another, including how they look different when you plot the output.
Binary Classification: Binary classification is the most common type of classification. The model takes the features we feed into it and assigns each sample to one of exactly 2 categories.
For example: In my 2nd semester of university we built a machine learning model that predicts whether a person is prone to heart disease based on features such as cholesterol and blood pressure. There are only 2 possible outputs for this kind of problem (Prone to heart disease / Not prone to heart disease). Since the model can place each prediction into only 2 categories, this is a binary classification problem.
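Here is a rough sketch of what such a binary classifier could look like, using a tiny logistic regression written in plain Python. The patient data, the scaling constants, and the function names (`scale`, `train`, `predict`) are all assumptions made for illustration; this is not the model from the university project, and the numbers are not clinical thresholds.

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1), read here as a probability.
    return 1.0 / (1.0 + math.exp(-z))

def scale(cholesterol, blood_pressure):
    # Center and shrink the raw values so gradient descent behaves well.
    # The constants 200, 50, 130 and 20 are illustrative, not medical cut-offs.
    return ((cholesterol - 200) / 50, (blood_pressure - 130) / 20)

def train(X, y, learning_rate=0.1, epochs=1000):
    # Stochastic gradient descent on the logistic (log) loss.
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            z = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(z) - label
            weights = [w - learning_rate * error * f for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict(weights, bias, features):
    # Returns 1 (prone) if the predicted probability is at least 0.5, else 0 (not prone).
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

# Invented training data: (cholesterol, blood pressure) and a 0/1 label.
raw_patients = [(180, 118), (195, 122), (170, 110), (250, 150), (270, 160), (240, 145)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = not prone, 1 = prone

X = [scale(c, bp) for c, bp in raw_patients]
weights, bias = train(X, labels)

print(predict(weights, bias, scale(280, 165)))  # a new patient with high readings
print(predict(weights, bias, scale(165, 108)))  # a new patient with low readings
```

Whatever the input, the model can only answer 0 or 1, which is exactly what makes the problem binary.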
Are there any supervised machine learning algorithms that can be used for binary classification?
The answer to this question is YES! There are several popular supervised machine learning algorithms that you could use in your next machine learning project involving binary classification. The 5 most commonly used are:
Logistic Regression
k-Nearest Neighbors
Support Vector Machine
Decision Trees
Naive Bayes
Of the algorithms mentioned above, a few are designed, in their basic form, for binary classification only. These are Logistic Regression and Support Vector Machines.
Multi-Class Classification: Multi-class classification is a type of classification in which the output (prediction) generated by the machine learning model falls into one of more than 2 classes. Each sample still receives exactly one class. Now carefully go through the example below.
Let us assume that we want to build a machine learning model that tells us the name of a fruit when we provide an image or sample of it. There are not just 2 or 3 kinds of fruit but many, so the model has to choose among many classes, and fruits that share the same features can still belong to different classes. The supporting illustration is given below.
In the table above, you can see that the same features (round and small) can describe 2 kinds of fruit (apple or lemon), and a feature like roundness alone is shared by 4 kinds of fruit. Since the model must pick one class out of more than 2, this is a multi-class classification problem.
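The fruit example can be sketched with a small k-Nearest Neighbors classifier in plain Python. The features used here (diameter in cm, weight in grams), the three fruit classes, and every number are assumptions chosen for illustration; they do not come from the table.

```python
import math
from collections import Counter

# Invented training data: ((diameter_cm, weight_g), fruit name).
training_data = [
    ((7.5, 180), "apple"), ((8.0, 200), "apple"), ((7.0, 170), "apple"),
    ((5.5, 100), "lemon"), ((6.0, 110), "lemon"), ((5.0, 90), "lemon"),
    ((2.0, 5), "grape"), ((1.8, 4), "grape"), ((2.2, 6), "grape"),
]

def knn_predict(training_data, query, k=3):
    # Sort the training samples by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(training_data, key=lambda item: math.dist(item[0], query))[:k]
    # The predicted class is the most common label among those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(knn_predict(training_data, (7.8, 190)))
print(knn_predict(training_data, (5.6, 105)))
print(knn_predict(training_data, (2.1, 5)))
```

The only difference from the binary case is the number of possible answers: the model picks one class out of three instead of one out of two.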
Just like binary classification, there are several algorithms that you can use to build a machine learning model when the problem statement involves multi-class classification.
Random Forest
k-Nearest Neighbors
Support Vector Machine
Decision Trees
Gradient Boosting
Wait a minute, previously we said that some of the algorithms we use for binary classification are designed specifically for binary classification only, so why are we using an algorithm like Support Vector Machine here?!
The answer is that some binary classification algorithms cannot be used directly for multi-class classification but can be adapted to do so. The 2 main examples are Logistic Regression and Support Vector Machine.
But how is this possible?
Binary classification algorithms can be used for multi-class classification problems through 2 different strategies:
One vs Rest
One vs One
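As a quick preview, the sketch below shows only how each strategy breaks a multi-class problem into binary problems; it does not train any model. The class names and the helper `relabel_one_vs_rest` are hypothetical, chosen just for this illustration.

```python
from itertools import combinations

classes = ["apple", "lemon", "grape", "orange"]

# One vs Rest: one binary problem per class (that class against all the others).
one_vs_rest = [(c, "rest") for c in classes]

# One vs One: one binary problem per unordered pair of classes.
one_vs_one = list(combinations(classes, 2))

def relabel_one_vs_rest(labels, positive_class):
    # Turn multi-class labels into 0/1 labels for a single One vs Rest problem.
    return [1 if label == positive_class else 0 for label in labels]

print(len(one_vs_rest))  # K binary problems for K classes
print(len(one_vs_one))   # K * (K - 1) / 2 binary problems for K classes
print(relabel_one_vs_rest(["apple", "lemon", "apple", "grape"], "apple"))
```

With 4 classes, One vs Rest needs 4 binary classifiers and One vs One needs 6, so the second strategy grows faster as the number of classes increases.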
Both of these strategies will be discussed in detail in the next blog post, so for now let's jump straight into the remaining 2 types.
Multi-Label Classification: Multi-label classification is a generalization of multi-class classification. In multi-class classification each sample belongs to exactly one class, whereas in multi-label classification a single sample can carry several labels at the same time. Let us take a look at the example below to make this clearer.
For example, suppose we are building a machine learning model that recognizes fruit, but the unseen data we give the model as input is photos of fruit inside a refrigerator. We all know that a refrigerator can also contain vegetables, milk, water bottles, etc., and each of these is a different label. A single photo can therefore have several correct labels at once, so this is a multi-label classification problem.
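One common way to represent multi-label targets is a 0/1 indicator vector with one position per possible label. The sketch below assumes a fixed label list and some invented refrigerator photos; the names `all_labels`, `photos`, and `to_indicator_vector` are hypothetical.

```python
# The full set of labels the model is allowed to output (an assumption for this sketch).
all_labels = ["fruit", "vegetable", "milk", "water bottle"]

# Invented samples: each photo can have several labels at once, or none at all.
photos = {
    "fridge_photo_1": {"fruit", "milk"},
    "fridge_photo_2": {"fruit", "vegetable", "water bottle"},
    "fridge_photo_3": set(),
}

def to_indicator_vector(labels_present, all_labels):
    # 1 where the label applies to the sample, 0 where it does not.
    return [1 if label in labels_present else 0 for label in all_labels]

for name, labels_present in photos.items():
    print(name, to_indicator_vector(labels_present, all_labels))
```

Each position in the vector can then be treated as its own yes/no question, which is why binary classifiers can be reused for multi-label problems.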
Just like the other types, which algorithms do we use for multi-label classification?
We can use the same classification algorithms that we used for binary or multi-class classification, but in versions specialised for multi-label output, which are mentioned below:
Multi-label Decision Trees
Multi-label Gradient Boosting
Multi-label Random Forests
Imbalanced Classification: Imbalanced classification is a type of classification in which the number of samples in each class is unequal, often heavily so.
For example: Let us assume that we want to build a machine learning model to detect spam emails, but the dataset we have contains an imbalanced number of samples for the classes (SPAM / NO SPAM). By imbalanced I mean that there are far more SPAM samples than NO SPAM samples.
This is still a binary classification problem, but it needs some additional techniques that change the composition of samples in the training dataset by undersampling the majority class or oversampling the minority class. The names of these techniques are:
SMOTE Oversampling.
Random Undersampling.
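The sketch below shows random undersampling, plus plain random oversampling as the simplest counterpart. Note that this is not SMOTE: SMOTE creates new synthetic minority samples by interpolating between neighbouring minority samples, while the oversampling here just duplicates existing ones. The dataset (100 email ids, 90 SPAM and 10 NO SPAM) and the function names are assumptions for illustration.

```python
import random
from collections import Counter

def group_by_class(samples, labels):
    # Collect the samples of each class into their own list.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    return by_class

def random_undersample(samples, labels, seed=0):
    # Randomly drop samples from larger classes until every class matches the smallest one.
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = min(len(group) for group in by_class.values())
    new_samples, new_labels = [], []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            new_samples.append(sample)
            new_labels.append(label)
    return new_samples, new_labels

def random_oversample(samples, labels, seed=0):
    # Randomly duplicate samples from smaller classes until every class matches the largest one.
    rng = random.Random(seed)
    by_class = group_by_class(samples, labels)
    target = max(len(group) for group in by_class.values())
    new_samples, new_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            new_samples.append(sample)
            new_labels.append(label)
    return new_samples, new_labels

# Invented imbalanced dataset: email ids 0..99, with 90 SPAM and 10 NO SPAM.
emails = list(range(100))
labels = ["SPAM"] * 90 + ["NO SPAM"] * 10

_, under_labels = random_undersample(emails, labels)
_, over_labels = random_oversample(emails, labels)

print(Counter(labels))
print(Counter(under_labels))
print(Counter(over_labels))
```

Undersampling throws information away and oversampling repeats it, so in practice both should be applied to the training data only, never to the data used for evaluation.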
Also, in order to pay more attention to the minority class when fitting the model on the training dataset, we can use cost-sensitive machine learning algorithms, some of which are:
Cost-sensitive Support Vector Machines.
Cost-sensitive Decision Trees.
Cost-sensitive Logistic Regression.
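The core idea behind cost-sensitive learning is that mistakes on the minority class are made more expensive. One widely used heuristic, and the one assumed in this sketch, weights each class by n_samples / (n_classes * class_count). The dataset and the function names are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    # Rare classes get large weights, common classes get small weights.
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

def weighted_error_cost(true_labels, predicted_labels, weights):
    # Total cost of the mistakes, where each mistake costs the weight of its true class.
    return sum(
        weights[true]
        for true, predicted in zip(true_labels, predicted_labels)
        if true != predicted
    )

# Invented imbalanced dataset: 90 SPAM and 10 NO SPAM.
labels = ["SPAM"] * 90 + ["NO SPAM"] * 10
weights = balanced_class_weights(labels)
print(weights)

# A lazy model that always answers SPAM makes 10 mistakes, all on the minority class.
always_spam = ["SPAM"] * 100
print(weighted_error_cost(labels, always_spam, weights))
```

Without weights, always answering SPAM looks 90% accurate. With weights, its 10 mistakes cost as much as misclassifying all 90 SPAM emails would, so the model is pushed to take the minority class seriously.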
Short note
I hope you now have a good understanding of what classification in machine learning is and what the types of classification are. If you liked this blog or have any suggestions, kindly like it or leave a comment below. It would mean a lot to me.
Also, I would love to connect with you, so here are my Twitter and LinkedIn.