I hope you are doing well. Today we will discuss an important topic in model evaluation: precision. By the end of this blog you will know what precision is, why we use it, and how to calculate it for binary and multi-class classification. So let's get started.
Why even consider precision?
Precision is an important evaluation metric used to assess the performance of machine learning models that solve classification problems. In a classification problem, the model predicts the output class for a given input data point.
Before diving into the calculation of precision, it is essential to understand why we use this metric instead of relying solely on accuracy. Are there any drawbacks to using accuracy exclusively?
The answer is yes. Accuracy may not be a reliable evaluation metric when dealing with imbalanced data. To illustrate this, let's consider the example of building a model to detect genuine or fake job postings. This scenario involves binary classification, where it is expected that the data distribution among the labels will be unequal. Specifically, the number of fake job postings will generally be lower than that of genuine postings.
Say that out of 100 job postings, 95 are genuine and 5 are fake. A model that labels every posting as genuine is then 95% accurate, so judging by accuracy alone we would conclude that the model is extremely good. However, this conclusion would be misleading, because accuracy says nothing about how the model performs on fake job postings, which is the critical task. In this case it catches none of them.
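To make this pitfall concrete, here is a minimal sketch in plain Python. The 95/5 split comes from the example above. The label encoding (1 = fake, 0 = genuine) and the helper names `predict_all_genuine` and `accuracy` are assumptions made for this illustration only.

```python
# Assumed encoding for this sketch: 1 = fake posting, 0 = genuine posting.
y_true = [0] * 95 + [1] * 5


def predict_all_genuine(n):
    """Hypothetical baseline that labels every posting as genuine."""
    return [0] * n


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


y_pred = predict_all_genuine(len(y_true))
acc = accuracy(y_true, y_pred)

# Count how many fake postings the baseline actually flagged as fake.
fakes_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy = {acc:.2f}")
print(f"fake postings caught = {fakes_caught}")
```

The baseline scores 95% accuracy while catching zero fake postings, which is exactly the blind spot that precision and related metrics are meant to expose.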
Precision for binary classification
Formally, precision is the fraction of predicted positives that are true positives. In short, precision tells us how accurate the model is when it predicts the positive label.
To understand the precision formula rather than just memorize it, you need to know about the confusion matrix. The confusion matrix is another tool for evaluating a model that solves a classification problem. It addresses a major drawback of accuracy as a metric: accuracy gives us no information about the type of mistakes the model is making (Type 1 errors, i.e. false positives, or Type 2 errors, i.e. false negatives).
Coming back to precision, let us assume we are solving a binary classification problem: predicting cat or dog from an input image. In the confusion matrix, the predicted labels run along the x-axis and the actual labels along the y-axis.
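As a rough sketch of how the four cells of a binary confusion matrix are tallied, the snippet below counts them from two label lists. The function name `confusion_counts` and the toy labels are made up for illustration, and "cat" is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive):
    """Tally the four cells of a binary confusion matrix."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            # Predicted positive: correct (TP) or a Type 1 error (FP).
            counts["TP" if actual == positive else "FP"] += 1
        else:
            # Predicted negative: a Type 2 error (FN) or correct (TN).
            counts["FN" if actual == positive else "TN"] += 1
    return counts


# Toy labels made up for illustration; "cat" is treated as the positive class.
y_true = ["cat", "cat", "dog", "dog", "cat", "dog"]
y_pred = ["cat", "dog", "dog", "cat", "cat", "dog"]

counts = confusion_counts(y_true, y_pred, positive="cat")
print(counts)
```

Only the predicted-positive column (TP and FP) matters for precision. The FN and TN cells feed into other metrics.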
Let's recall the definition once more: precision is the fraction of predicted positives that are true positives. Since this is a binary classification problem, we focus only on the positive label, which we take here to be the cat label.
From the confusion matrix we can see that among the 50 predicted positives (TP = 45 and FP = 5), only 45 are true positives. So the precision in this case is simply
Precision = TP / (TP + FP) = 45 / (45 + 5) = 0.9
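The same calculation can be checked with a few lines of Python. This sketch uses the counts from the example (TP = 45, FP = 5). The function name `precision` and the choice to return 0.0 when nothing is predicted positive are assumptions made here, not part of the definition.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    predicted_positives = tp + fp
    if predicted_positives == 0:
        # Assumed convention for this sketch: avoid dividing by zero
        # when the model never predicts the positive class.
        return 0.0
    return tp / predicted_positives


p = precision(tp=45, fp=5)
print(f"precision = {p:.2f}")
```

Note that false negatives and true negatives never enter the formula: precision only asks how trustworthy a positive prediction is.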
Precision for multi-class classification
In binary classification our focus is mostly on the positive class, such as spam or fraud, and not on the negative class, such as not spam or not fraud. In multi-class classification, however, we cannot use this idea of choosing only the positive class, because there is no built-in notion of a positive or negative class.
To understand this better, let us assume we are building a classification model that assigns a class to each query point (input data). The classes are apple, orange, and mango. In such a scenario there is no notion of positive or negative labels, so we calculate the precision of every class.
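In multi-class classification we calculate the precision for every class, each time treating that class as positive and all the other classes as negative. Picking only the class with the maximum value would hide the classes the model handles poorly, so the per-class values are usually reported individually or averaged into a single score (for example, macro averaging).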
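Here is a minimal sketch of the per-class calculation for the apple, orange, and mango example. The toy labels and the function name `per_class_precision` are made up for illustration. The macro average at the end is one common way to summarize the per-class values and is an addition of this sketch.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Precision of each class, treating it as positive and the rest as negative."""
    predicted = Counter(y_pred)  # how often each class was predicted
    # True positives per class: predictions that match the actual label.
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    classes = sorted(set(y_true) | set(y_pred))
    return {
        c: (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in classes
    }


# Toy labels made up for illustration.
y_true = ["apple", "apple", "orange", "orange", "mango", "mango", "mango", "apple"]
y_pred = ["apple", "orange", "orange", "orange", "mango", "apple", "mango", "apple"]

scores = per_class_precision(y_true, y_pred)
best_class = max(scores, key=scores.get)      # class with the highest precision
macro = sum(scores.values()) / len(scores)    # unweighted mean over classes

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
print(f"highest precision: {best_class}")
print(f"macro average: {macro:.3f}")
```

On this toy data mango has the highest precision, while apple and orange are lower. Looking at all three values (or their average) gives a fuller picture than the maximum alone.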
Short note
I hope you now have a good understanding of what precision is and how to calculate it for binary and multi-class classification problems. If you liked this blog or have any suggestions, kindly leave a comment below; it would mean a lot to me.