Classification metric: Accuracy

Looking for a single resource that covers accuracy as a classification metric? You are in the right place. By the end of this piece you will understand what accuracy is, why we use it, and where it falls short. So let's get started.

Note: Before you can fully understand accuracy as a classification metric, you need to understand why performance metrics are needed in the first place.

Why do we need performance metrics?

We need metrics to evaluate how well our machine learning model performs. There are two types of performance metrics: regression metrics and classification metrics.

| Regression metrics | Classification metrics |
| --- | --- |
| Regression metrics are the measures that allow us to assess how well a machine learning model performs when solving a regression problem. | Classification metrics are the measures that allow us to assess how well a machine learning model performs when solving a classification problem. |
| Examples: mean absolute error, mean squared error, root mean squared error, etc. | Examples: accuracy, confusion matrix, precision, recall, and F-score |

What is accuracy?

Accuracy is a classification metric that gives the percentage of predictions our machine learning model got right. Mathematically, it is represented as:

$$\text{Accuracy} = \frac{\text{Correct predictions}}{\text{Total predictions}} \times 100$$
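
To make the formula concrete, here is a minimal Python sketch. The function name `accuracy` and the sample labels are illustrative assumptions and do not come from any particular library.

```python
def accuracy(y_true, y_pred):
    """Return accuracy as a percentage: correct predictions / total predictions * 100."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy without any predictions")
    # Count the positions where the predicted label matches the true label
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100 * correct / len(y_true)


# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

acc = accuracy(y_true, y_pred)
print(f"Accuracy: {acc:.1f}%")  # 8 of 10 predictions match -> Accuracy: 80.0%
```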

How much accuracy is good?

This is one of the questions that comes up most often in interviews. The right response is that it depends entirely on the problem statement. To see why, let's look at two cases: in one, 95% accuracy is not acceptable, while in the other it is more than enough.

  • 95% accuracy is not acceptable 💊

    Let us assume that we have built a classification model that predicts, from the clinical images we feed into it, whether a person has a brain tumor.

    Let's say the model reaches 95% accuracy. That still means that about 5 out of every 100 people are misclassified, and some of those may be patients who have a brain tumor that the model failed to detect. Since the stakes in this case are very high, 95% accuracy will not be considered good accuracy.

  • 95% accuracy is good ⛈️

    Let us assume that we have built a machine learning model that uses parameters such as temperature, humidity, and wind speed to predict the date on which it could rain. In this case, even 80% accuracy instead of 95% would be considered good, because the stakes are not as high as in the first example.

The major drawback of accuracy

The major drawback of accuracy is that, although it gives us a single number representing the correctness of the model, it does not tell us what type of error the model is making: a Type I error (false positive) or a Type II error (false negative).

To solve this problem we use a confusion matrix, which lets us not only compute the accuracy of our ML model but also figure out which type of error it is making.
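
As a rough illustration, the sketch below counts the four cells of a binary confusion matrix in plain Python. The helper `confusion_counts` and the two sets of predictions are hypothetical, loosely modeled on the brain tumor example (100 patients, 10 of whom have a tumor).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # correctly flagged
        elif p == positive:
            fp += 1  # Type I error: false alarm
        elif t == positive:
            fn += 1  # Type II error: missed case
        else:
            tn += 1  # correctly cleared
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def accuracy_from_counts(c):
    """Accuracy (in percent) computed from the confusion matrix cells."""
    return 100 * (c["TP"] + c["TN"]) / sum(c.values())


# Hypothetical data: 10 patients with a tumor (1), 90 without (0)
y_true = [1] * 10 + [0] * 90
pred_a = [1] * 5 + [0] * 95   # model A misses 5 of the 10 tumors
pred_b = [1] * 15 + [0] * 85  # model B finds all tumors but raises 5 false alarms

counts_a = confusion_counts(y_true, pred_a)
counts_b = confusion_counts(y_true, pred_b)

for name, c in (("A", counts_a), ("B", counts_b)):
    print(f"Model {name}: accuracy={accuracy_from_counts(c):.1f}% counts={c}")
```

Both hypothetical models score 95% accuracy, yet model A makes 5 Type II errors while model B makes 5 Type I errors. Accuracy alone cannot tell them apart.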

When accuracy can be misleading

Accuracy can be misleading when the dataset is skewed (i.e., when some classes are much more frequent than others). For example, consider a dataset where 99% of the examples belong to one class and 1% belong to the other. A model that simply predicts the majority class all the time would achieve an accuracy of 99%, but it would not be very useful, because it never identifies a single example of the minority class. In this case, metrics like precision, recall, and F1 score, which measure how well a particular class is predicted rather than overall correctness, would be more informative.
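
The 99%/1% scenario can be sketched in a few lines of Python. The dataset size (1,000 examples) and the helper name `precision_recall_f1` are assumptions made for illustration. Precision and recall are reported as 0 when their denominator is 0, which is one common convention.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the chosen positive class (0.0 when undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical skewed dataset: 990 majority-class (0) and 10 minority-class (1) examples
y_true = [0] * 990 + [1] * 10
# A "model" that always predicts the majority class
y_pred = [0] * 1000

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
acc = 100 * correct / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive=1)

print(f"Accuracy:  {acc:.1f}%")  # 99.0%
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")  # all 0.00
```

Despite the 99% accuracy, recall on the minority class is 0, which exposes the problem immediately.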

That's all for now, and I hope this blog was useful to you. Also, don't forget to check out my 👉 TWITTER handle if you want to receive daily content on data science, mathematics for machine learning, Python, and SQL in the form of threads.

Support Yuvraj Singh by becoming a sponsor. Any amount is appreciated!
