Loss functions vs Metrics: The subtle distinctions

Sanskar Sharma
2 min read · Dec 12, 2022


To optimally train a neural network it is crucial to select the right loss function and evaluation metrics.

Although they may seem similar at first glance, there are some differences worth knowing.

What is a Loss Function?

The loss function, also known as the cost function, is what we optimize when training a neural network. The majority of neural networks are trained with gradient descent algorithms, and these algorithms all share the same goal: to minimize the loss function.

The loss function distills the model’s performance into a single number. Minimizing this value simply means that the model will perform better.
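To sketch the idea, here is a minimal gradient descent loop on a hypothetical one-parameter model (y_hat = w * x) with made-up data chosen so the optimum is w = 2. This is illustrative only, not a real training setup:

```python
# Made-up data: y = 2 * x, so the best parameter is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0    # initial parameter
lr = 0.05  # learning rate

for _ in range(100):
    # Gradient of L = mean((w*x - y)^2) with respect to w:
    # dL/dw = mean(2 * x * (w*x - y))
    grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
    # Step in the direction that lowers the loss.
    w -= lr * grad

print(round(w, 3))  # converges close to 2.0
```

Each step moves the parameter downhill on the loss surface; the single scalar loss value is all the optimizer ever sees.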

Some popular loss functions are:

  • Mean Squared Error (MSE)
  • Cross Entropy Loss
Minimizing the loss function to find a local minimum [Source]
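To make the two losses above concrete, here is a small sketch with made-up predictions and labels (the arrays are hypothetical, chosen only for illustration):

```python
import numpy as np

# Regression example: Mean Squared Error (MSE).
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.1, 1.9, 3.2])
mse = np.mean((y_true - y_pred) ** 2)
print(round(float(mse), 4))  # 0.02

# Binary classification example: cross-entropy loss.
labels = np.array([1, 0, 1])        # true class labels
probs = np.array([0.9, 0.2, 0.8])   # predicted probabilities of class 1
bce = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
print(round(float(bce), 3))
```

Both reduce a whole set of predictions to one number, which is exactly what gradient descent needs.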

What is a Metric?

A metric is an indicator of how well your model is performing. It helps us determine how well “fitted” the model is, given the predicted values and the actual values.

Metrics are used to evaluate and compare models. The better the metric value, the better the model’s performance.

Some popular metrics are:

  • Mean Squared Error (MSE)
  • Confusion Matrix
  • Accuracy / Recall / Precision
The confusion matrix is a popular metric for classification problems [Source]
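As an illustration, the common classification metrics can be computed directly from the four cells of a binary confusion matrix (the counts below are made up):

```python
# Hypothetical binary confusion matrix:
#                 predicted 0   predicted 1
# actual 0            50            10        (TN, FP)
# actual 1             5            35        (FN, TP)
tn, fp, fn, tp = 50, 10, 5, 35

accuracy = (tp + tn) / (tp + tn + fp + fn)  # fraction of all correct predictions
precision = tp / (tp + fp)                  # of predicted positives, how many were right
recall = tp / (tp + fn)                     # of actual positives, how many were found

print(accuracy)             # 0.85
print(round(precision, 3))  # 0.778
print(recall)               # 0.875
```

Each of these is a human-readable summary; none of them is ever fed back into the optimizer.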

What are the differences?

Simply put, the loss function is for machines, and the metric is for humans.

The loss function is what the machine tries to minimize in order to optimize the machine learning model.

Metrics are used by people to evaluate the performance of machine learning models and have nothing to do with the optimization process.

Can they be used interchangeably?

A loss function can be used as a metric, but the opposite isn’t always true.

This is due to an important characteristic of loss functions: they must be differentiable (at least almost everywhere), so that gradient descent can compute a direction to move in.

Some loss functions are intuitive: minimizing them clearly improves the model, and they are easy to interpret (like MSE and RMSE).

We need metrics because some loss functions (like cross-entropy loss) are not really intuitive to interpret. As a result, we use metrics like accuracy. However, metrics like accuracy are not differentiable and cannot be used to optimize the model.
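A quick way to see this: nudge a predicted probability slightly and watch cross-entropy change smoothly while accuracy stays flat. The two helper functions below are illustrative sketches, not from any particular library:

```python
import math

def bce(p, y):
    """Binary cross-entropy for a single example with label y and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def acc(p, y):
    """Accuracy for a single example, thresholding the probability at 0.5."""
    return float((p >= 0.5) == bool(y))

y = 1  # true label

# The loss responds smoothly to a small change in the prediction...
print(bce(0.60, y) > bce(0.61, y))  # True: loss decreases a little

# ...but accuracy is a step function: flat almost everywhere,
# so its gradient is zero and gives the optimizer nothing to follow.
print(acc(0.60, y), acc(0.61, y))  # 1.0 1.0 (no change)
```

This is why accuracy is great for reporting but useless as a training signal.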

Conclusion

Although there is an overlap between loss functions and metrics, there are also some subtle distinctions. Knowing these distinctions will allow us to choose appropriate loss functions and metrics, as well as train better models.
