202511171430 Status: idea Tags: Datascience, Machine Learning, Model Evaluation Metrics
Regression Metrics
Regression metrics are essential tools used to evaluate the performance of a regression model. Regression models are statistical models that predict a continuous outcome (a real number) based on one or more input variables.
These metrics measure the difference between the predicted values ($\hat{y}_i$) and the actual observed values ($y_i$), indicating how well the model fits the data and how accurate its predictions are. Three of the most common regression metrics are Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the Coefficient of Determination ($R^2$).
Common Regression Metrics
Mean Squared Error (MSE)
The Mean Squared Error (MSE) is the average of the squared differences between the predicted and actual values:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
- $y_i$: The actual observed value.
- $\hat{y}_i$: The predicted value from the model.
- $n$: The total number of data points.
Key features:
- It penalizes large errors more heavily than small errors because the differences are squared.
- The resulting error unit is squared, which can make it difficult to interpret in the context of the original target variable.
- A lower MSE indicates a better model fit.
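The definition above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `mean_squared_error` and the sample numbers are assumptions made up for this example. Both prediction sets have the same total absolute error (4), which shows how squaring penalizes one large error more than several small ones.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one data point is required")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    return sum(squared_errors) / len(y_true)


# Hypothetical data, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred_small = [2.0, 6.0, 6.0, 10.0]  # four errors of size 1
y_pred_large = [3.0, 5.0, 7.0, 13.0]  # one error of size 4

mse_small = mean_squared_error(y_true, y_pred_small)  # (1 + 1 + 1 + 1) / 4 = 1.0
mse_large = mean_squared_error(y_true, y_pred_large)  # (0 + 0 + 0 + 16) / 4 = 4.0

print(mse_small, mse_large)
```

The single error of size 4 yields an MSE four times larger than four errors of size 1, even though the summed absolute error is identical.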
Root Mean Squared Error (RMSE)
The Root Mean Squared Error (RMSE) is the square root of the MSE:
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$$
Key features:
- It brings the error unit back to the same units as the target variable, making it more interpretable than the MSE.
- Like MSE, a lower RMSE indicates a better model fit.
- It is often preferred for reporting results because the error can be read directly in the units of the target variable.
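As a rough sketch of why the RMSE is easier to read than the MSE, the example below computes both for a small set of made-up house prices. The function name `root_mean_squared_error` and the price figures (in thousands of euros) are assumptions for illustration only.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean of squared differences (same unit as the target)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical house prices in thousands of euros.
y_true = [200.0, 250.0, 300.0, 350.0]
y_pred = [210.0, 240.0, 310.0, 340.0]  # every prediction is off by 10

rmse = root_mean_squared_error(y_true, y_pred)
print(rmse)  # MSE is 100 (thousand euros squared), RMSE is 10.0 (thousand euros)
```

Here the MSE of 100 is in squared units, while the RMSE of 10 can be read as a typical error of about 10 thousand euros.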
Coefficient of Determination ($R^2$)
The Coefficient of Determination ($R^2$) indicates the proportion of the variance in the dependent variable that is predictable from the independent variables:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
- $\bar{y}$: The mean of the actual observed values.
- The numerator is the unexplained variation (the sum of squared errors, the same sum that appears in the MSE before dividing by $n$).
- The denominator is the total variation of the data (the sum of squared errors of a model that always predicts the mean).
Key features:
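- The value typically ranges from 0 to 1. It can be negative when the model predicts worse than simply using the mean, for example on unseen test data.
- An $R^2$ of 1 means the model predicts every observed value exactly (perfect fit).
- An $R^2$ of 0 means the model explains none of the variability of the response data around its mean, so it does no better than always predicting the mean.
- It is a relative, unit-free measure and is often used to compare models fitted to the same target variable.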
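The numerator and denominator described above can be turned into a short Python sketch: one minus the sum of squared errors divided by the sum of squared deviations from the mean. The function name `r_squared` and the data are assumptions for illustration. The last case uses deliberately bad predictions to show that the same formula gives a value below 0 when a model does worse than always predicting the mean.

```python
def r_squared(y_true, y_pred):
    """1 - (sum of squared errors) / (sum of squared deviations from the mean)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical data: mean is 3.0, total sum of squares is 10.0.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0, 5.0])    # no error at all
mean_only = r_squared(y_true, [3.0, 3.0, 3.0, 3.0, 3.0])  # always predicts the mean
reversed_fit = r_squared(y_true, [5.0, 4.0, 3.0, 2.0, 1.0])  # worse than the mean

print(perfect, mean_only, reversed_fit)
```

With these numbers the perfect predictions give 1, the mean-only baseline gives 0, and the reversed predictions give a negative value (a squared-error sum of 40 against a total of 10).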
References
This is material we are learning for Data Science. The information comes from Avans 2-2 Data Science, 2025-11-10, and these slides belong with it.