Development Set vs Test Set

202502012206
tags: #machine-learning #data-splitting #evaluation

The development (validation) set is used for model selection and hyperparameter tuning, while the test set provides a final, unbiased performance evaluation.
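A minimal sketch of how the two sets might be carved out of one dataset, using only the standard library. The function name `split_train_dev_test`, the 15% fractions, and the fixed seed are illustrative assumptions, not recommendations from the reference.

```python
import random

def split_train_dev_test(examples, dev_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint train, dev, and test lists."""
    if not 0 < dev_fraction + test_fraction < 1:
        raise ValueError("dev_fraction + test_fraction must be between 0 and 1")
    shuffled = list(examples)              # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_dev = round(len(shuffled) * dev_fraction)
    n_test = round(len(shuffled) * test_fraction)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]      # everything left over is training data
    return train, dev, test

# Hypothetical dataset: the integers 0..99 stand in for real examples.
train, dev, test = split_train_dev_test(range(100))
```

Shuffling before cutting gives dev and test sets drawn from the same distribution as each other, which only helps if the source data itself resembles what the model will see in production.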

Development Set:

Test Set:

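Key principle: Your dev and test sets should both reflect the data distribution you expect in production. If the dev and test sets come from different distributions, a model tuned on the dev set can score worse on the test set for reasons unrelated to overfitting, which is the problem described in Data Distribution Mismatch.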

Using a Single Number Evaluation Metric on both sets helps make clear comparisons between models.
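As one possible illustration, precision and recall can be folded into a single number with the F1 score (their harmonic mean), so that candidate models can be ranked directly. F1 is an assumed choice here, and the model names and scores below are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical dev-set (precision, recall) pairs for two candidate models.
candidates = {"model_a": (0.95, 0.90), "model_b": (0.98, 0.85)}

# One number per model makes the ranking unambiguous.
scores = {name: f1_score(p, r) for name, (p, r) in candidates.items()}
best = max(scores, key=scores.get)
```

With two separate numbers, neither model clearly wins (one has higher precision, the other higher recall); the single score settles the comparison.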

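Never use test set performance to make model decisions. Every choice made after looking at test results fits the model a little more closely to that particular set, so the reported score stops being an unbiased estimate.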
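The workflow this implies can be sketched with a toy threshold classifier: candidates are compared on the dev set only, and the test set is evaluated once, after the choice is final. The data, the candidate thresholds, and the `accuracy` helper are all made up for illustration.

```python
def accuracy(threshold, examples):
    """Fraction of (x, label) pairs where (x >= threshold) matches label."""
    correct = sum((x >= threshold) == label for x, label in examples)
    return correct / len(examples)

# Toy, hand-made data: (feature, label) pairs.
dev_set = [(0.2, False), (0.4, False), (0.6, True), (0.9, True)]
test_set = [(0.1, False), (0.45, False), (0.7, True), (0.8, True)]

candidate_thresholds = [0.3, 0.5, 0.8]

# Model selection looks only at the dev set.
best_threshold = max(candidate_thresholds, key=lambda t: accuracy(t, dev_set))

# The test set is touched once, after the choice is final.
test_accuracy = accuracy(best_threshold, test_set)
```

If `test_accuracy` turned out to be disappointing, going back to pick a different threshold based on it would turn the test set into a second dev set.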


Reference

Machine Learning Yearning by Andrew Ng