Learning Curves
202502012207
tags: #machine-learning #diagnostics #model-performance
Learning curves plot training and validation error as a function of training set size, helping diagnose Bias vs Variance problems.
High Bias pattern:
- Training and validation errors converge to a high value
- Both errors plateau early
- Adding more data won't help much
- Solution: More complex model, Feature Engineering
High Variance pattern:
- Large gap between training and validation error
- Training error is low, validation error is high
- Gap may decrease with more data
- Solution: More data, Regularization, simpler model
Healthy model:
- Training and validation errors converge to low value
- Small gap between the two curves
- Performance improves with more data
Learning curves help decide whether to collect more data or change the model architecture. They're more informative than single-point accuracy measurements.
Use learning curves alongside Human Level Performance benchmarks to set realistic expectations.
Reference
Machine Learning Yearning by Andrew Ng