Scaling Laws for Neural Language Models
A large-scale empirical investigation of scaling laws shows that performance follows a power-law relationship with model size, dataset size, and training compute, while architectural details have minimal effect.
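As a rough sketch of the functional form the paper reports (notation follows Kaplan et al., 2020, not this summary): test loss falls as a power law in each resource when the others are not the bottleneck,

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C_{\min}) \approx \left(\frac{C_c^{\min}}{C_{\min}}\right)^{\alpha_C^{\min}},
$$

where $N$ is the number of non-embedding parameters, $D$ the dataset size in tokens, $C_{\min}$ the minimum training compute, and the $\alpha$'s are small fitted exponents.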