RegMix: Data Mixture as Regression for Language Model Pre-training Paper β’ 2407.01492 β’ Published Jul 1, 2024 β’ 37