What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 102