Training a DBRX-like model

#13
by nguyenthanhdo - opened

Hi, I've seen the team mention that the code used to train DBRX consists of optimized versions of Composer, LLM Foundry, MegaBlocks, and Streaming, but I found it quite challenging to navigate. I want to pretrain an MoE model (DBRX-like architecture). Could you please guide me on how to do it with the open-source versions of those libraries?
