Reconstructing pre-training data

#1
by pietrolesci - opened

Hi there,

I would like to use BLOOM in my research. Is there a way to reconstruct the tokenised pre-training data in the order in which the model saw it during training?

Thanks,
Pietro
