Fine-tuned input (embed_tokens: Embedding) and output (lm_head: Linear) embedding layers, for use with Birchlabs/llama-13b-stepwise-adapter.
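
A rough sketch of how these weights might be applied. The file names (embed_tokens.pt, lm_head.pt), base model ID, and tokenizer location below are assumptions for illustration, not confirmed details of this repo; check the actual file listings here and in the adapter repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Base model and tokenizer IDs are assumptions; the tokenizer should be the
# grown one produced alongside these embeddings.
model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-13b")
tokenizer = AutoTokenizer.from_pretrained("Birchlabs/llama-13b-stepwise-adapter")

# Resize to the grown vocabulary, then restore the finetuned weights.
model.resize_token_embeddings(len(tokenizer))
model.get_input_embeddings().weight.data = torch.load("embed_tokens.pt")  # hypothetical file name
model.get_output_embeddings().weight.data = torch.load("lm_head.pt")      # hypothetical file name

# Attach the stepwise adapter on top.
model = PeftModel.from_pretrained(model, "Birchlabs/llama-13b-stepwise-adapter")
```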

Prior to finetuning, we grew the vocabulary of the tokenizer and resized the embedding layers to match. The new embedding rows were initialized to the average of the existing embeddings and needed training, so we trained them. These are the weights from that training.
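
A minimal sketch of that vocabulary-growing step (not the exact training script; the base model ID and the added tokens are placeholders):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("huggyllama/llama-13b")
model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-13b")

old_vocab_size = model.get_input_embeddings().weight.shape[0]
tokenizer.add_tokens(["<step>", "<answer>"])  # placeholder new tokens
model.resize_token_embeddings(len(tokenizer))

# Average-initialize the new rows of both embed_tokens and lm_head
# (LLaMA's input and output embeddings are untied, so both grow).
with torch.no_grad():
    for module in (model.get_input_embeddings(), model.get_output_embeddings()):
        mean_embedding = module.weight[:old_vocab_size].mean(dim=0)
        module.weight[old_vocab_size:] = mean_embedding
```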

Ordinarily a QLoRA finetune of an LLM would not finetune the embed_tokens: Embedding. You would need to get a bit creative: not only have its dimensions changed, but I don't believe any way has been established to train adapters over Embedding layers.
Nor, apparently, would it finetune the lm_head: Linear. This is harder than it sounds (you can't handle it the same way you adapt the other Linear layers), because its dimensions have grown.
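
For context, one workaround the PEFT library offers (not necessarily the approach used for this repo) is modules_to_save, which keeps full trainable copies of the named modules and saves them alongside the LoRA adapters rather than adapting them. A minimal sketch:

```python
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    modules_to_save=["embed_tokens", "lm_head"],  # trained in full, not low-rank-adapted
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
```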
