Awan LLM committed
Update README.md

README.md CHANGED
@@ -16,7 +16,7 @@ OpenLLM Benchmark:
 
 
 Training:
-- 2048 sequence length since the dataset has an average
+- 2048 sequence length since the dataset has an average length of under 1000 tokens, while the base model is 8192 sequence length. From testing, it still handles the full 8192 context just fine.
 - Training duration is around 1 day on 2xRTX 3090, using 4-bit loading and QLoRA 64-rank 128-alpha, resulting in ~2% trainable weights.
 
 
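For readers who want to see what this setup looks like in code, below is a minimal sketch of the QLoRA configuration the README describes (4-bit loading, rank 64, alpha 128, 2048-token training sequences), using the standard Hugging Face transformers + peft + bitsandbytes stack. The base model ID, target modules, and dropout value are illustrative assumptions, not taken from this commit.

```python
# Sketch of the QLoRA setup described in the README: 4-bit base weights,
# LoRA rank 64 / alpha 128. Model ID and target modules are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "meta-llama/Meta-Llama-3-8B"  # placeholder 8192-context base model

# Training examples would be tokenized/packed to at most 2048 tokens,
# even though the base model supports 8192.
MAX_SEQ_LEN = 2048

# 4-bit NF4 quantization for loading the frozen base weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter matching the README's "64-rank 128-alpha". The module list
# is an assumption; covering attention and MLP projections is what pushes
# the trainable fraction toward the low single-digit percent range.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
model = get_peft_model(model, lora_config)

# Reports the trainable-parameter fraction, which should land near the
# README's "~2% trainable weights" for adapters over all projections.
model.print_trainable_parameters()
```

With alpha = 2 × r, the LoRA scaling factor alpha/r comes out to 2, a common convention; the exact trainable percentage depends on which modules receive adapters, which the commit does not specify.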