Awan LLM committed on
Commit 1c0edea · verified · 1 Parent(s): 0957687

Update README.md

Files changed (1)
  1. README.md +34 -3
README.md CHANGED
@@ -1,3 +1,34 @@
- ---
- license: llama3
- ---
+ ---
+ license: llama3
+ ---
+ This model is based on Meta-Llama-3-8B-Instruct and is governed by the Meta Llama 3 License agreement:
+ https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct
+
+
+ We found a tokenization mistake in the previous DPO model, so this is a new version that tests DPO training on the following dataset:
+
+ - https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k
+
+
+ We are happy for anyone to try it out and give feedback. If it is popular, we will make the model available through our LLM API at https://awanllm.com.
+
+
+ Instruct format:
+ ```
+ <|begin_of_text|><|start_header_id|>system<|end_header_id|>
+
+ {{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+ {{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+ ```
+
+
+ Quants:
+
+ FP16: https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Instruct-DPO-v0.2
+
+ GGUF: https://huggingface.co/AwanLLM/Awanllm-Llama-3-8B-Instruct-DPO-v0.2-GGUF
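
The instruct format in the README diff above can be assembled programmatically. The following is a minimal sketch, not part of the commit: the function name `build_prompt` and the `role`/`content` dictionary layout are hypothetical choices made for illustration. It also assumes the final assistant header is followed by two newlines, as the other headers in the template are, so that the model continues with its answer.

```python
from typing import Dict, List

# Special tokens copied from the instruct format in the README.
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def _header(role: str) -> str:
    """Return the role header followed by the blank line the template uses."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(system_prompt: str, turns: List[Dict[str, str]]) -> str:
    """Assemble a prompt string in the instruct format shown in the README.

    `turns` is a hypothetical list of {"role": ..., "content": ...} dicts,
    alternating "user" and "assistant", and ending with a user message.
    """
    parts = [BOS, _header("system"), system_prompt, EOT]
    for turn in turns:
        parts += [_header(turn["role"]), turn["content"], EOT]
    # Assumption: leave an open assistant header so the model writes the reply.
    parts.append(_header("assistant"))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "You are a helpful assistant.",
        [{"role": "user", "content": "Hello!"}],
    ))
```

If the tokenizer that ships with the model provides a chat template, using that template is likely the safer choice; building the string by hand is mainly useful for backends that accept raw prompts.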