Commit b052992 (verified) · committed by gonzalo-santamaria-iic · 1 Parent(s): 91f7ce1

Update README.md

Files changed (1)
  1. README.md +41 -1
README.md CHANGED
@@ -126,6 +126,8 @@ generated_ids = [
  response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
  ```

+ For a better experience, we recommend using [the following default generation parameters](https://huggingface.co/IIC/RigoChat-7b-v2/blob/main/generation_config.json).
+
  ## Training Details

  ### Training Data
@@ -139,7 +141,45 @@ We use the [Transformer Reinforcement Learning](https://huggingface.co/docs/trl/

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+ ```shell
+ LORA_CONFIG = {
+     "r": 64,
+     "lora_alpha": 16,
+     "lora_dropout": 0.1,
+     "bias": "none",
+     "task_type": "CAUSAL_LM",
+     "target_modules": [
+         "q_proj",
+         "k_proj",
+         "v_proj",
+         "o_proj",
+         "up_proj",
+         "gate_proj",
+         "down_proj",
+     ],
+     "use_rslora": True,
+ }
+
+ DPO_CONFIG = {
+     "num_train_epochs": 2,
+     "logging_steps": 25,
+     "eval_steps": 500,
+     "save_steps": 100,
+     "save_total_limit": 5,
+     "per_device_train_batch_size": 1,
+     "per_device_eval_batch_size": 1,
+     "gradient_accumulation_steps": 16,
+     "learning_rate": 5e-6,
+     "max_length": 8192,
+     "max_prompt_length": 6656,
+     "gradient_checkpointing": True,
+     "weight_decay": 0.001,
+     "optim": "rmsprop",
+     "evaluation_strategy": "steps",
+     "lr_scheduler_type": "cosine",
+     "bf16": True,
+ }
+ ```

  #### Speeds, Sizes, Times [optional]

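
The hyperparameters added in this commit imply a few derived quantities that are easy to misread from the raw dictionaries. The sketch below is plain Python and only does arithmetic on values copied from `LORA_CONFIG` and `DPO_CONFIG`. The helper names and the `num_devices` argument are hypothetical (the commit does not state how many devices were used), and the rsLoRA scaling rule (`lora_alpha / sqrt(r)` instead of `lora_alpha / r`) is taken from how PEFT documents `use_rslora`, not from this README.

```python
import math

# Values copied from the LORA_CONFIG and DPO_CONFIG added in this commit.
LORA = {"r": 64, "lora_alpha": 16, "use_rslora": True}
DPO = {
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 16,
    "max_length": 8192,
    "max_prompt_length": 6656,
}


def lora_scaling(r, lora_alpha, use_rslora):
    """Factor multiplying the low-rank update (hypothetical helper).

    Rank-stabilized LoRA divides by sqrt(r); classic LoRA divides by r.
    """
    return lora_alpha / math.sqrt(r) if use_rslora else lora_alpha / r


def pairs_per_optimizer_step(cfg, num_devices):
    """Preference pairs consumed per optimizer update.

    num_devices is an assumption; the commit does not report it.
    """
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_devices
    )


def completion_budget(cfg):
    """Tokens left for the completion when the prompt is at its maximum length."""
    return cfg["max_length"] - cfg["max_prompt_length"]


if __name__ == "__main__":
    print("rsLoRA scaling:", lora_scaling(**LORA))
    print("classic LoRA scaling:", lora_scaling(LORA["r"], LORA["lora_alpha"], False))
    print("pairs per step on one device:", pairs_per_optimizer_step(DPO, num_devices=1))
    print("completion budget (tokens):", completion_budget(DPO))
```

Under these assumptions, `use_rslora` raises the adapter scaling from 0.25 to 2.0 for `r = 64` and `lora_alpha = 16`, each optimizer step sees 16 preference pairs per device, and a prompt at the 6656-token limit leaves 1536 tokens for the completion.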