Safetensors
qwen2

Light-R1-7B-DS: SOTA 7B Math Model with only 3K Data

Model Trained From Release Date AIME24 AIME25 GPQA
OpenThinker-7B Qwen2.5-7B-Instruct 25.2.12 31.3 N/A 42.4
DeepSeek-R1-Distill-Qwen-7B Qwen2.5-Math-7B 25.1.20 55.5 39.2 49.1
Light-R1-7B-DS (ours) 🤗 DeepSeek-R1-Distill-Qwen-7B 25.3.12 59.1 44.3 49.4
Light-R1-32B (ours) 🤗 Qwen2.5-32B-Instruct 25.3.4 76.6 64.6 61.8

GitHub page

Light-R1-7B-DS is to the best of our knowledge the State-Of-The-Art open-source 7B math model with AIME24 & 25 scores 59.1 & 44.3. Light-R1-7B-DS also performed well on GPQA without any specific training.

Originated from DeepSeek-R1-Distill-Qwen-7B, Light-R1-7B-DS is further trained with only 3K SFT data as we've open-sourced, demonstrating the strong applicability of the released data.

We are excited to release this model along with the technical report.

Usage

Same as DeepSeek-R1-Distill-Qwen-7B.

Data Decontamination

We carefully evaluated data contamination of several open-sourced datasets. While certain contamination may be inevitable during pre-training, it is unacceptable for post-training to compare on benchmarks. MATH-500 is somewhat compromised with tens of questions that are identical or only numbers changed. AIME 24 and 25 stay intact but we have to pay special attention when we incorporate AIME data up to 2023.

Light-R1 did thorough decontamination with exact matching (excluding digits) and N-gram (N=32) matching.

Citation

@misc{lightr1proj,
      title={Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond}, 
      author={Liang Wen, Yunke Cai, Fenrui Xiao, Xin He, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou, Yongchao Deng, Shousheng Jia, Xiangzheng Zhang},
      year={2025},
      eprint={},
      archivePrefix={},
      url={https://github.com/Qihoo360/Light-R1}, 
}
Downloads last month
28
Safetensors
Model size
7.62B params
Tensor type
BF16
·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.
The model cannot be deployed to the HF Inference API: The model has no library tag.

Model tree for qihoo360/Light-R1-7B-DS

Finetuned
(71)
this model
Quantizations
2 models

Collection including qihoo360/Light-R1-7B-DS