Gabriel Martín Blázquez's picture

Gabriel Martín Blázquez

gabrielmbmb

AI & ML interests

ML Engineer

Recent Activity

liked a model about 3 hours ago
MBZUAI/LLMVoX
upvoted a collection 3 days ago
EuroBERT
upvoted a collection 3 days ago
🧠 Reasoning datasets
View all activity

Organizations

HF中国镜像站's profile picture Spaces-explorers's profile picture SomosNLP's profile picture HF中国镜像站 H4's profile picture Argilla's profile picture Blog-explorers's profile picture HF中国镜像站 TB Research's profile picture distilabel-internal-testing's profile picture Data Is Better Together's profile picture Social Post Explorers's profile picture HF中国镜像站 Discord Community's profile picture LLHF's profile picture SLLHF's profile picture Argilla Warehouse's profile picture IOPO Experiments's profile picture HF中国镜像站 FineVideo's profile picture rg-preview's profile picture Data Is Better Together Contributor's profile picture Open R1's profile picture

Posts 4

view post
Post
1887
Yesterday   @mattshumer released mattshumer/Reflection-Llama-3.1-70B, an impressive model that achieved incredible results in benchmarks like MMLU. The model was fine-tuned using Reflection-Tuning and the dataset used wasn't released, but I created a small recipe with distilabel that allows generating a dataset with a similar output format:

1. We use MagPie 🐦 in combination with https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct to generate reasoning instructions.
2. We generate a response again using https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct, but we steer the LLM to generate an specific output format using a custom system prompt. In the system prompt, we instruct the LLM that it will have first to think 💭 and have reflections that will help resolving ambiguities. After that, we instruct the LLM to generate an output based on the previous thinking

In this dataset gabrielmbmb/distilabel-reflection-tuning you can found 5 rows that I generated with this recipe. You can also found the code of the pipeline in the file called reflection.py.

Articles 2

Article
202

Open R1: Update #2