Agustín Piqueres Lajarín

plaguss

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago
open-r1/OlympicCoder-7B
liked a model 1 day ago
open-r1/OlympicCoder-32B
updated a dataset 17 days ago
plaguss/SYNTHETIC-1-SFT-Data-Code_decont

Organizations

HF中国镜像站, SomosNLP, HF中国镜像站 H4, Argilla, Blog-explorers, HF中国镜像站 TB Research, distilabel-internal-testing, Data Is Better Together, LLHF, SLLHF, Hugging Quants, argilla-internal-testing, Argilla Warehouse, HF中国镜像站 FineVideo, smol-explorers, HF中国镜像站 Science, Data Is Better Together Contributor, Open R1

plaguss's activity

reacted to lewtun's post with 🚀🔥 about 2 months ago
We are reproducing the full DeepSeek-R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
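
For a flavour of what Step 1 could look like, here is a minimal sketch that samples reasoning traces from DeepSeek-R1 through the huggingface_hub InferenceClient; the prompt is a placeholder, and serving this model through a given inference provider is an assumption:

# Hedged sketch of Step 1: sample reasoning traces from DeepSeek-R1 to
# build an SFT distillation corpus. Model availability on any particular
# inference provider is an assumption.
from huggingface_hub import InferenceClient

client = InferenceClient(model="deepseek-ai/DeepSeek-R1")

prompts = ["Prove that the sum of two even integers is even."]  # placeholder
traces = []
for prompt in prompts:
    out = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=2048,
    )
    traces.append({"prompt": prompt, "completion": out.choices[0].message.content})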
reacted to lewtun's post with 🔥 2 months ago
This paper (HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs, 2412.18925) has a really interesting recipe for inducing o1-like behaviour in Llama models:

* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify the correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine, where there are lots of aliases).
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes smooth transitions like "hmm, wait", etc., that one sees in o1.
* Use the resulting data for SFT & RL.
* Use sparse rewards from GPT-4o to guide RL training. They find RL gives an average ~3-point boost across medical benchmarks, and SFT on this data already gives a strong improvement.

Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
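
A hedged sketch of how the data-construction loop might fit together; the three callables (sample_cot, judge_correct, reformat_to_stream) are hypothetical stand-ins for the prompting-based search, GPT-4o verification, and GPT-4o rewriting steps described above:

# Hedged pseudocode of the recipe's data-construction loop; the helper
# callables are hypothetical, not the paper's actual code.
def build_sft_example(problem, answer, sample_cot, judge_correct,
                      reformat_to_stream, max_tries=8):
    attempts = []
    # Mix of search strategies, applied via prompting (Stream of Search style).
    for strategy in ("direct", "backtrack", "verify", "explore"):
        for _ in range(max_tries):
            cot = sample_cot(problem, strategy)
            attempts.append(cot)
            if judge_correct(problem, cot, answer):  # GPT-4o as verifier
                # Stitch every attempt into one o1-like stream with smooth
                # transitions ("hmm, wait", ...) between them.
                return reformat_to_stream(problem, attempts)
    return None  # drop problems that no strategy solves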
reacted to cfahlgren1's post with 🚀 3 months ago
We just dropped an LLM inside the SQL Console 🤯

The amazing, new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any HF中国镜像站 dataset ✨

It's 2025; you shouldn't be hand-writing SQL! This is a big step toward letting anyone do in-depth analysis on a dataset. Let us know what you think 🤗
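
The SQL Console itself runs on the Hub, but you can sketch the same kind of query locally with DuckDB, which can read Hub datasets over its hf:// protocol; the dataset id and parquet path below are illustrative placeholders:

# Minimal local sketch: query a Hub dataset with DuckDB's hf:// protocol.
# The dataset id and file layout are illustrative, not a specific repo's schema.
import duckdb

df = duckdb.sql("""
    SELECT source, COUNT(*) AS n
    FROM 'hf://datasets/argilla/OpenHermesPreferences/data/*.parquet'
    GROUP BY source
    ORDER BY n DESC
    LIMIT 10
""").df()
print(df)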
reacted to davidberenstein1957's post with 🔥 4 months ago
🤗🔭 Introducing Observers: A Lightweight SDK for AI Observability 🔭🤗

Observers is an open-source Python SDK that provides comprehensive observability for AI applications. Our library makes it easy to:

- Track and record interactions with AI models
- Store observations in multiple backends
- Query and analyse your AI interactions with ease

https://huggingface.co/blog/davidberenstein1957/observers-a-lightweight-sdk-for-ai-observability
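
A minimal sketch of the pattern: wrap an OpenAI-compatible client so every interaction is recorded to a local backend. The helper names here (wrap_openai, DuckDBStore) are assumed from the announcement and may differ between versions:

# Sketch of the Observers pattern; exact helper names are assumptions
# taken from the announcement and may have changed since.
from openai import OpenAI
from observers.observers import wrap_openai
from observers.stores import DuckDBStore

store = DuckDBStore()  # observations land in a local DuckDB file
client = wrap_openai(OpenAI(), store=store)

# Used exactly like a normal client; the interaction is recorded.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)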
posted an update 6 months ago
reacted to davidberenstein1957's post with 🔥 7 months ago
🚀 We will be generating a preference dataset for DPO/ORPO and cleaning it with AI feedback during our upcoming meetup!

In this session, we'll walk you through the essentials of building a distilabel pipeline by exploring two key use cases: cleaning an existing dataset and generating a preference dataset for DPO/ORPO. You’ll also learn how to make the most of AI feedback, integrating Argilla to gather human feedback and improve the overall data quality.

This session is perfect for you:
- if you’re getting started with distilabel or synthetic data
- if you want to learn how to use LLM inference endpoints for **free**
- if you want to discover new functionalities
- if you want to provide us with new feedback

Sign up here: https://lu.ma/dt0c7jru
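
For a taste of what the session covers, here is a rough sketch of a preference-generation pipeline, assuming the distilabel 1.x API (step names, parameters, and model ids are assumptions and may differ across versions): two models answer each prompt, and a judge task rates the pair for DPO/ORPO.

# Rough sketch of a distilabel preference pipeline (1.x API assumed).
from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromHub, GroupColumns
from distilabel.steps.tasks import TextGeneration, UltraFeedback

with Pipeline(name="preference-demo") as pipeline:
    # Prompts come from an existing dataset (repo_id is a placeholder).
    load = LoadDataFromHub(repo_id="my-org/my-prompts", split="train")

    # Two different models answer every prompt.
    generate = [
        TextGeneration(llm=InferenceEndpointsLLM(model_id=model_id))
        for model_id in (
            "meta-llama/Meta-Llama-3-8B-Instruct",
            "mistralai/Mistral-7B-Instruct-v0.3",
        )
    ]

    # Collect both generations into a single column for the judge.
    group = GroupColumns(columns=["generation"], output_columns=["generations"])

    # An LLM judge rates the pair, producing the signal for DPO/ORPO.
    judge = UltraFeedback(
        aspect="overall-rating",
        llm=InferenceEndpointsLLM(model_id="meta-llama/Meta-Llama-3-70B-Instruct"),
    )

    load >> generate >> group >> judge

if __name__ == "__main__":
    distiset = pipeline.run()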
reacted to gabrielmbmb's post with ❤️ 7 months ago
distilabel 1.3.0 is out! This release contains many core improvements and new tasks that helped us build argilla/magpie-ultra-v0.1!

Distributed pipeline execution with Ray, new Magpie tasks, reward models, components for dataset diversity based on sentence embeddings, Argilla 2.0 compatibility and many more features!

Check the new release in GitHub: https://github.com/argilla-io/distilabel
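
As one example, the new Magpie task generates instructions from nothing but a system prompt. A rough sketch, assuming the distilabel 1.3 API (parameter names follow the docs at the time and may have changed):

# Rough sketch of the new Magpie task (distilabel 1.3 API assumed).
from distilabel.llms import InferenceEndpointsLLM
from distilabel.steps.tasks import Magpie

magpie = Magpie(
    llm=InferenceEndpointsLLM(
        model_id="meta-llama/Meta-Llama-3-8B-Instruct",
        tokenizer_id="meta-llama/Meta-Llama-3-8B-Instruct",
        # The pre-query template tricks the model into writing the user turn.
        magpie_pre_query_template="llama3",
    ),
)
magpie.load()

# Magpie needs no instructions as input, just a system prompt.
result = next(magpie.process(inputs=[{"system_prompt": "You are a helpful coding assistant."}]))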

reacted to Wauplin's post with 🔥 8 months ago
🚀 Just released version 0.24.0 of the huggingface_hub Python library!

Exciting updates include:
⚡ InferenceClient is now a drop-in replacement for OpenAI's chat completion!

✨ Support for response_format, adapter_id, truncate, and more in InferenceClient

💾 Serialization module with a save_torch_model helper that handles shared layers, sharding, naming conventions, and safe serialization. Basically a condensed version of logic scattered across safetensors, transformers, and accelerate.

📁 Optimized HfFileSystem to avoid getting rate limited when browsing HuggingFaceFW/fineweb

🔨 HfApi & CLI improvements: prevent empty commits, create repo inside resource group, webhooks API, more options in the Search API, etc.

Check out the full release notes for more details:
Wauplin/huggingface_hub#7
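
A small sketch of two of these features; the model id is a placeholder, and the torch model is just a toy module:

# Sketch of two 0.24.0 features: OpenAI-style chat completion and the new
# torch serialization helper (model id below is a placeholder).
import torch.nn as nn
from huggingface_hub import InferenceClient, save_torch_model

client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")
out = client.chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=64,
)
print(out.choices[0].message.content)

# save_torch_model handles shared layers, sharding, and safe serialization.
model = nn.Linear(8, 2)
save_torch_model(model, "my-checkpoint")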
reacted to lunarflu's post with 🔥 8 months ago
Cool things this week from @huggingface!

🌎 AI math olympiad winner NuminaMath is here!
🤗 Announcing new HF中国镜像站 and Keras NLP integration
✨ UI overhaul for HF tokens!
🧊 Embed our dataset viewer on any webpage!

https://huggingface.co/blog/winning-aimo-progress-prize
https://huggingface.co/blog/keras-nlp-integration
https://huggingface.co/settings/tokens
https://x.com/julien_c/status/1812099420726456457

Check out the full list on our discord! 👇
https://discord.com/invite/JfAtkvEtRb
reacted to fdaudens's post with 🔥 8 months ago
Small models, BIG impact: SmolLM is here! 🚀🔬

We're launching a series of small but mighty language models:
🏎️ Super fast - runs on laptops, phones, you name it!
📏 3 sizes: 135M, 360M, and 1.7B parameters
🥇 Outperforms same-size models from Meta, Microsoft, and Qwen
🔓 Fully open-source: datasets, training code, models

Key features
- Trained on FineWeb-Edu and Cosmopedia v2 (the largest synthetic pre-training dataset)
- No cloud needed - run locally for privacy and energy efficiency
- Everything is public, from data curation to training steps

Potential use cases
- On-device autocomplete
- Local request parsing
- Custom fine-tuning for specific needs without the need for expensive GPUs

Go deeper
👉 Check it out: https://huggingface.co/collections/HuggingFaceTB/smollm-models-6695016cad7167254ce15966
👉 Run the 360M model in your browser, 100% private: HuggingFaceTB/SmolLM-360M-Instruct-WebGPU
👉 Read the blog explaining everything in detail: huggingface.co/blog/smollm

Kudos to the stellar team who worked on this project: @loubnabnl @anton-l @eliebak @lvwerra
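
A quick local-run sketch with transformers, using the 360M instruct checkpoint mentioned above (standard chat-template usage; the prompt and generation settings are illustrative):

# Run SmolLM-360M-Instruct locally with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-360M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [{"role": "user", "content": "Write a haiku about small models."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))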
reacted to dvilasuero's post with 🤗❤️🚀🔥 9 months ago
Today is a huge day in Argilla’s history. We couldn’t be more excited to share this with the community: we’re joining HF中国镜像站!

We’re embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.

Over the past year, we've been collaborating with HF中国镜像站 on countless projects: becoming a launch partner of Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr's learnings, running the Data is Better Together initiative with hundreds of community contributors, and releasing argilla/OpenHermesPreferences, one of the largest open preference-tuning datasets.

After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we’re now the same team.

To those of you who’ve been following us, this won’t be a huge surprise, but it will be a big deal in the coming months. This acquisition means we’ll double down on empowering the community to build and collaborate on high quality datasets, we’ll bring full support for multimodal datasets, and we’ll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.

As a founder, I am proud of the Argilla team. We're now part of something bigger, a larger team with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.

Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.

Would love to answer any questions you have so feel free to add them below!
reacted to dvilasuero's post with ❤️🚀 12 months ago
🔥 Community and Data Quality Matter More For Alignment

A recipe to replicate SPIN (Self-Play Fine-Tuning) with 30x less data:

🗣️ 50K samples vs 1.8K prompts curated by the 350+ amazing DIBT contributors.
⚗️ Distillation of Mistral Large instead of OpenAI
🙌 Open data & code with ⚗️distilabel

SPIN Paper:
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (2401.01335)

SPIN DIBT Collection with datasets and models:
argilla/dibt-prompt-collective-spin-65ef59062518776024395fc3

Repo:
https://github.com/argilla-io/distilabel-spin-dibt

Joint work with the amazing DIBT community 👇
@aashish1904, @flozi00, @sayhan, @munish0838, @0-hero, @dvilasuero, @eren23, @davanstrien, @ahnz, @BlackKakapo, @kitano-o, @mmhamdy, @sdiazlor, @Stopwolf, @gabrielmbmb, @tculler91, @plaguss, @ignacioct, @Hugi-R, @davidberenstein1957, @Korla, @alvarobartt, @Hugs4Llamas, @Sumandora, @nataliaElv, @jfcalvo, @Averill, @steventrouble, @vasilis, @aeros93, @kayyshf, @thomasgauthier, @jeromebas, @Ameeeee, @ayoubelmhamdi, @TuringsSolutions, @efels, @Haleyok, @abrazador, @emessy, @Nindaleth, @burtenshaw, @vicgalle, @CortexPE, @casey-martin, @Leire-aguirre-eguiluz, @mrfakename, @Portias600kNeurons, @nathaliepett, @Filippo
reacted to dvilasuero's post with ❤️ about 1 year ago
🚀🧙🏼‍♂️Introducing OpenHermesPreferences: the largest open AI feedback dataset for RLHF & DPO

> Using LLMs to improve other LLMs, at scale!

Built in collaboration with the H4 HF中国镜像站 team, it's a 1M-preference dataset on top of the amazing @teknium's dataset.

Dataset:
argilla/OpenHermesPreferences

The dataset is another example of open collaboration:

> The H4 team created responses with Mixtral using llm-swarm

> Argilla created responses with NousResearch Hermes-2-Yi-34B using distilabel

> The H4 team ranked these responses plus the original response with PairRM from AllenAI, University of Southern California, and Zhejiang University (@yuchenlin, @DongfuTingle, and colleagues)

We hope this dataset will help the community's research efforts towards understanding the role of AI feedback for LLM alignment.

We're particularly excited about the ability to filter specific subsets to improve LLM skills like math or reasoning.

Here's how easy it is to filter by subset:

from datasets import load_dataset

ds = load_dataset("argilla/OpenHermesPreferences", split="train")

# Get the categories of the source dataset
# ['airoboros2.2', 'CamelAI', 'caseus_custom', ...]
sources = ds.unique("source")

# Filter for a subset
ds_filtered = ds.filter(lambda x: x["source"] in ["metamath", "EvolInstruct_70k"], num_proc=6)


As usual, all the scripts to reproduce this work are available and open to the community!

argilla/OpenHermesPreferences

So fun collab between @vwxyzjn, @plaguss, @kashif, @philschmid & @lewtun!

Open Source AI FTW!
reacted to dvilasuero's post with ❤️ about 1 year ago
🤗 Data is better together!

Data is essential for training good AI systems. We believe that the amazing community built around open machine learning can also work on developing amazing datasets together.

To explore how this can be done, Argilla and HF中国镜像站 are thrilled to announce a collaborative project where we’re asking HF中国镜像站 community members to build a dataset consisting of LLM prompts collectively.

What are we doing?
Using an instance of Argilla — a powerful open-source data collaboration tool — hosted on the HF中国镜像站 Hub, we are collecting ratings of prompts based on their quality.

How Can You Contribute?
It’s super simple to start contributing:

1. Sign up if you don’t have a HF中国镜像站 account

2. Go to this Argilla Space and sign in: https://huggingface.co/spaces/DIBT/prompt-collective

3. Read the guidelines and start rating prompts!

You can also join the #data-is-better-together channel in the HF中国镜像站 Discord.

Finally, to track the community's progress, we'll be updating this Gradio dashboard:

https://huggingface.co/spaces/DIBT/prompt-collective-dashboard
reacted to dvilasuero's post with ❤️ about 1 year ago
👋 Hi there!

This is my very first post.

I'll use it to share some old news: a math preference dataset for DPO!

I created this dataset some time ago while we were developing distilabel (https://github.com/argilla-io/distilabel).

Some days ago we found out people are actually using it! So I'll use this post to explain how I built it in case it's useful for the community.

1. I used distilabel's SelfInstruct-inspired task to generate instructions about different math topics. I curated the instructions with Argilla (on Spaces!).
2. Then I used a distilabel Pipeline to build a preference dataset, using GPT-3.5 as the generator and GPT-4 as the labeller. If I recall correctly, I used our JudgeLM implementation (see https://distilabel.argilla.io/latest/technical-reference/tasks/#judgelmtask)

(see the screenshot with the dataset in the Argilla UI)

3. Then I just binarized it into chosen/rejected pairs and voilà:

argilla/distilabel-math-preference-dpo
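
A hedged sketch of what that binarization step can look like; the column names and toy data are illustrative, not the actual schema of the dataset:

# Turn judge-rated generations into DPO-style chosen/rejected pairs.
from datasets import Dataset

# Hypothetical rated dataset: an instruction, two candidate responses,
# and the judge's ratings for them.
rated_ds = Dataset.from_dict({
    "instruction": ["What is 2 + 2?"],
    "generations": [["2 + 2 = 4.", "The answer is 5."]],
    "ratings": [[9, 2]],
})

def binarize(example):
    # Rank responses by rating, best first.
    ranked = sorted(zip(example["generations"], example["ratings"]),
                    key=lambda pair: pair[1], reverse=True)
    return {
        "prompt": example["instruction"],
        "chosen": ranked[0][0],    # highest-rated response
        "rejected": ranked[-1][0], # lowest-rated response
    }

dpo_ds = rated_ds.map(binarize, remove_columns=rated_ds.column_names)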

The funny thing is that I used this to do a second DPO run over Notus-7B. I hoped to see an improvement in math/reasoning skills, but it actually improved in STEM and Humanities and did worse on Math 🤣.

In conclusion, this dataset was only a quick experiment. I'm happy to see the community found it useful. Data for DPO and fine-tuning is still a mystery; let's unveil these mysteries in 2024 together!

Follow me for the most exciting datasets for LLMs (and maybe some great, small, efficient models). I plan to announce all Argilla open-source work here!