
Aymeric Roucher

m-ric

http://aymeric-roucher.github.io

AI & ML interests

Leading Agents at Hugging Face 🤗

Recent Activity

upvoted an article 1 day ago

LeRobot goes to driving school: World’s largest open-source self-driving dataset

posted an update 3 days ago

Our new Agentic leaderboard is now live! 💥

If you've ever wondered which LLM is best for powering agents, we've just made a leaderboard that ranks them all! Built with @albertvillanova, it ranks LLMs powering a smolagents CodeAgent on subsets of various benchmarks. ✅

🏆 GPT-4.5 comes out on top, even beating reasoning models like DeepSeek-R1 or o1. And Claude-3.7-Sonnet is a close second!

The leaderboard also lets you show the scores of vanilla LLMs (without any agentic setup) on the same benchmarks, which shows the huge improvements brought by agentic setups. 💪

(Note that results will be added manually, so the leaderboard might not always have the latest LLMs.)

updated a Space 3 days ago

smolagents/smolagents-leaderboard

