Alina Lozovskaya PRO

alozowski

AI & ML interests

NLP in all aspects

Recent Activity

updated a dataset about 6 hours ago
open-llm-leaderboard/requests

Organizations

Hugging Face, Evaluation datasets, Hugging Test Lab, Technology Innovation Institute, Hugging Face H4, InternLM, Hugging Face TB Research, Open LLM Leaderboard, Qwen, gg-hf, IBM Granite, Social Post Explorers, HuggingFaceEval, nltpt, open-llm-leaderboard-react, Prompt Leaderboard, wut?, Your Bench, gg-hf-g, OpenEvals

Posts 2

Do I need to make it a tradition to post here every Friday? Well, here we are again!

This week, I'm happy to share that we have two official Mistral models on the Leaderboard! 🔥 You can check them out: mistralai/Mixtral-8x22B-Instruct-v0.1 and mistralai/Mixtral-8x22B-v0.1

The most exciting thing here? The mistralai/Mixtral-8x22B-Instruct-v0.1 model took first place among pretrained models with an impressive average score of 79.15! 🥇 Not far behind is Mixtral-8x22B-v0.1, in second place with an average score of 74.47! Well done, Mistral AI! 👏

Check out my screenshot here or explore it yourself at https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

The second piece of news is that the CohereForAI/c4ai-command-r-plus model in 4-bit quantization got a great average score of 70.08. Cool stuff, Cohere! 😎 (I also have a screenshot for this, don't miss it.)

The last piece of news might seem small but is still significant: the Leaderboard frontpage now supports Python 3.12.1. This means we're on our way to speeding up the Leaderboard's performance! 🚀

If you have any comments or suggestions, feel free to tag me on X (Twitter) as well: [at]ailozovskaya. I'll try to help!

Have a nice weekend! ✨

Articles 2

Fixing Open LLM Leaderboard with Math-Verify

Models

None public yet

Datasets

None public yet