Quazimoto PRO

Quazim0t0

AI & ML interests

the hunchback of huggingface 🔙 joined: 1-20-2025 🦥unsloth user 4️⃣ Phi User 🔨 ai hobbyist 📫 On Leaderboards Top 100-200

Organizations

Seance Table

Quazim0t0's activity

posted an update about 14 hours ago
Thank you to the Open LLM Leaderboard team for offering it to the community for as long as they did. I only recently joined HF, and it provided a lot of incentive and information for making better models.

Always will remember getting to #112 :D

Anyone have a solid way to test my models privately? Please let me know!
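One low-friction way to test models privately is to keep a small local question/answer set and score it yourself, with nothing uploaded anywhere. A minimal sketch (the `generate` callable is a stand-in for whatever inference call you actually use, e.g. a transformers pipeline; it is not a real API here):

```python
# Minimal private eval harness: score a model's answers against a local
# test set using exact-match accuracy. `generate` is a placeholder for
# your real inference function.

def exact_match_accuracy(examples, generate):
    """examples: list of {"prompt": str, "answer": str} dicts.
    Returns the fraction of predictions matching the reference answer
    (case-insensitive, whitespace-trimmed)."""
    correct = 0
    for ex in examples:
        prediction = generate(ex["prompt"]).strip().lower()
        if prediction == ex["answer"].strip().lower():
            correct += 1
    return correct / len(examples) if examples else 0.0

# Usage with a dummy "model" that always answers "paris":
tests = [
    {"prompt": "Capital of France?", "answer": "Paris"},
    {"prompt": "Capital of Italy?", "answer": "Rome"},
]
score = exact_match_accuracy(tests, lambda p: "paris")
print(score)  # 0.5
```

For broader coverage, the same idea scales up to running an off-the-shelf eval harness locally instead of relying on a public leaderboard.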

reacted to onekq's post with 👍 2 days ago
A bigger and harder pain point for reasoning models is switching modes.

We now have powerful models capable of either System 1 or System 2 thinking, but not both, much less switching between the two. Yet humans can do this quite easily.

ChatGPT and others push the burden of switching between models onto users. I guess this is the best we have for now.
reacted to AdinaY's post with 🔥 2 days ago
reacted to thomwolf's post with 🚀 2 days ago
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain where Anthropic has historically been really strong, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing all about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
reacted to Lunzima's post with 🚀 2 days ago
I'm currently experimenting with the SFT dataset Lunzima/alpaca_like_dataset to further boost the performance of NQLSG-Qwen2.5-14B-MegaFusion-v9.x. This includes data sourced from DeepSeek-R1 and other cleaned results (excluding CoTs). Additionally, datasets that could enhance the model's performance in math and programming/code, as well as datasets dedicated to specific uses like Swahili, are part of the mix.
@sometimesanotion @sthenno @wanlige
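The dataset-mixing workflow described above, combining several alpaca-style sources into one SFT set, can be sketched as tagging each record with its origin and shuffling the union. This is an illustrative sketch, not Lunzima's actual pipeline, and the source names are made up:

```python
import random

def mix_sft_sources(sources, seed=0):
    """Merge several alpaca-style datasets (dicts mapping a source name
    to a list of {'instruction', 'input', 'output'} records), tag each
    record with its source for later filtering/ablation, and shuffle
    deterministically for SFT."""
    mixed = []
    for name, records in sources.items():
        for rec in records:
            tagged = dict(rec)          # copy so the original is untouched
            tagged["source"] = name
            mixed.append(tagged)
    random.Random(seed).shuffle(mixed)  # reproducible shuffle
    return mixed

# Usage with two tiny illustrative sources:
math_data = [{"instruction": "Add 2+2", "input": "", "output": "4"}]
swahili_data = [{"instruction": "Translate 'hello'", "input": "", "output": "habari"}]
mixed = mix_sft_sources({"math": math_data, "swahili": swahili_data})
print(len(mixed))  # 2
```

Keeping the `source` tag makes it easy to ablate one ingredient later when a merged mix helps or hurts a benchmark.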
reacted to awacke1's post with 🚀 3 days ago
Introducing an MIT-licensed ML model supervised fine-tuning (SFT) app, "SFT Tiny Titans" 🚀

Demo video with source included.

Download, train, SFT, and test your models, easy as 1-2-3!
URL: awacke1/TorchTransformers-NLP-CV-SFT
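A download→train→test SFT loop like the one advertised above usually starts by rendering each instruction record into a single training string. A minimal sketch of that formatting step, assuming an alpaca-style template (this template is an assumption, not necessarily what the app uses):

```python
def to_sft_text(record):
    """Render an alpaca-style record into one prompt/response training
    string, the flat-text shape most SFT trainers consume.
    The 'Input' section is omitted when the record has no input."""
    parts = ["### Instruction:\n" + record["instruction"]]
    if record.get("input"):
        parts.append("### Input:\n" + record["input"])
    parts.append("### Response:\n" + record["output"])
    return "\n\n".join(parts)

# Usage:
example = {
    "instruction": "Summarize the text.",
    "input": "Cats sleep a lot.",
    "output": "Cats are heavy sleepers.",
}
print(to_sft_text(example))
```

From there, "train" is just feeding these strings to your trainer of choice, and "test" is generating from held-out prompts and eyeballing or scoring the responses.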