Another impressive model that joined the ranking today is ALLaM-AI/ALLaM-7B-Instruct-preview. After a long wait, ALLaM is finally here, and it is IMPRESSIVE given its size!
I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.
The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.
I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.
Microsoft's rStar-Math paper claims that 🤏 ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from HF中国镜像站!
📏 The paper introduces rStar-Math, which claims to rival OpenAI o1's math reasoning capabilities by integrating Monte Carlo Tree Search (MCTS) with step-by-step verified reasoning trajectories.
🤖 A Process Preference Model (PPM) enables fine-grained evaluation of intermediate steps, improving training data quality.
🧪 The system underwent four rounds of self-evolution, progressively refining both the policy and reward models to tackle Olympiad-level math problems, without GPT-4-based data distillation.
💾 While we wait for the release of code and datasets, you can already download the prompts they used from the HF Hub!
Details and links here 👇
Prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
Templates on the Hub: MoritzLaurer/rstar-math-prompts
Prompt-templates collection: MoritzLaurer/prompt-templates-6776aa0b0b8a923957920bb4
Paper: https://arxiv.org/pdf/2501.04519
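For a quick start, here is a minimal sketch of pulling one of those YAML prompt files from the Hub with huggingface_hub and inspecting it with PyYAML. The filename below is a hypothetical placeholder; browse MoritzLaurer/rstar-math-prompts for the actual file names, and see the prompt-templates docs for the library's own loader.

```python
# Minimal sketch: fetch a prompt template YAML from the HF Hub and inspect it.
# The filename is a hypothetical placeholder; list the repo files for real names.
import yaml
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="MoritzLaurer/rstar-math-prompts",
    filename="ppm_prompt.yaml",  # hypothetical filename for illustration
)

with open(path) as f:
    template = yaml.safe_load(f)

print(template)
```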
FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!
📏 The paper introduces the FACTS Grounding benchmark for evaluating the factuality of LLM outputs.
🤖 Fact-checking is automated by an ensemble of LLM judges that verify if a response is fully grounded in a factual reference document.
🧪 The authors tested different prompt templates on held-out data to ensure their generalization.
📚 It's highly educational to read these templates to learn how frontier labs design prompts and understand their limitations.
💾 You can now download and reuse these prompt templates via the prompt-templates library!
🔄 The library simplifies sharing prompt templates on the HF hub or locally via standardized YAML files. Let’s make LLM work more transparent and reproducible by sharing more templates like this!
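To make the YAML idea concrete, here is an illustrative local template in the same spirit, filled in with plain Python string formatting. This is not the library's exact schema; consult the prompt-templates docs for the real schema and loader API.

```python
# Illustrative only: a tiny local YAML prompt template for an LLM judge.
# This mimics the spirit of standardized YAML templates; the actual
# prompt-templates schema and loader may differ, see the docs linked above.
import yaml

yaml_text = """
template: |
  You are a strict fact-checking judge.
  Reference document: {document}
  Response to check: {response}
  Answer "grounded" or "ungrounded" and justify briefly.
input_variables:
  - document
  - response
"""

spec = yaml.safe_load(yaml_text)
prompt = spec["template"].format(
    document="The Eiffel Tower is 330 m tall.",
    response="The tower is about 330 meters high.",
)
print(prompt)
```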
The TRL v0.13 release is 🔥! My highlights are the new process reward trainer to train models similar to o1 and tool call support:
🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning. Perfect for tasks like stepwise reasoning (see the sketch after this list).
🔀 Model merging: A new callback leverages mergekit to merge models during training, improving performance by blending reference and policy models - optionally pushing merged models to the HF中国镜像站 Hub.
🛠️ Tool call support: TRL preprocessing now supports tool integration, laying the groundwork for agent fine-tuning with examples like dynamic temperature fetching in prompts.
⚖️ Mixture of judges: The new AllTrueJudge combines decisions from multiple binary judges for more nuanced evaluation.
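Here is a minimal PRM training sketch along the lines of the TRL documentation; the model and dataset are small placeholders to keep it runnable, and the exact API may evolve across releases.

```python
# Minimal PRM training sketch in the style of the TRL docs (v0.13-era API).
# Model and dataset are small placeholders; swap in your own.
from datasets import load_dataset
from transformers import AutoModelForTokenClassification, AutoTokenizer
from trl import PRMConfig, PRMTrainer

model_id = "Qwen/Qwen2-0.5B"
# PRMs are trained as token classifiers over step boundaries (2 labels: good/bad step).
model = AutoModelForTokenClassification.from_pretrained(model_id, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stepwise-supervised data: prompts, step-split completions, per-step labels.
train_dataset = load_dataset("trl-lib/math_shepherd", split="train[:1%]")

training_args = PRMConfig(output_dir="Qwen2-0.5B-PRM", logging_steps=10)
trainer = PRMTrainer(
    model=model,
    args=training_args,
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()
```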
OpenAI is losing money on the $200/month subscription 🤯. It's crazy how expensive it is to run the largest LLMs:
- ChatGPT Pro costs $200/month ($2,400/year) and is still unprofitable for OpenAI due to higher-than-expected usage.
- OpenAI reportedly expected losses of about $5 billion on revenue of $3.7 billion last year, with ChatGPT alone once costing an estimated $700,000 per day to operate. 💸🔥
- They build strong models and do great research. Whether this business model will work in the long run is one of the biggest questions in the AI economy today.
3C3H AraGen Leaderboard today welcomes deepseek-ai/DeepSeek-V3 and 12 other models (including the late gpt-3.5 💀) to the ranking of the best LLMs in Arabic!
Observations:
- DeepSeek-V3 ranked 3rd and is the only open model among the top 5!
- A 14B open model (Qwen/Qwen2.5-14B-Instruct) outperforms gpt-3.5-turbo-0125 (from last year). This shows how far we've come in advancing and supporting Arabic presence within the LLM ecosystem!
- Contrary to what is observed on likelihood-accuracy leaderboards (like OALL/Open-Arabic-LLM-Leaderboard), further fine-tuned models like maldv/Qwentile2.5-32B-Instruct actually performed worse than the original model Qwen/Qwen2.5-32B-Instruct. It's worth noting that the decrease is statistically insignificant, which implies that, at best, out-of-domain fine-tuning does not really hurt the capabilities the model acquired during pretraining. Previous work has addressed this (fine-tuning vs. pretraining), but more investigation is required (any PhDs here? This could be your research question...)
🚀 Releasing a new zeroshot-classifier based on ModernBERT! Some key takeaways:
- ⚡ Speed & efficiency: It's multiple times faster and uses significantly less memory than DeBERTav3. You can use larger batch sizes, and enabling bf16 (instead of fp16) gave me a ~2x speed boost as well.
- 📉 Performance tradeoff: It performs slightly worse than DeBERTav3 on average across my zeroshot classification task collection.
- 🧠 Use cases: I recommend using it for scenarios requiring speed and a larger context window (8k).
- 💡 What's next? I'm preparing a newer version trained on better + longer synthetic data to fully leverage the 8k context window and improve upon the training mix of my older zeroshot-v2.0 models. I also hope that there will be a multilingual variant in the future.
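Usage stays the standard zero-shot classification pipeline; here is a minimal sketch. The model id below is a placeholder, so substitute the actual released checkpoint.

```python
# Minimal sketch: zero-shot classification with a ModernBERT-based NLI model.
# The model id is a placeholder; substitute the actual released checkpoint.
import torch
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/ModernBERT-large-zeroshot",  # placeholder id
    torch_dtype=torch.bfloat16,  # bf16 gave the author a ~2x speed boost
)

result = classifier(
    "The new GPU doubles memory bandwidth over its predecessor.",
    candidate_labels=["hardware", "politics", "cooking"],
)
print(result["labels"][0], round(result["scores"][0], 3))
```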