Adina Yakefu
AdinaY
AI & ML interests
None yet
Recent Activity
updated a Space 3 minutes ago: zh-ai-community/china-ai-policy-research
AdinaY's activity

reacted to clem's post with 🚀🤗 about 3 hours ago

posted an update about 19 hours ago
SEA-VL🔥 an OPEN dataset bridging the AI culture gap in Southeast Asia!
Paper: Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia (2503.07920)
✨1.28M+ culturally relevant images
✨85% accuracy in auto-collected images
✨Tracking underrepresented SEA languages & traditions
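If you want to poke at the data once it lands on the Hub, here's a minimal loading sketch; the repo id below is a placeholder (the post links the paper, not a dataset repo), so check the paper/org page first:

```python
# Hypothetical repo id -- the post doesn't name a dataset repo on the Hub.
from datasets import load_dataset

ds = load_dataset("SEACrowd/sea-vl", split="train", streaming=True)  # stream to skip a full download
for example in ds.take(3):
    print(example)  # expect an image plus culture/language metadata fields
```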

posted an update 1 day ago
Open Sora 2.0 is out 🔥
hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5
✨ 11B with Apache 2.0
✨ Low training cost - $200k
✨ open weights, code and training workflow
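A quick sketch for grabbing the checkpoint; the post links a collection rather than a single repo, so the id below is a guess to verify on the org page, and inference runs through the Open-Sora codebase rather than a stock diffusers pipeline:

```python
# "hpcai-tech/Open-Sora-v2" is an assumed repo id -- check the linked collection.
from huggingface_hub import snapshot_download

ckpt_dir = snapshot_download("hpcai-tech/Open-Sora-v2")
print(ckpt_dir)  # point the Open-Sora inference scripts at this directory
```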

reacted to clefourrier's post with 🚀 2 days ago
The Gemma3 family is out! Reading the tech report, this section was really interesting to me from a methods/scientific fairness POV.
Instead of doing over-hyped comparisons, they clearly state that **results are reported in a setup which is advantageous to their models**.
(Which everybody does, but people usually don't say)
For a tech report, it makes a lot of sense to report model performance when used optimally!
On leaderboards, on the other hand, comparisons will be apples to apples, but in a potentially suboptimal way for a given model family (just as some users interact sub-optimally with models).
Also contains a cool section (6) on training data memorization rates! Important to see if your model will output the training data it has seen verbatim: always an issue for privacy/copyright/... but also very much for evaluation!
Because if your model knows its evals by heart, you're not testing for generalization.

reacted to thomwolf's post with 🚀🔥 2 days ago
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.
And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)
It's beating Claude 3.7 on (competitive) programming, a domain where Anthropic has historically been really strong, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!
And the best part is that we're open-sourcing everything: its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3
Datasets we are releasing (loading sketch after the list):
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
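A quick sketch for loading one of them; the split name is an assumption, so check each dataset card:

```python
# Assumed split name -- see the open-r1/codeforces dataset card for details.
from datasets import load_dataset

cf = load_dataset("open-r1/codeforces", split="train")
print(cf[0])  # one Codeforces problem record (statement, tests, metadata)
```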

posted an update 2 days ago
Spark TTS 🔊 New OPEN TTS model that can clone any voice from just seconds of audio!
Released by SparkAudio community🔥
Model👉 SparkAudio/Spark-TTS-0.5B
Paper👉 Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens (2503.01710)
✨ Supports English & Chinese
✨ BiCodec Speech Codec: Enables precise voice control by separating semantics & speaker attributes
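Since the model ships its own inference code, a hedged sketch is just to pull the checkpoint and run the repo's scripts:

```python
# Download only -- synthesis runs via the scripts in the SparkAudio repo.
from huggingface_hub import snapshot_download

ckpt_dir = snapshot_download("SparkAudio/Spark-TTS-0.5B")
print(ckpt_dir)  # pass this path to the repo's CLI / inference script
```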

posted an update 2 days ago
R1-Omni 🔥 RLVR-powered multimodal LLM released by Alibaba
Model: StarJiaxing/R1-Omni-0.5B
Paper: R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning (2503.05379)
✨ 0.5B with Apache 2.0
✨ Improves emotion recognition with visual and audio cues

posted an update 8 days ago
Babel 🗼 A multilingual LLM supporting 25 languages, released by the Alibaba DAMO team.
Model: Tower-Babel/babel-67c172157372d4d6c4b4c6d5
Paper: Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers (2503.00865)
✨ 9B/83B chat & base
✨ Supports 25 languages: English, Chinese, Hindi, Spanish, Arabic, French, Bengali, Portuguese, Russian, Urdu, Indonesian, German, Japanese, Swahili, Filipino, Tamil, Vietnamese, Turkish, Italian, Javanese, Korean, Hausa, Persian, Thai, and Burmese
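A minimal chat sketch with transformers; the exact repo id inside the linked collection is my assumption, so verify on the Hub first:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tower-Babel/Babel-9B-Chat"  # assumed id -- check the linked collection
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

msgs = [{"role": "user", "content": "Translate 'good morning' into Swahili and Tamil."}]
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```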

reacted to clem's post with 🔥 10 days ago
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using HF中国镜像站, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!
Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise

posted an update 10 days ago
Qilin 🔥 a large-scale multimodal dataset for search, recommendation, and RAG research, released by Xiaohongshu & Tsinghua University
Dataset: THUIR/Qilin
Paper: Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions (2503.00501)
✨Multiple content modalities (text, images, video thumbnails)
✨Rich user interaction data (from Xiaohongshu’s 300M+ MAUs, 70%+ search penetration)
✨Comprehensive evaluation metrics
✨Support for RAG system development
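A hedged loading sketch; Qilin likely ships several configs (search sessions, recommendation, ...), so list them before picking one:

```python
from datasets import get_dataset_config_names, load_dataset

print(get_dataset_config_names("THUIR/Qilin"))   # discover the available subsets
ds = load_dataset("THUIR/Qilin", split="train")  # add name="<config>" if the repo requires one
print(ds[0])
```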

posted an update 10 days ago
CogView-4 is out🔥🚀 The SOTA OPEN text-to-image model by ZhipuAI
Model: THUDM/CogView4-6B
Demo: THUDM-HF-SPACE/CogView4
✨ 6B with Apache 2.0
✨ Supports Chinese & English prompts of ANY length
✨ Generates Chinese characters within images
✨ Creates images at any resolution within a given range
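A sketch assuming diffusers support for CogView4 (it landed around this release); if your diffusers version lacks the pipeline, fall back to the code in the THUDM repo:

```python
import torch
from diffusers import CogView4Pipeline  # requires a recent diffusers release

pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16).to("cuda")
image = pipe(prompt="a red paper lantern with the character 福", width=1024, height=1024).images[0]
image.save("cogview4.png")
```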

posted an update 11 days ago
Exciting releases from the Chinese community this February🔥
👉 zh-ai-community/2025-february-67a35aaa68e97812def5b6ef
MLLM:
✨ Ovis2 by Alibaba
AIDC-AI/ovis2-67ab36c7e497429034874464
✨ Step Audio Chat by StepFun AI
stepfun-ai/step-audio-67b33accf45735bb21131b0b
Audio:
✨ Step Audio TTS by StepFunAI
stepfun-ai/Step-Audio-TTS-3B
✨ InspireMusic by Alibaba
https://huggingface.co/FunAudioLLM
✨ Baichuan Audio by BaichuanAI
baichuan-inc/Baichuan-Audio-Instruct
Video:
✨ Wan2.1 by Alibaba_Wan
Wan-AI/Wan2.1-T2V-14B
✨ Stepvideo-T2V by StepFun AI
stepfun-ai/stepvideo-t2v
✨ SkyReels-V1 by Skywork
Skywork/skyreels-v1-67b34676ff65b4ec02d16307
Diffusion LLM:
✨ LLaDA-8B by Renmin University
GSAI-ML/LLaDA-8B-Instruct
MoE:
✨ Moonlight-16B by MoonshotAI (Kimi)
moonshotai/Moonlight-16B-A3B-Instruct
Reasoning:
✨ TinyR1-32B by Qihoo360
qihoo360/TinyR1-32B-Preview
Dataset:
✨ Chinese DeepSeek R1-Distill data (110k)
Congliu/Chinese-DeepSeek-R1-Distill-data-110k

reacted to fdaudens's post with 🔥 14 days ago
What if AI becomes as ubiquitous as the internet, but runs locally and transparently on our devices?
Fascinating TED talk by @thomwolf on open source AI and its future impact.
Imagine this for AI: instead of black box models running in distant data centers, we get transparent AI that runs locally on our phones and laptops, often without needing internet access. If the original team moves on? No problem - resilience is one of the beauties of open source. Anyone (companies, collectives, or individuals) can adapt and fix these models.
This is a compelling vision of AI's future that solves many of today's concerns around AI transparency and centralized control.
Watch the full talk here: https://www.ted.com/talks/thomas_wolf_what_if_ai_just_works

posted an update 14 days ago
The AI race in the automotive industry is heating up🚗
Li Auto’s research team has released their latest LLM paper👇 LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation (2502.18302)
✨This paper introduces LDGen, which integrates LLMs with diffusion models to enhance text-to-image (T2I) generation capabilities.

posted an update 14 days ago
LLaDA 🔥 an 8B diffusion language model by GSAI Lab, Renmin University
✨Fully trained from scratch, LLaDA delivers performance on par with LLaMA3 8B
Model: GSAI-ML/LLaDA-8B-Instruct
Demo: multimodalart/LLaDA
Paper: Large Language Diffusion Models (2502.09992)
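A loading sketch: LLaDA uses custom modeling code, so trust_remote_code is needed, and sampling is iterative denoising rather than the usual autoregressive generate():

```python
from transformers import AutoModel, AutoTokenizer

model_id = "GSAI-ML/LLaDA-8B-Instruct"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
# Generation runs through the diffusion sampling utilities shipped with the
# repo, not model.generate() -- see the model card for the sampling loop.
```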

reacted to burtenshaw's post with 🔥 16 days ago
Now the HF中国镜像站 agent course is getting real! With frameworks like smolagents, LlamaIndex, and LangChain.
🔗 Follow the org for updates https://huggingface.co/agents-course
This week we are releasing the first framework unit in the course and it’s on smolagents. This is what the unit covers:
- why should you use smolagents vs another library?
- how to build agents that use code
- build multi-agent systems
- use vision language models for browser use
The team has been working flat out on this for a few weeks. Led by @sergiopaniego and supported by smolagents author @m-ric .
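For flavor, a minimal smolagents sketch in the spirit of the unit (class names may shift between versions, so treat this as indicative):

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A code-writing agent with one web-search tool; HfApiModel defaults to a
# hosted inference model (the API surface may evolve between releases).
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would a leopard at full speed take to cross Pont des Arts?")
```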

reacted to freddyaboulton's post with 🔥🚀 16 days ago
Getting WebRTC and WebSockets right in Python is very tricky. If you've tried to wrap an LLM in a real-time audio layer, then you know what I'm talking about.
That's where FastRTC comes in! It makes WebRTC and WebSocket streams super easy with minimal code and overhead.
Check out our org: hf.co/fastrtc
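A minimal sketch following the quickstart pattern: an audio echo handler served over WebRTC with the built-in UI:

```python
from fastrtc import ReplyOnPause, Stream

def echo(audio):
    # audio arrives as a (sample_rate, numpy_array) tuple; yield chunks back
    yield audio

stream = Stream(handler=ReplyOnPause(echo), modality="audio", mode="send-receive")
stream.ui.launch()  # serves a Gradio UI for testing the stream
```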