Arthur Zucker (ArthurZ)

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago: google/gemma-3-27b-it
liked a model 12 days ago: microsoft/Magma-8B
liked a model 27 days ago: Qwen/Qwen2.5-3B

Organizations

HF中国镜像站, Google, Language Technology Research Group at the University of Helsinki, BigScience Workshop, HF中国镜像站 Internal Testing Organization, HuggingFaceM4, HFLeoArthurYounes, Famous, HF中国镜像站 OSS Metrics, Polytech Sorbonne X HF中国镜像站, Code Llama, Music Gen Sprint, huggingPartyParis, adept-hf-collab, gg-hf, Unofficial Mistral Community, State Space Models, Mistral AI EAP, Llava HF中国镜像站, HF中国镜像站 Assignments, mx-test, On-device Squad, Social Post Explorers, hsramall, Paris AI Running Club, gg-tt, HF中国镜像站 Discord Community, LLHF, SLLHF, blhf, Meta Llama, kmhf, nltpt, HF中国镜像站 Party @ PyTorch Conference, s0409, wut?, kernels-community, FAT5, s0225, gg-hf-g

ArthurZ's activity

New activity in mistral-community/pixtral-12b about 1 month ago

Fastest way for inference? · 3 · #28 opened about 1 month ago by psycy
New activity in deepseek-ai/DeepSeek-R1 about 1 month ago
upvoted an article about 1 month ago
Welcome to Inference Providers on the Hub 🔥 · 429
reacted to mitkox's post with 🚀 about 2 months ago · 2499
llama.cpp is 26.8% faster than ollama.
I have upgraded both, and with the same settings I am running the same DeepSeek R1 Distill 1.5B on the same hardware, so it is an apples-to-apples comparison.
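
For anyone who wants to run this kind of comparison themselves, here is a minimal Python sketch; it is not the post author's script, and the GGUF path and ollama model tag are assumptions to adjust for your own setup. It shells out to llama.cpp's llama-bench tool and to ollama run --verbose, both of which report per-phase timings:

```python
# Minimal sketch for collecting comparable timings (assumed setup, not the
# post author's script). Requires llama.cpp's llama-bench on PATH and ollama
# installed with the model already pulled.
import subprocess

# llama.cpp: llama-bench reports prompt-processing and token-generation
# throughput. The GGUF path below is a hypothetical local file.
subprocess.run(
    ["llama-bench", "-m", "DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf"],
    check=True,
)

# ollama: --verbose prints total duration, load duration, prompt eval rate,
# and eval rate after the response.
subprocess.run(
    ["ollama", "run", "deepseek-r1:1.5b", "--verbose"],
    input="Why is the sky blue?\n",
    text=True,
    check=True,
)
```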

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time 7.64 sec
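
As a quick sanity check, the quoted percentages follow directly from the raw timings above. A small Python snippet recomputing them (figures copied verbatim from the post; the pct_faster helper is just for illustration):

```python
# Recompute the post's relative-speedup claims from the raw timings above.

def pct_faster(ratio: float) -> float:
    """Convert a performance ratio (>1.0 = faster) into a 'percent faster' figure."""
    return (ratio - 1.0) * 100.0

# Total duration: ratio of ollama's wall time to llama.cpp's wall time.
print(f"total duration:    {pct_faster(8.69 / 6.85):.1f}% faster")      # ~26.9% ("26.8%")

# Model loading time: 553 ms vs 241 ms.
print(f"model loading:     {553 / 241:.2f}x faster")                    # ~2.29x ("2x")

# Prompt processing throughput: 416.04 vs 42.17 tokens/s.
print(f"prompt processing: {416.04 / 42.17:.1f}x faster")               # ~9.9x ("10x")

# Token generation throughput: 137.79 vs 122.07 tokens/s.
print(f"token generation:  {pct_faster(137.79 / 122.07):.1f}% faster")  # ~12.9% ("13%")
```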

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
upvoted an article about 2 months ago
SmolVLM Grows Smaller – Introducing the 250M & 500M Models! · 154
upvoted an article about 2 months ago
Mastering Long Contexts in LLMs with KVPress · By nvidia and 1 other · 64