忍者

byteprobe

AI & ML interests

RL | NLP | LLM | multimodal | agent

byteprobe's activity

upvoted 3 articles about 11 hours ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

2 days ago

• 231

Article

SigLIP 2: A better multilingual vision language encoder

21 days ago

• 134

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Feb 4

• 113

upvoted 3 papers about 11 hours ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published 18 days ago • 68

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published 24 days ago • 67

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published 15 days ago • 77

liked 2 models about 11 hours ago

CohereForAI/c4ai-command-a-03-2025

Text Generation • Updated about 13 hours ago • 726 • 176

RekaAI/reka-flash-3

Updated about 20 hours ago • 1.99k • 232

upvoted a paper about 11 hours ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 8 days ago • 87

upvoted a paper about 12 hours ago

SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published 22 days ago • 93

upvoted a paper about 23 hours ago

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published 22 days ago • 85

upvoted a paper 1 day ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 22 days ago • 97

liked a Space 1 day ago

2.24k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked 2 datasets 1 day ago

TIGER-Lab/MMLU-Pro

Viewer • Updated Nov 27, 2024 • 12.1k • 44k • 330

SynthLabsAI/Big-Math-RL-Verified

Viewer • Updated 7 days ago • 251k • 5.6k • 149

upvoted a paper 1 day ago

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published 29 days ago • 184

liked a model 1 day ago

tencent/HunyuanVideo-I2V

Image-to-Video • Updated about 24 hours ago • 2.3k • 245

liked a dataset 1 day ago

gaia-benchmark/GAIA

Updated 29 days ago • 9.14k • 259

liked a model 1 day ago

google/gemma-3-27b-it

Image-Text-to-Text • Updated 2 days ago • 85.2k • 526

upvoted a paper 1 day ago

Large Language Diffusion Models

Paper • 2502.09992 • Published 28 days ago • 103