BigScience Data

non-profit

https://bigscience.huggingface.co


AI & ML interests

None defined yet.

Recent Activity


bigscience-data's activity

thomwolf

posted an update 1 day ago

We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder (open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming, a domain Anthropic has historically been really strong at, and it's getting close to o1-mini/R1 on olympiad-level coding with just 7B parameters!

And the best part is that we're open-sourcing everything about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets we are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions

crystina-z

authored a paper 3 days ago

Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face

Paper • 2302.14534 • Published Feb 28, 2023

hen

authored a paper 3 days ago

GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

Paper • 1811.10597 • Published Nov 26, 2018

crystina-z

authored 2 papers 3 days ago

Zero-Shot Listwise Document Reranking with a Large Language Model

Paper • 2305.02156 • Published May 3, 2023 • 1

Evaluating Embedding APIs for Information Retrieval

Paper • 2305.06300 • Published May 10, 2023 • 1

hen

authored 2 papers 3 days ago

Semantic Photo Manipulation with a Generative Image Prior

Paper • 2005.07727 • Published May 15, 2020

Understanding the Role of Individual Units in a Deep Neural Network

Paper • 2009.05041 • Published Sep 10, 2020

crystina-z

authored 2 papers 3 days ago

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

Paper • 2306.01481 • Published Jun 2, 2023 • 1

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

Paper • 2311.18812 • Published Nov 30, 2023

hen

authored a paper 3 days ago

The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

Paper • 2102.01672 • Published Feb 2, 2021

crystina-z

authored 3 papers 3 days ago

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

Paper • 2312.11361 • Published Dec 18, 2023 • 1

Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval

Paper • 2108.08787 • Published Aug 19, 2021

HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

Paper • 2307.16883 • Published Jul 31, 2023

hen

authored a paper 3 days ago

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

Paper • 2404.03214 • Published Apr 4, 2024 • 2

crystina-z

authored 3 papers 3 days ago

Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages

Paper • 2210.09984 • Published Oct 18, 2022 • 2

FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

Paper • 2406.11030 • Published Jun 16, 2024

Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

Paper • 2310.07712 • Published Oct 11, 2023

hen

authored a paper 3 days ago

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Paper • 2403.06009 • Published Mar 9, 2024

crystina-z

authored a paper 3 days ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published 22 days ago • 32

albertvillanova

posted an update 6 days ago

🚀 New smolagents update: Safer Local Python Execution! 🦾🐍

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒

Here's why this matters & what you need to know! 🧵👇

1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
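
To make the allow/block idea concrete, here is a minimal sketch of a deny-list check. It is not the smolagents implementation: it is a static pre-check on the source code using Python's standard `ast` module, whereas the post describes inspecting values during execution. The names `BLOCKED_MODULES`, `BLOCKED_CALLS`, `find_violations`, and `is_safe` are hypothetical and chosen only for illustration.

```python
import ast

# Hypothetical deny-lists for illustration; not the lists smolagents uses.
BLOCKED_MODULES = {"os", "subprocess", "shutil"}
BLOCKED_CALLS = {"exec", "eval", "compile", "__import__"}


def find_violations(source: str) -> list[str]:
    """Return one message per blocked import or blocked builtin call in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Collect the module names referenced by import statements.
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root in BLOCKED_MODULES:
                violations.append(f"import of blocked module: {root}")
        # Flag direct calls to blocked builtins such as exec(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to blocked builtin: {node.func.id}")
    return violations


def is_safe(source: str) -> bool:
    """True when the static check finds nothing on the deny-lists."""
    return not find_violations(source)
```

A check like this rejects `import os` but accepts plain list or arithmetic code. It is easy to evade (for example through `getattr` or aliased names), which is one reason a deny-list alone cannot make local execution fully safe.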

3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5️⃣ The Best Practice: Use Sandboxed Execution 🔐
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6️⃣ Upgrade Now & Stay Safe! 🚀
Check out the latest smolagents release and start building safer AI agents today.

🔗 https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Let’s discuss! 👇

#AI #smolagents #Python #Security

2 replies

Team members 72
