An assembly of 18 European companies, labs, and universities has banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European and widely spoken global languages, designed to be finetuned for retrieval, classification, etc.
🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very, very useful sizes in my opinion
➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported. (A minimal loading sketch follows after this list.)
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, and XLM-RoBERTa on Retrieval, Classification, and Regression (after finetuning on each task separately): EuroBERT punches way above its weight.
📝 Detailed paper covering everything, incl. the training data: FineWeb for English, CulturaX for multilingual data, and The Stack v2 and Proof-Pile-2 for code.
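If you want to poke at the base models directly, here's a minimal loading sketch with 🤗 Transformers. The `EuroBERT/EuroBERT-210m` repository id and the `trust_remote_code=True` flag are my assumptions, based on how custom Llama-derived encoders are typically published on the Hub; check the model cards for the exact snippet.

```python
from transformers import AutoModel, AutoTokenizer

model_id = "EuroBERT/EuroBERT-210m"  # assumed repo id; 610M & 2.1B should follow the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code is assumed here, since the bi-directional Llama-based
# encoder likely ships as custom modeling code on the Hub
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("EuroBERT is a multilingual encoder.", return_tensors="pt")
outputs = model(**inputs)
# Token-level hidden states; pool them (e.g. mean or CLS) to get sentence embeddings
print(outputs.last_hidden_state.shape)
```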
The next step is for researchers to build upon the 3 EuroBERT base models and publish strong models for retrieval, zero-shot classification, etc., for all to use. I'm very much looking forward to it!
Getting WebRTC and WebSockets right in Python is very tricky. If you've tried to wrap an LLM in a real-time audio layer, then you know what I'm talking about.
That's where FastRTC comes in! It makes WebRTC and WebSocket streams super easy, with minimal code and overhead.
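To give a flavor of how little code is involved, here's a minimal echo sketch modeled on the FastRTC quickstart pattern: a handler receives audio and yields it back, and `Stream` wraps it in a WebRTC-backed UI. Treat the exact signatures as assumptions and check the FastRTC docs for the current API.

```python
import numpy as np
from fastrtc import ReplyOnPause, Stream

def echo(audio: tuple[int, np.ndarray]):
    # audio is a (sample_rate, samples) tuple; yielding it back
    # streams the caller's own voice to them after each pause
    yield audio

# ReplyOnPause handles voice activity detection for you: `echo` is
# only invoked once the user stops speaking
stream = Stream(ReplyOnPause(echo), modality="audio", mode="send-receive")
stream.ui.launch()  # built-in Gradio UI for quick local testing
```

In a real app, the body of `echo` is where you'd transcribe the audio, call your LLM, and yield synthesized speech back.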
I just released Sentence Transformers v3.4.0, featuring a memory leak fix, compatibility between the powerful Cached... losses and the Matryoshka loss modifier, and a bunch of fixes & small features.
🪆 Matryoshka & Cached loss compatibility
It is now possible to combine the powerful Cached... losses (which use in-batch negatives & a caching mechanism to allow for arbitrarily large batch sizes & negatives counts) with the Matryoshka loss modifier, which modifies a base loss such that it is trained not only on the maximum dimensionality (e.g. 1024 dimensions), but also on many lower dimensions (e.g. 768, 512, 256, 128, 64, 32). After training, these models' embeddings can be truncated for faster retrieval, etc.
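Concretely, wrapping a Cached loss in the Matryoshka modifier looks like this. A minimal sketch; the base model and the dimensionality list are just examples.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import (
    CachedMultipleNegativesRankingLoss,
    MatryoshkaLoss,
)

model = SentenceTransformer("microsoft/mpnet-base")  # example base model

# The Cached loss processes mini-batches under the hood, so the effective
# batch size (and thus the number of in-batch negatives) can grow far
# beyond what fits on the GPU at once
base_loss = CachedMultipleNegativesRankingLoss(model, mini_batch_size=32)

# The Matryoshka modifier additionally trains the leading 512, 256, ...
# dimensions of each embedding, so they can be truncated after training
loss = MatryoshkaLoss(model, base_loss, matryoshka_dims=[768, 512, 256, 128, 64, 32])
```

Pass `loss` to the `SentenceTransformerTrainer` as usual; at inference time you can load the finished model with a smaller `truncate_dim` for faster retrieval.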
🎞️ Resolve memory leak when Model and Trainer are reinitialized
Due to a circular dependency between Trainer -> Model -> ModelCardData -> Trainer, deleting both the trainer & model still didn't free up the memory. This led to a memory leak in scripts that repeatedly re-initialize the model and trainer.
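For context, this is the kind of pattern that previously kept accumulating memory; the tiny dataset is just a placeholder to make the sketch self-contained.

```python
import gc

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Placeholder data so the sketch runs end-to-end
train_dataset = Dataset.from_dict({
    "anchor": ["What is the capital of France?"] * 8,
    "positive": ["Paris is the capital of France."] * 8,
})

# e.g. a small hyperparameter sweep: before v3.4.0, each iteration's
# model & trainer were kept alive by the circular reference, so memory
# usage grew with every loop
for run in range(3):
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    trainer = SentenceTransformerTrainer(
        model=model,
        train_dataset=train_dataset,
        loss=MultipleNegativesRankingLoss(model),
    )
    trainer.train()
    del trainer, model
    gc.collect()  # as of v3.4.0, this actually frees the memory
```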
➕ New Features
Many new small features, e.g. multi-GPU support for 'mine_hard_negatives', a 'margin' parameter to TripletEvaluator, and Matthews Correlation Coefficient in the BinaryClassificationEvaluator.
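For instance, the new 'margin' parameter might be used like this; the data and the margin value are illustrative, so verify the exact semantics in the evaluator docs.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# A triplet counts as correct when the anchor is closer to the positive
# than to the negative; a margin requires it to be closer by at least
# that amount (the 0.1 here is an illustrative value)
evaluator = TripletEvaluator(
    anchors=["What is the capital of France?"],
    positives=["Paris is the capital of France."],
    negatives=["Berlin is the capital of Germany."],
    margin=0.1,
)
print(evaluator(model))
```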
🐛 Bug Fixes
Also a bunch of fixes, for example one where subsequent batches were not sorted when using the "no_duplicates" batch sampler. See the release notes for more details.