Stefan Schweter's picture

Stefan Schweter PRO

stefan-it

AI & ML interests

Flair Library πŸ’•, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models

Recent Activity

Organizations

Bayerische Staatsbibliothek's profile picture flair's profile picture Flax Community's profile picture dumitrescustefan-org's profile picture GermanT5's profile picture BigScience: LMs for Historical Texts's profile picture BigLAM: BigScience Libraries, Archives and Museums's profile picture Universal NER's profile picture Libre Euro Lingua-Alliance's profile picture Lang UK's profile picture BabyLM Challenge's profile picture hmByT5's profile picture hmByT5 Preliminary's profile picture Blog-explorers's profile picture German Wikipedia LMs's profile picture hmBERT's profile picture hmTEAMS's profile picture HIPE's profile picture hmBERT Tiny's profile picture hmBERT 64k's profile picture LSV @ Saarland University's profile picture GERMATRON's profile picture PleIAs's profile picture German LLM Tokenizers's profile picture Social Post Explorers's profile picture Occiglot's profile picture GERTuraX's profile picture Stefmal's profile picture HFδΈ­ε›½ι•œεƒη«™ Discord Community's profile picture ScaDS.AI German LLM's profile picture ENGEBA's profile picture Nerdy Face's profile picture TensorFlow Model Garden LMs's profile picture

stefan-it's activity

New activity in stefan-it/neobert-ner-conll03 8 days ago

Update model.py

3
#1 opened 11 days ago by
KoichiYasuoka
reacted to csabakecskemeti's post with πŸ”₯ 10 days ago
view post
Post
1938
-UPDATED-
4bit inference is working! The blogpost is updated with code snippet and requirements.txt
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
-UPDATED-
I've played around with an MI100 and ROCm and collected my experience in a blogpost:
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
Unfortunately I've could not make inference or training work with model loaded in 8bit or use BnB, but did everything else and documented my findings.
  • 4 replies
Β·
replied to their post 10 days ago
view reply

Let's see if BERT5urk can make it into @merve 's weekly recap of open AI πŸ€—

posted an update 10 days ago
view post
Post
836
πŸ‡ΉπŸ‡· 😍 I'm very happy to finally announce my new Turkish LM called "BERT5urk":

stefan-it/bert5urk

It is a 1.42B T5-based model, trained with UL2 pretraining objective on the Turkish part of the awesome HuggingFaceFW/fineweb-2 dataset.

Feel free to check it out!
  • 1 reply
Β·
New activity in stefan-it/bert5urk 10 days ago