library_name: peft
base_model: openai/whisper-large-v2
tags:
- generated_from_trainer
- multilingual
- ASR
- Open-Source
language:
- wo
- fr
- en
model-index:
- name: whosper-large
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Test Set
      type: custom
      split: test
      args:
        language: wo
    metrics:
    - name: Test WER
      type: wer
      value: 24.23
    - name: Test CER
      type: cer
      value: 11.35
pipeline_tag: automatic-speech-recognition
new_version: sudoping01/whosper-large-v3
Whosper-large
Model Overview
Whosper-large is a fine-tuned version of openai/whisper-large-v2 optimized for speech recognition in Wolof, Senegal's primary language, while retaining multilingual capabilities. On its Wolof test set it reaches a Word Error Rate (WER) of 24.23% and a Character Error Rate (CER) of 11.35%. The model is intended for researchers, developers, and students working with Wolof speech data, whether for transcribing conversations, building language-learning tools, or conducting research.
Key Strengths
- Multilingual: Strong performance in Wolof, French, and English
- Code-Switching: Handles natural language mixing, especially Wolof-French
- Consistent Results: Maintains quality across different languages
- Open Source: Released under the apache-2.0 license
- African NLP: Supporting African language technology development
Performance Metrics
- WER: 0.2423 (24.23%)
- CER: 0.1135 (11.35%)
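For readers new to these metrics: WER and CER are both edit-distance ratios, computed over words and over characters respectively. The sketch below is a minimal pure-Python illustration, not the evaluation script used for this model. The function names and the example sentences are hypothetical, and this card does not state whether the reported figures were averaged per utterance or computed over the whole test set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example pair, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

Both functions return a fraction, which matches the 0.2423 and 0.1135 style used in this section; multiply by 100 for the percentage form.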
Key Features
- Strong multilingual performance (Wolof, French, English)
- Excellent performance on code-switched content
- Consistent performance across different languages
Limitations
- Outputs in lowercase only
- Limited punctuation support
- Reduced accuracy on poor-quality audio
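Because the model outputs lowercase text with little punctuation, comparing its transcripts against cased, punctuated references will count formatting differences as errors. One possible way to handle this is to normalize both sides before scoring. The helper below is a hypothetical sketch, not part of the whosper package; keeping apostrophes is an assumption made here because they occur inside words in French.

```python
import string

# Assumption: apostrophes are kept because they are part of words
# in French (for example "l'école"). All other ASCII punctuation
# is replaced by a space so that "Wolof-French" becomes two words.
_PUNCTUATION = string.punctuation.replace("'", "")
_TO_SPACE = str.maketrans({ch: " " for ch in _PUNCTUATION})


def normalize_for_scoring(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    return " ".join(text.lower().translate(_TO_SPACE).split())


print(normalize_for_scoring("Bonjour, ça va ?  Waaw!"))
```

Note that `string.punctuation` covers ASCII characters only, so marks such as French guillemets would need to be added to the table separately.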
Training Data
Trained on diverse Wolof speech data:
- ALFFA Public Dataset
- FLEURS Dataset
- Bus Urbain Dataset
- Kallama Dataset
Quick Start Guide
Installation
pip install git+https://github.com/sudoping01/[email protected]
Basic Usage
from whosper import WhosperTranscriber
# Initialize the transcriber
transcriber = WhosperTranscriber(model_id="CAYTU/whosper-large")
# Transcribe an audio file
result = transcriber.transcribe_audio("path/to/your/audio.wav")
print(result)
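To transcribe several files, the call above can be wrapped in a small loop. The following is a sketch under stated assumptions: `transcribe_folder` and the `recordings/` folder are hypothetical names, and the only whosper call relied on is `transcribe_audio(path)` exactly as shown above. The helper accepts any callable, so it does not depend on what that method returns.

```python
from pathlib import Path


def transcribe_folder(transcribe, folder, pattern="*.wav"):
    """Run `transcribe` on every matching file; return {filename: result or None}."""
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        try:
            results[path.name] = transcribe(str(path))
        except Exception as exc:
            # Keep going if a single file fails; record the failure as None.
            print(f"Skipping {path.name}: {exc}")
            results[path.name] = None
    return results


# Intended use with the transcriber from Basic Usage (not executed here):
# from whosper import WhosperTranscriber
# transcriber = WhosperTranscriber(model_id="CAYTU/whosper-large")
# texts = transcribe_folder(transcriber.transcribe_audio, "recordings/")
```

Passing the bound method rather than the transcriber object keeps the helper easy to test with a stub function.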
Training Results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 3.0514 | 1.0 | 1732 | 0.6824 |
| 2.2658 | 2.0 | 3464 | 0.5998 |
| 2.0274 | 3.0 | 5196 | 0.5282 |
| 1.48 | 4.0 | 6928 | 0.4793 |
| 1.1693 | 5.0 | 8660 | 0.4441 |
| 0.8762 | 5.9970 | 10386 | 0.4371 |
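As a quick reading of the table, the short sketch below picks the checkpoint with the lowest validation loss and computes how much that loss fell relative to the first epoch. The values are copied from the rows above; the variable names are illustrative only.

```python
# (epoch, step, training_loss, validation_loss), copied from the table above.
rows = [
    (1.0, 1732, 3.0514, 0.6824),
    (2.0, 3464, 2.2658, 0.5998),
    (3.0, 5196, 2.0274, 0.5282),
    (4.0, 6928, 1.48, 0.4793),
    (5.0, 8660, 1.1693, 0.4441),
    (5.9970, 10386, 0.8762, 0.4371),
]

# Checkpoint with the lowest validation loss.
best = min(rows, key=lambda row: row[3])

# Relative decrease in validation loss from the first logged epoch to the best one.
first_validation_loss = rows[0][3]
relative_drop = (first_validation_loss - best[3]) / first_validation_loss

print(f"Best checkpoint: step {best[1]} (validation loss {best[3]:.4f})")
print(f"Relative drop in validation loss: {relative_drop:.1%}")
```

Validation loss decreases at every logged epoch, so the final checkpoint is also the best one, with a relative drop of roughly 36% from the first epoch.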
Framework Versions
- PEFT: 0.14.1.dev0
- Transformers: 4.48.0.dev0
- PyTorch: 2.5.1+cu124
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Contributing to African NLP
Whosper-large embodies our commitment to open science and the advancement of African language technologies. We believe that by making cutting-edge speech recognition models freely available, we can accelerate NLP development across Africa.
Join our mission to democratize AI technology:
- Open Science: Use and build upon our research - all code, models, and documentation are open source
- Research Collaboration: Integrate Whosper into your research projects and share your findings
- Community Building: Help us create resources for African language processing
- Educational Impact: Use Whosper in educational settings to train the next generation of African AI researchers
License
This model is released under the Apache License 2.0 to encourage research, commercial use, and innovation in African language technologies while ensuring proper attribution and patent protection.
Citation
@misc{whosper2025,
title={Whosper-large: A Multilingual ASR Model for Wolof with Enhanced Code-Switching Capabilities},
author={Seydou DIALLO},
year={2025},
publisher={Hugging Face},
url={https://huggingface.co/CAYTU/whosper-large},
version={1.0}
}
Acknowledgments
Developed by Seydou DIALLO at Caytu Robotics's AI Department, building on OpenAI's Whisper-large-v2. Special thanks to the Wolof-speaking community and contributors advancing African language technology.
Contact Us
For questions or support, contact us:
Email : [email protected]