---
license: mit
language:
- en
- pt
- es
- zh
- nl
- fr
- de
- it
- ja
- pl
pipeline_tag: audio-to-audio
tags:
- audio
- voice
- voice conversion
- singing voice conversion
- vc
- svc
- multilingual
---

# FreeSVC: Zero-shot Multilingual Singing Voice Conversion

**FreeSVC** is a state-of-the-art multilingual singing voice conversion model designed for zero-shot conversion. It converts singing voices across languages without the need for extensive language-specific training. The code is available in the [GitHub repository](https://github.com/freds0/free-svc).

## Supported Languages

| Language   | ID  | Status       | Speech Data | Singing Data |
|------------|-----|--------------|-------------|--------------|
| Chinese    | 0   | ✅ Full      | 255h        | 70h          |
| Dutch      | 1   | ✅ Full      | Part of CML | -            |
| English    | 2   | ✅ Full      | 921h        | 47h          |
| French     | 3   | ✅ Full      | Part of CML | -            |
| German     | 4   | ✅ Full      | Part of CML | -            |
| Italian    | 5   | ✅ Full      | Part of CML | -            |
| Japanese   | 6   | ✅ Full      | 30h         | -            |
| Other\*    | 7   | ⚠️ Partial   | -           | 10h          |
| Polish     | 8   | ✅ Full      | Part of CML | -            |
| Portuguese | 9   | ✅ Full      | Part of CML | -            |
| Spanish    | 10  | ✅ Full      | Part of CML | -            |

\*Note: The "Other" category is used for recordings of vocal techniques without linguistic content.

## Model Overview

FreeSVC builds on an enhanced VITS architecture integrated with Speaker-invariant Clustering (SPIN) and the ECAPA2 speaker encoder. This combination separates speaker characteristics from linguistic content, which supports high-quality, natural-sounding voice conversion across multiple languages.
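
The language IDs in the Supported Languages table can be kept in a small lookup so that calling code does not hard-code the integers. The sketch below is illustrative only: it assumes the IDs in the table are the integer values the model expects as its language input, and the names `LANGUAGE_IDS` and `language_id` are hypothetical helpers, not part of the FreeSVC repository.

```python
# Language IDs copied from the "Supported Languages" table above.
# Assumption: these integers are what the model expects as a language input.
LANGUAGE_IDS = {
    "chinese": 0,
    "dutch": 1,
    "english": 2,
    "french": 3,
    "german": 4,
    "italian": 5,
    "japanese": 6,
    "other": 7,  # vocal techniques without linguistic content
    "polish": 8,
    "portuguese": 9,
    "spanish": 10,
}


def language_id(name: str) -> int:
    """Return the table ID for a language name (hypothetical helper)."""
    key = name.strip().lower()
    if key not in LANGUAGE_IDS:
        supported = ", ".join(sorted(LANGUAGE_IDS))
        raise ValueError(f"Unsupported language {name!r}; expected one of: {supported}")
    return LANGUAGE_IDS[key]


if __name__ == "__main__":
    print(language_id("Portuguese"))
```

Failing loudly on an unknown name, rather than falling back to the "Other" ID, avoids silently conditioning on the wrong language.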

## Training Datasets

FreeSVC was trained on a diverse set of speech and singing datasets covering multiple languages:

| **Dataset**          | **Hours**  | **Language**     | **Type**        |
|----------------------|------------|------------------|-----------------|
| AISHELL-1            | 170h       | Chinese          | Speech          |
| AISHELL-3            | 85h        | Chinese          | Speech          |
| CML-TTS              | 3.1k       | 7 Languages      | Speech          |
| HiFiTTS              | 292h       | English          | Speech          |
| JVS                  | 30h        | Japanese         | Speech          |
| LibriTTS-R           | 585h       | English          | Speech          |
| NUS (NHSS)           | 7h         | English          | Speech, Singing |
| OpenSinger           | 50h        | Chinese          | Singing         |
| Opencpop             | 5h         | Chinese          | Singing         |
| PopBuTFy             | 10h, 40h   | Chinese, English | Singing         |
| POPCS                | 5h         | Chinese          | Singing         |
| VCTK                 | 44h        | English          | Speech          |
| VocalSet             | 10h        | Other            | Singing         |

## Citation

```
@misc{}
```