| | --- |
| | library_name: transformers |
| | base_model: SparkAudio/Spark-TTS-0.5B |
| |
|
| | tags: |
| | - text-to-speech |
| | - tts |
| | - spark-tts |
| | - llm-based-tts |
| | - bambara |
| | - african-languages |
| | - open-source |
| | - mali |
| | - maliba-ai |
| | - text-generation-inference |
| | - transformers |
| | - unsloth |
| |
|
| | extra_gated_fields: |
| | Name: text |
| | Official Email (organization or academic email): text |
| | Affiliation (University, Research Lab, etc): text |
| | I confirm I am a researcher, student, or member of a non-profit organization: checkbox |
| | I confirm I am NOT affiliated with a for-profit company and will not use this model on behalf of one: checkbox |
| | I have read and agree to the MALIBA-AI Research License (Non-Commercial, Non-Profit Use Only): checkbox |
| | I agree to cite the MALIBA-AI paper and all original references when using this model: checkbox |
| |
|
| |
|
| | language: |
| | - bm |
| | language_bcp47: |
| | - bm-ML |
| |
|
| | model-index: |
| | - name: bambara-tts |
| | results: |
| | - task: |
| | name: text-to-speech |
| | type: speech-synthesis |
| | metrics: |
| | - name: Subjective Quality (MOS) |
| | type: mos |
| | value: 4.2 |
| | - name: Speaker Similarity |
| | type: similarity |
| | value: High |
| | - name: Naturalness |
| | type: naturalness |
| | value: 4.1 |
| |
|
| | pipeline_tag: text-to-speech |
| | license: cc-by-nc-sa-4.0 |
| |
|
| | --- |
| | |
| |
|
| | # MALIBA-AI Bambara TTS 🇲🇱 |
| |
|
| | <style> |
| | img { |
| | display: inline; |
| | } |
| | </style> |
| |
|
| |
|
| | ## Model Overview |
| |
|
| | This model provides neural text-to-speech synthesis for Bambara (Bamanankan), the most widely spoken language in Mali. It supports 10 Bambara speaker voices and produces high-fidelity audio without requiring a separate vocoder model. Bambara is spoken by over 14 million people across West Africa, and the model aims for native-level pronunciation. |
| |
|
| | <!-- - Try our live demo on [Hugging Face Spaces](https://huggingface.co/spaces/MALIBA-AI/BambaraText2Speech) --> |
| | - **Available Speakers:** Adama, Moussa, Bourama, Modibo, Seydou, Amadou, Bakary, Ngolo, Ibrahima, Amara |
| |
|
| | > **⚠️ Non-commercial use only.** One of the datasets used to train the base model, Spark-TTS, carries a license that restricts commercial use. |
| |
|
| |
|
| | ## Quick Start |
| |
|
| | ### Installation |
| |
|
| |
|
| | <!-- ```bash |
| | pip install maliba-ai==1.1.1b0 |
| | ``` --> |
| |
|
| | <!-- For development installations: --> |
| |
|
| | ```bash |
| | pip install git+https://github.com/MALIBA-AI/bambara-tts.git |
| | ``` |
| | With uv (faster): |
| |
|
| | <!-- ```bash |
| | uv pip install maliba-ai==1.1.1b0 |
| | ``` --> |
| |
|
| | ```bash |
| | uv pip install git+https://github.com/MALIBA-AI/bambara-tts.git |
| | ``` |
| | Note: if you are running in Google Colab, install these additional dependencies as well: |
| |
|
| | ``` |
| | !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl triton cut_cross_entropy unsloth_zoo |
| | !pip install sentencepiece protobuf huggingface_hub hf_transfer |
| | !pip install --no-deps unsloth |
| | ``` |
| |
|
| | ### Basic Usage |
| |
|
| | ```python |
| | from maliba_ai.tts.inference import BambaraTTSInference |
| | from maliba_ai.config.settings import Speakers |
| | |
| | tts = BambaraTTSInference() |
| | |
| | text = "Aw ni ce. I ka kɛnɛ wa?" |
| | audio = tts.generate_speech(text=text, speaker_id=Speakers.Bourama, output_filename="greeting.wav") |
| | |
| | ``` |
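To compare several voices on the same sentence, the call above can be wrapped in a small loop. The sketch below relies only on the `generate_speech(text=..., speaker_id=..., output_filename=...)` call shown in the example. The helper name `synthesize_for_speakers`, the file-naming scheme, and the name-to-ID mapping are illustrative assumptions, not part of the library.

```python
from typing import Any, Dict, Mapping


def synthesize_for_speakers(tts: Any, text: str, speakers: Mapping[str, Any],
                            prefix: str = "sample") -> Dict[str, str]:
    """Generate one WAV file per speaker and return {speaker name: filename}.

    `tts` is any object exposing generate_speech(text=, speaker_id=,
    output_filename=), as in the Basic Usage example. This helper is a
    hypothetical convenience wrapper, not part of the library.
    """
    written = {}
    for name, speaker_id in speakers.items():
        # Assumed naming scheme: <prefix>_<speaker name in lowercase>.wav
        filename = f"{prefix}_{name.lower()}.wav"
        tts.generate_speech(text=text, speaker_id=speaker_id,
                            output_filename=filename)
        written[name] = filename
    return written
```

With the real model this might be called as `synthesize_for_speakers(BambaraTTSInference(), "Aw ni ce. I ka kɛnɛ wa?", {"Bourama": Speakers.Bourama})`. Only `Speakers.Bourama` appears in the example above, so the attribute names for the other voices should be checked against the package.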
| |
|
| | Note: for more details, see the [README on GitHub](https://github.com/sudoping01/bambara-tts/blob/main/README.md). |
| |
|
| | A [Colab notebook](https://colab.research.google.com/drive/1rJy-mV4Zte33xOWSpkzmY-jBp0T71AFk?usp=sharing) is available for trying the model quickly. |
| |
|
| | ## Technical Specifications |
| |
|
| | ### Architecture |
| | - **Base Model**: Spark-TTS (LLM-based TTS) |
| | - **Foundation**: Qwen2.5-based language model |
| | - **Parameters**: ~500M |
| | - **Audio Format**: 16kHz, 16-bit PCM mono |
| | - **Language Support**: Bambara (bm-ML) |
| |
|
| |
|
| | ## Model Input/Output |
| |
|
| | ### Input |
| | - **Text**: Bambara text in standard orthography |
| | - **Speaker ID**: Choice of 10 available speakers |
| | - **Parameters**: Temperature, top-k, top-p (optional) |
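Because the base model is LLM-based, the optional temperature, top-k, and top-p parameters presumably control how tokens are sampled during generation. The sketch below illustrates the usual meaning of these three parameters on a toy distribution in plain Python. It is a generic illustration, not the model's actual sampling code, and the function name `filter_distribution` and its defaults are assumptions.

```python
import math
from typing import Dict


def filter_distribution(logits: Dict[str, float], temperature: float = 1.0,
                        top_k: int = 0, top_p: float = 1.0) -> Dict[str, float]:
    """Return renormalized token probabilities after temperature, top-k, top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature rescales logits: <1 sharpens, >1 flattens the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda item: item[1], reverse=True)
    # Top-k keeps only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
    # Top-p keeps the smallest prefix whose cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}
```

In general, lower temperature and tighter top-k or top-p values make output more deterministic, while higher values increase variation between runs.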
| |
|
| | ### Output |
| | - **Audio**: 16kHz mono WAV format |
| | - **Quality**: Reported subjective quality (MOS) of 4.2 and naturalness of 4.1, per the model card metadata |
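A generated file can be checked against the documented format (16 kHz, 16-bit PCM, mono) with Python's standard `wave` module. The sketch below is illustrative: the helper names `describe_wav` and `matches_model_card` are hypothetical, and the demo writes one second of silence as a stand-in for a generated file such as `greeting.wav`.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Read a WAV header and report its basic format."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }


def matches_model_card(info: dict) -> bool:
    """True if the file is 16 kHz, 16-bit, mono, as documented above."""
    return (info["sample_rate_hz"] == 16000
            and info["channels"] == 1
            and info["bits_per_sample"] == 16)


# Stand-in for a generated file: one second of 16 kHz, 16-bit mono silence.
demo_path = os.path.join(tempfile.gettempdir(), "silence_demo.wav")
with wave.open(demo_path, "wb") as demo:
    demo.setnchannels(1)
    demo.setsampwidth(2)        # 2 bytes = 16 bits per sample
    demo.setframerate(16000)
    demo.writeframes(b"\x00\x00" * 16000)

info = describe_wav(demo_path)
print(info, matches_model_card(info))
```

Pointing `describe_wav` at a real output file is a quick way to confirm the sample rate before feeding the audio into downstream tools that expect a specific format.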
| |
|
| | ## ⚠️ Known Limitations |
| |
|
| | ### Language Mixing |
| | - **Issue**: Poor performance with French-Bambara code-switching |
| | - **Recommendation**: Use pure Bambara text for optimal results |
| |
|
| | ### Numeric Content |
| | - **Update**: We now use **[Bambara Text Normalizer](https://pypi.org/project/bambara-text-normalizer/)** to preprocess input text. This allows the TTS model to generate speech for **digits, dates, and times** automatically. |
| | - **Integration**: This normalization is already **built into the MALIBA-AI Bambara TTS Inference framework**, so users do not need to manually convert numbers or dates. |
| | - **Recommendation**: For best results, provide text in standard Bambara orthography, but numeric content and dates are handled automatically. |
| |
|
| |
|
| | ## ⚠️ Disclaimer |
| |
|
| | This model provides high-fidelity Bambara speech synthesis intended for research, education, and community applications. The following uses are **strictly forbidden**: |
| |
|
| | - **Voice Impersonation**: Do not clone voices without explicit consent |
| | - **Deceptive Content**: Do not generate misleading or fraudulent audio |
| | - **Illegal Activities**: Do not use for any unlawful purposes |
| |
|
| | By using this model, you agree to uphold ethical standards and legal responsibilities. We **are not responsible** for any misuse and firmly oppose unethical usage of this technology. |
| |
|
| | If you have concerns about potential misuse or need guidance on ethical applications, please contact us at ml.maliba.ai@gmail.com |
| |
|
| |
|
| | ## License |
| |
|
| | **[MALIBA-AI Research Licence](https://huggingface.co/MALIBA-AI/bambara-tts/blob/main/LICENCE.md)**: non-commercial use only, because some of the data used to train the base model restricts commercial use. |
| |
|
| | ### Key Terms (Non-Commercial License) |
| |
|
| | * ✅ **Permitted:** research, academic, educational, and personal use |
| | * ✅ **Required:** clear attribution to the original authors (MALIBA-AI) |
| | * ✅ **Allowed:** modifications and derivative works, **provided they are shared under the same non-commercial, share-alike terms** |
| | * ❌ **Prohibited:** any commercial use, monetization, or revenue-generating activity |
| | * ❌ **Prohibited:** use by companies, startups, or for-profit organizations without prior written permission |
| | * ❌ **Prohibited:** selling, sublicensing, hosting as a paid service, or integrating into commercial products |
| |
|
| | If you have any questions, please contact us at: `ml[dot]maliba[dot]ai[at]gmail.com` |
| |
|
| |
|
| | ## Citation |
| |
|
| | ```bibtex |
| | @software{maliba_ai_bambara_tts, |
| | title={MALIBA-AI Bambara Text-to-Speech: Open-Source High-Quality TTS for Bambara Language}, |
| | author={MALIBA-AI}, |
| | year={2025}, |
| | url={https://huggingface.co/MALIBA-AI/bambara-tts} |
| | } |
| | ``` |
| |
|
| | --- |
| |
|
| | **MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation** |
| |
|
| | *"No Malian Language Left Behind"* |
| |
|
| | --- |
| |
|
| | **Contact Information:** |
| | - Website: [maliba-ai.org](https://maliba-ai.org) |
| | - Email: ml.maliba.ai@gmail.com |
| | - GitHub: [MALIBA-AI](https://github.com/MALIBA-AI) |
| | - HuggingFace: [MALIBA-AI](https://huggingface.co/MALIBA-AI) |