Model
This model is a fine-tuned version of the Spanish checkpoint of the Massively Multilingual Speech (MMS) models, which are lightweight, low-latency TTS models based on the VITS architecture.
It was trained in around 20 minutes on as few as 80 to 150 samples from a Colombian Spanish dataset.
The training recipe is available in the GitHub repository ylacombe/finetune-hf-vits.
Usage
Transformers
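from transformers import pipeline
import scipy.io.wavfile  # import the submodule explicitly; a bare "import scipy" may not expose it
model_id = "ylacombe/mms-spa-finetuned-colombian-monospeaker"
synthesiser = pipeline("text-to-speech", model_id)  # add device=0 if you want to use a GPU
speech = synthesiser("Hola, ¿cómo estás hoy?")
# squeeze() drops a leading batch dimension, if present, so the file is written as mono audio
scipy.io.wavfile.write("finetuned_output.wav", rate=speech["sampling_rate"], data=speech["audio"].squeeze())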
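For more control than the pipeline offers, the model can probably also be loaded directly. The following is a minimal sketch, assuming the checkpoint loads through the `VitsModel` and `AutoTokenizer` classes of Transformers like other MMS-TTS checkpoints. The function name `synthesize` and the seed value are illustrative and not part of the original card. VITS samples noise during generation, so the sketch fixes a seed to make repeated runs produce the same audio.

```python
MODEL_ID = "ylacombe/mms-spa-finetuned-colombian-monospeaker"


def synthesize(text, seed=555):
    """Return (waveform, sampling_rate) for `text`. The seed value is arbitrary."""
    # Imported lazily so that defining the function does not need the heavy dependencies.
    import torch
    from transformers import AutoTokenizer, VitsModel, set_seed

    # Assumption: the checkpoint is a standard VITS model with a matching tokenizer.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = VitsModel.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")

    set_seed(seed)  # fix the random state so the generated audio is repeatable
    with torch.no_grad():
        waveform = model(**inputs).waveform  # shape: (batch, num_samples)

    # Drop the batch dimension and hand back a NumPy array plus the sampling rate.
    return waveform[0].numpy(), model.config.sampling_rate
```

Calling `synthesize("Hola, ¿cómo estás hoy?")` downloads the checkpoint on first use and returns a one-dimensional array that can be written to disk in the same way as the pipeline output above.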
Transformers.js
If you haven't already, you can install the Transformers.js JavaScript library from NPM using:
npm i @xenova/transformers
Example: Generate Spanish speech with ylacombe/mms-spa-finetuned-colombian-monospeaker.
import { pipeline } from '@xenova/transformers';
// Create a text-to-speech pipeline
const synthesizer = await pipeline('text-to-speech', 'ylacombe/mms-spa-finetuned-colombian-monospeaker', {
quantized: false, // Remove this line to use the quantized version (default)
});
// Generate speech
const output = await synthesizer('Hola, ¿cómo estás hoy?');
console.log(output);
// {
// audio: Float32Array(69888) [ ... ],
// sampling_rate: 16000
// }
Optionally, save the audio to a wav file (Node.js):
import wavefile from 'wavefile';
import fs from 'fs';
const wav = new wavefile.WaveFile();
wav.fromScratch(1, output.sampling_rate, '32f', output.audio);
fs.writeFileSync('out.wav', wav.toBuffer());
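The example above stores 32-bit float samples ('32f'). Some audio tools handle 16-bit PCM more reliably, so a conversion step can be useful. The sketch below shows one way to do this in Python using only the standard library. The helper names `float_to_pcm16` and `write_pcm16_wav` are hypothetical, and the code assumes mono audio with samples nominally in the range [-1.0, 1.0].

```python
import struct
import wave


def float_to_pcm16(samples):
    """Clip float samples to [-1.0, 1.0] and scale them to signed 16-bit integers."""
    pcm = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples to `path` as a 16-bit PCM WAV file."""
    pcm = float_to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes per sample = 16 bit
        wav_file.setframerate(sampling_rate)
        # Pack all samples as little-endian signed shorts.
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With the Python pipeline output, this could be called as `write_pcm16_wav("out16.wav", speech["audio"].squeeze(), speech["sampling_rate"])`. As a sanity check on the logged output above, 69888 samples at 16000 Hz correspond to about 4.37 seconds of audio.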