How to use the Polish language with F5-TTS?

#1
by ArkaDio81 - opened

How to use the Polish language with F5-TTS?

When I try it, it does not use Polish accents.

Maybe it needs more fine-tuning, or a better reference audio clip.

But how do I use it? Where should I copy this file, and how can I tell that it is using your model and not the standard one?

You need to replace the model in the directory users/username/.cache/huggingface/hub, or edit the infer.py file.
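To find that directory on a given machine, a small sketch like the one below may help. It assumes the usual huggingface_hub conventions: the HF_HUB_CACHE environment variable, then HF_HOME with a "hub" subfolder, then the default under the home directory, with cached models stored in folders named "models--<org>--<name>". The function names are made up for illustration, and the sketch ignores other overrides such as XDG_CACHE_HOME.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None):
    """Return the directory where Hugging Face Hub downloads are expected to be cached."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        # An explicit cache location takes priority.
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        # Otherwise the cache lives in the "hub" folder under HF_HOME.
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default location (assumption: no other override is set).
    return Path.home() / ".cache" / "huggingface" / "hub"


def cached_repos(cache_dir):
    """List cached model folders, which are named 'models--<org>--<name>'."""
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.name.startswith("models--"))


if __name__ == "__main__":
    cache = hub_cache_dir()
    print(cache)
    for name in cached_repos(cache):
        print("  ", name)
```

Running it prints the cache path and the model folders found there, which shows which folder would have to be replaced.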

In the GUI there is a radio button, and if you select Custom, there are two lists: MODEL CKPT and VOCAB FILE. You can enter the paths there.

Is it just me, or does it not recognize Polish letters?
By the way, when I first tried F5 there were only two languages. How did you train the Polish model?
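One way to check this might be to compare the input text against the model's vocab file. The sketch below assumes the vocab file holds one token per line and that the tokenizer is character-based; both are assumptions about this model, and the helper names are hypothetical. The toy vocab in the example is a stand-in, not the real file.

```python
from pathlib import Path


def load_vocab(vocab_path):
    """Read a vocab file, assuming one token per line (a line holding a space is kept)."""
    raw = Path(vocab_path).read_text(encoding="utf-8")
    lines = (line.rstrip("\r") for line in raw.split("\n"))
    return {line for line in lines if line != ""}


def missing_chars(text, vocab):
    """Return the characters of `text` that are not in `vocab`, in first-seen order."""
    missing = []
    for ch in text:
        if ch not in vocab and ch not in missing:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # Toy stand-in vocab: lowercase ASCII letters and a little punctuation.
    toy_vocab = set("abcdefghijklmnopqrstuvwxyz ,.-")
    print(missing_chars("Cześć, jestem modelem F5-TTS.", toy_vocab))
    # With the real file: missing_chars(text, load_vocab("path/to/vocab.txt"))
```

Any character reported as missing has no token of its own under these assumptions, so the model cannot be expected to pronounce it correctly.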

It only recognises lowercase letters, sorry. You need a better tokenizer, not one based only on characters. You can use an IPA tokenizer with espeak-ng and train a multilingual TTS. It's just a test model.
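As a rough illustration of the IPA route, the sketch below lowercases the text and asks the espeak-ng command-line tool for an IPA transcription. It assumes espeak-ng is installed and on PATH with a Polish voice named "pl". The function names are hypothetical, and this is not necessarily the preprocessing this model uses.

```python
import shutil
import subprocess


def espeak_ipa_command(text, voice="pl"):
    """Build an espeak-ng command that prints IPA for `text` without playing audio."""
    # -q: no audio output, --ipa: print IPA phonemes, -v: voice/language to use.
    return ["espeak-ng", "-q", "--ipa", "-v", voice, text]


def to_ipa(text, voice="pl"):
    """Return espeak-ng's IPA transcription of the lowercased text, or None if espeak-ng is missing."""
    if shutil.which("espeak-ng") is None:
        return None
    result = subprocess.run(
        espeak_ipa_command(text.lower(), voice),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(to_ipa("Cześć, jestem modelem F5-TTS."))
```

A model trained on IPA strings like these would need a vocab file containing the IPA symbols rather than plain letters.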

With a multi model, e.g. multi3, it produces nothing or garbage. I guess that is because it is multilingual. How do I tell it that I want pl->pl?
[Attached screenshots: q1.png, q2.png]

Hello,

I have tried to use your model in Alltalk, but Alltalk could not see the model. After I copied in a "vocos" folder containing the files config.yaml and pytorch_model.bin, Alltalk could load the model, but the generated voice is garbage.

Is it possible to share the files? The ones from the original F5 are not correct for the Polish language.

@Gregniuki

This is the output of the multi model for the Polish text: cześć, jestem modelem F5-TTS.
Output components:
[]
Output values returned:
["Speed set to: 1.0"]
warnings.warn(
Cześć, jestem modelem F5-TTS.
text: 295
ref_text Tylko tyle znaleźli? Nie, głowa jest na miejscu pasażera. Jeśli jesteś zbyt zrzędliwy, żeby odpowiedzieć, po prostu to powiedz.
gen_text 0 cześć, jestem modelem f5-tts.
0%| | 0/1 [00:00<?, ?it/s]pl
pl
tʃˈɛɕtɕ, jˌɛstɛm mɔdˈɛlɛm ˈɛf pʲˈɛɲtɕtˌɛtˌɛˈɛs.
tˌɨlkɔ tˈɨlɛ znalˈɛʑli? ɲʲɛ, ɡwˈɔva jɛst na mʲˈɛjstsu pˌasaʒˈɛra. jˈɛɕli jˌɛstɛʑ zbˈɨd zʒɛndlˈivɨ, ʒˈɛbɨ ˌɔtpɔvʲˈɛdʑɛtɕ, pɔ prˈɔstu tɔ pˈɔvʲɛts. tʃˈɛɕtɕ, jˌɛstɛm mɔdˈɛlɛm ˈɛf pʲˈɛɲtɕtˌɛtˌɛˈɛs.
['tˌɨlkɔ tˈɨlɛ znalˈɛʑli? ɲʲɛ, ɡwˈɔva jɛst na mʲˈɛjstsu pˌasaʒˈɛra. jˈɛɕli jˌɛstɛʑ zbˈɨd zʒɛndlˈivɨ, ʒˈɛbɨ ˌɔtpɔvʲˈɛdʑɛtɕ, pɔ prˈɔstu tɔ pˈɔvʲɛts. tʃˈɛɕtɕ, jˌɛstɛm mɔdˈɛlɛm ˈɛf pʲˈɛɲtɕtˌɛtˌɛˈɛs. ']
Chunk 1: Duration: 725 speed 1.0
100%|██████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.72s/it]
C:\Users\lukas\miniconda3\Lib\site-packages\gradio\processing_utils.py:688: UserWarning: Trying to convert audio automatically from float16 to 16-bit int format.
warnings.warn(warning.format(data.dtype))

Even when I replace the model in users/username/.cache/huggingface/hub with the Polish one, the sound is garbage...

Can you advise how to fix this?
