Audio Spaces
-
📈71
-
Seamless M4T
📞950
-
MusicGen
🎵5.07k Generate music from text descriptions and optional melodies
-
Audioldm Text To Audio Generation
🔊812 Generate audio from text descriptions
-
AudioLDM2 Text2Audio Text2Music Generation
🔊307 Generate audio and waveform video from text
-
AudioSep
🐠222
-
Lp Music Caps
🎵170 Generate captions for music audio
-
Tortoise Tts
🐢313 Expressive Text-to-Speech
-
All In One
📊22
-
XTTS
🐸2.77k Generate speech from text using a reference voice
-
Coqui Bark Voice Cloning
🐸189
-
VALL E X
🎙365 Generate audio from text using voice prompts
-
WavJourney
🔥193
-
Music To Image
🎶264
-
MMS
🌍277 Transform and identify speech with MMS
-
ElevenLabs TTS
🗣614 Generate voice from text using ElevenLabs
-
AudioGPT
🚀289
-
Bark
🐶2.37k Generate realistic audio from text
-
SpeechT5 Speech Recognition Demo
👩36
-
CoquiTTS (Official)
🐸173
-
Whisper
📉2.63k Transcribe audio files or YouTube videos into text
-
Moe TTS
😊658 Generate and convert voice using text and audio inputs
-
YourTTS
🔥17
-
Talking Face Generation with Multilingual TTS
👄557 Generate a talking face video from text in multiple languages
-
OpenAI TTS New
📊562
-
Mustango
🐢167
-
OWSM Demo
🔊55
-
StyleTTS 2
🗣709 Efficient, fast, and natural text to speech with StyleTTS 2!
-
HierSpeech++ (Zero-shot TTS)
⚡399 Generate high-quality speech from text using a prompt audio
-
Video2music
📚21 Generate music for a video based on its content and key
-
Whisper Large V2
🤫188
-
Musicgen Prompt Upsampling
🌖64 Generate music from text prompts 🎶
-
Seamless M4T v2
📞516 Translate speech and text between languages
-
Seamless Streaming
📞319 Translate text between languages
-
Matcha TTS
🍵53 Generate speech from text with speaker selection
-
MusicGen Streaming
🔥279 Generate music from text prompts
-
Resemble Enhance
🚀427 Enhance and denoise your audio files
-
Singing Voice Conversion
🎼261 Transform your voice into a singer's
-
NaturalSpeech2
🎧52 Generate speech with cloned timbre
-
Create Your Own TTS Dataset
🔥22
-
Podcast Transcription
🐢
-
OpenVoice
🤗1.12k Generate voice from text using a reference audio
-
M2UGen Demo
💻94
-
Pheme
📊68
-
ESPnet2 TTS
📈6 Convert text to speech in English, Chinese, or Japanese
-
Whisper-WebUI
🚀38 Generate subtitles and translate audio files
-
Image2SFX Comparison
👂174 Generates audio environment from an image
-
WhisperSpeech
🌬379
-
MetaVoice 1B
🗣144 A demo of MetaVoice 1B, a new TTS model by MetaVoice.
-
TTS Arena V2
🏆906 Vote on the latest TTS models!
-
Whisper Speech X DreamTalk
😽175 Combine voice cloning and portrait lipsync animation
-
Canary 1b
🐤197 Transcribe and translate audio into text
-
SALMONN Audio Questioning
⚡82 Deeply interrogate audio file content
-
MeloTTS
🗣468 Fast, efficient, & multilingual text-to-speech
-
Audio Editing
🎧312 Edit audio with text prompts
-
ChatMusician
💻18
-
xVASynth TTS
🧝73 CPU powered, low RTF, emotional, multilingual TTS
-
NaturalSpeech3 FACodec
🏃180 Convert and reconstruct speech files
-
Hey Gemma
☎25
-
Ratchet + Whisper
🗣70 Convert audio to text
-
AutoSubs
📜3 Automatically add on-screen subs to your videos
-
VoiceCraft
📈161
-
TangoFlux
🚀322 Text to Audio (Sound SFX) Generator
-
Parler-TTS
🥖834 High-fidelity Text-To-Speech
-
Sing an idea ➡️ Music
🔥184 Bring song ideas to life
-
Musicgen Songstarter Demo
👁75 Generate music using descriptions and optional melody audio
-
Whisper JAX
👀145 Transcribe or translate audio from microphone, file, or YouTube
-
AudioLCM
🏢22 Generate audio from text
-
Stable Audio Live Multiplayer
💻160 Generate audio from text prompts
-
Stable Audio Open Zero
🔥449 Generate audio from text prompts
-
Make An Audio 3
🐠14 Generate audio from text prompts
-
Mars5 Space
📉60
-
Tango Music AF
🎵5 Text to Music Generator
-
Jam
🐠16 Generate a song from lyrics and style reference
-
BigVGAN
🔊108 Generate high-quality audio from input audio
-
SenseVoice
🐠89 Transcribe audio with emotions and events
-
PicoAudio
📈28 Generate audio from text descriptions with timestamps
-
Audio Flamingo Demo
📚7
-
MusiConGen
🪩29
-
Mms Zeroshot
🌍20 Transcribe audio in any language using text data
-
GPT SoVITS V2 Pro Plus
🤗205 Generate speech from text using reference audio
-
EzAudio
🟣275 Generate and edit audio from text prompts
-
OpenMusic
🎶214 Generate music from text descriptions
-
Midi Music Generator
🎼550 Generate MIDI music from prompts
-
Whisper Turbo
🤯991 Transcribe audio or YouTube videos into text
-
Realtime Whisper Turbo
🤯341 Realtime implementation of Whisper large turbo
-
Whisper Large V3 Turbo WebGPU
🚀167 ML-powered speech recognition directly in your browser
-
OpenAudio S1
🏆664 Generate speech from text
-
TTS Spaces Arena
🤗451 Blind vote on HF TTS models!
-
Diva Realtime Chat
🗣19 Generate text responses from audio input
-
F5-TTS
🗣2.71k F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
-
MaskGCT TTS Demo
😻260 MaskGCT TTS Demo
-
MelodyFlow
🎵139 Generate music from text descriptions
-
Fish Agent
💬147 An end-to-end (e2e) Voice Language Model by Fish Audio.
-
Nexa Omni Demo
🎧64 Generate text from audio input
-
Kokoro TTS
❤3.06k Upgraded to v1.0!
-
Make Custom Voices With KokoroTTS
⚡124 Make Custom Voices With KokoroTTS
-
Llasa 3b Tts
🔥311 Zero Shot voice cloning with llasa 3b (Unofficial Demo)
-
Llasa 1b Multilingual TTS
🌍12 Generate speech from text with or without cloning a voice
-
Kokoro Text-to-Speech (WebGPU)
🗣347 High-quality speech synthesis powered by Kokoro TTS
-
Hibiki Simple
👄42 High-Fidelity Simultaneous Speech-To-Speech Translation
-
Zonos
🌍410 Generate audio from text with customizable emotions and settings
-
Kokoro Web
🗣77 ML-powered speech synthesis directly in your browser
-
Di♪♪Rhythm
🎶657 Blazingly Fast and Embarrassingly Simple Song Generation
-
Audiobox Aesthetics
📚22 Demo for audiobox-aesthetics
-
Spark TTS
🌖229 A text-to-speech model powered by SparkAudio and Mobvoi.
-
Sesame CSM
🌱852 Conversational speech generation
-
Orpheus TTS
🚀238 Try Orpheus TTS here
-
Canary 1B Flash
🐤43 Canary 1B Flash demo
-
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
🎙216 Generate speech from text using a reference audio
-
AudioMorphix
🌊6 Prepare environment and run Gradio app
-
MegaTTS3 Demo
👋93
-
AudioX
👀158 Generate audio from text and video prompts
-
Vevo for Zero-shot VC, TTS, and More
🐠100 Controllable Zero-Shot Voice Imitation
-
Dia 1.6B
👯1.72k Generate realistic dialogue from a script, using Dia!
-
Aero 1 Audio Demo
💬43 Demo for Aero-1-Audio
-
Voila Demo
💻43 Chat with a voice-clone AI
-
ACE Step
😻602 A Step Towards Music Generation Foundation Model
-
Audio Difficulty Estimator
🎹2 Estimate piano difficulty from audio
-
TIGER Audio Extractor
✂108 Extraction & Reconstruction for Efficient Speech Separation
-
Music2emo
📊15 Towards Unified Music Emotion Recognition across Dimensional
-
SonicVerse
🖼13 Generate detailed music descriptions from audio clips
-
Auffusion
😻41 Audio Gen, Audio Style Transfer and Audio InPainting
-
Chatterbox TTS
🍿1.63k Expressive Zeroshot TTS
-
PlayDiffusion
🎨118 Generate modified audio from text and voice
-
Voice Clone Arena
🏆2 Vote on the latest Voice Clone TTS models!
-
Conversational WebGPU
🚀227
-
Song Generation
🎵524 Generate a custom song from lyrics and optional prompts
-
NotaGen
📊56 Generate classical sheet music in ABC notation
-
Audio Flamingo 3 Demo
🚀85 Audio Flamingo 3 Demo
-
Audio Flamingo 3 Chat
🐠32 Audio Flamingo 3 demo for multi-turn multi-audio chat
-
MSR UTMOS
🐢6 Multiple sampling rate MOS prediction with SFI conv
-
Higgs Audio Demo
🎤392 Higgs Audio Demo
-
sidon_demo_beta
🐋18 Speech restoration demo of Sidon.
-
Canary 1b V2
🐤67 Transcribe and Translate in 25 European Languages
-
SonicMaster – Text-Guided Music Restoration & Mastering
🎧18 Enhance audio using text prompts
-
OLMoASR
🌍6 Open Models and Data for Training Robust Speech Recognition
-
VibeVoice-Large
🏃85 Generate podcast audio from a script and voice samples
-
TaDiCodec TTS AR Qwen2.5 0.5B
📚10 Generate speech from text with voice cloning
-
EchoX
🔥8 An end-to-end speech large language model.
-
VoxCPM 0.5B
🐢43 Generate expressive speech from text with optional voice cloning
-
FireRedTTS2
🔥35 Long-form multi-speaker dialogue generation
-
FireRedASR
🚀4 FireRedASR Demo
-
IndexTTS 2 Demo
🏢578 Generate expressive speech from text with emotion control
-
SongFormer
🎵13 State-of-the-art music analysis with multi-scale datasets
-
Voice Acting TTS
🎭19 TTS for any emotion, now with non-verbal sounds!
-
Omnilingual ASR Media Transcription
🌍201 Transcribe audio or video into text in multiple languages
-
Music Flamingo
🎵65 Upload audio or provide a YouTube URL to get detailed music insights
-
Maya1
📉110 Demo of our new open source model maya1
-
Supertonic (TTS)
⚡193 Lightning-Fast, On-Device TTS
-
Dia2 2B
💨64 Streaming conversational audio in realtime
-
VibeVoice-Realtime-0.5B
🐨27 Generate speech from text