---
title: Open ASR Leaderboard CL
emoji: 🔥
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
tags:
  - leaderboard
---
# Chilean Spanish ASR Leaderboard

> **Simple Gradio-based leaderboard displaying ASR evaluation results for Chilean Spanish models.**

## Quick Start

This is a simplified version that displays results from a CSV file in two tabs (a minimal layout sketch follows the list):

- **🏆 Chilean Spanish ASR Leaderboard**: Shows model rankings based on WER and RTFx metrics
- **📝 About**: Detailed information about the evaluation methodology and datasets
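A minimal sketch of that two-tab layout, assuming Gradio 4.x and a `results.csv` next to `app.py`; the real `app.py` may be organized differently:

```python
# Minimal two-tab layout sketch (illustrative; not the actual app.py)
import gradio as gr
import pandas as pd

df = pd.read_csv("results.csv")  # columns described under "Results Format" below

with gr.Blocks(title="Chilean Spanish ASR Leaderboard") as demo:
    with gr.Tab("Chilean Spanish ASR Leaderboard"):
        gr.Dataframe(value=df)  # the rankings table
    with gr.Tab("About"):
        gr.Markdown("Evaluation methodology and dataset details go here.")

demo.launch()
```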
### Running the Leaderboard

```bash
# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl

# Install dependencies (Gradio pinned to match the Space's sdk_version)
pip install "gradio==4.44.0" pandas

# Run the application
python app.py
```
The application will load results from `results.csv` and display them in a simple, clean interface.
### Results Format

The `results.csv` file should contain the following columns:

- `model_id`: The model identifier (e.g., "openai/whisper-large-v3")
- `wer`: Word Error Rate (lower is better)
- `rtfx`: Inverse Real-Time Factor, i.e., seconds of audio transcribed per second of compute (higher is better)
- Additional metadata columns (dataset, num_samples, etc.)
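An illustrative `results.csv` with placeholder rows; the numbers below are made up to show the format only, and the dataset labels are hypothetical:

```csv
model_id,dataset,wer,rtfx,num_samples
openai/whisper-large-v3,common_voice_es_cl,12.3,85.0,1000
openai/whisper-large-v3,google_chilean_spanish,11.8,85.0,800
openai/whisper-small,common_voice_es_cl,18.9,210.0,1000
```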
### Configuration

- **Title and Content**: Edit `src/about.py` to modify the title, introduction text, and about section
- **Styling**: Customize the appearance in `src/display/css_html_js.py`
- **Data Processing**: Modify the `load_results()` function in `app.py` to change how results are aggregated and displayed (a sketch follows this list)
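A sketch of what `load_results()` might do, assuming one row per (model, dataset) pair as in the format above; this is not the actual implementation in `app.py`:

```python
import pandas as pd

def load_results(path: str = "results.csv") -> pd.DataFrame:
    """Aggregate per-dataset rows into one leaderboard row per model."""
    df = pd.read_csv(path)
    return (
        df.groupby("model_id", as_index=False)
          .agg(avg_wer=("wer", "mean"), avg_rtfx=("rtfx", "mean"))
          .sort_values("avg_wer")   # lower WER ranks higher
          .reset_index(drop=True)
    )
```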
## About the Evaluation

This leaderboard evaluates ASR models on Chilean Spanish using three datasets:

- **Common Voice** (Chilean Spanish subset)
- **Google Chilean Spanish**
- **Datarisas**

Models are ranked by average Word Error Rate (WER) across all datasets, with inverse Real-Time Factor (RTFx) as a secondary metric for inference speed.
## Models Evaluated

- openai/whisper-large-v3
- openai/whisper-large-v3-turbo
- openai/whisper-small
- rcastrovexler/whisper-small-es-cl (fine-tuned on Chilean Spanish)
- nvidia/canary-1b-v2
- nvidia/parakeet-tdt-0.6b-v3
- microsoft/Phi-4-multimodal-instruct
- mistralai/Voxtral-Mini-3B-2507
- elevenlabs/scribe_v1

For the detailed methodology and the complete evaluation framework, see the Modal-based evaluation code in the original repository.
## Citation

```bibtex
@misc{astroza2025chilean,
  title        = {Chilean Spanish ASR Test Dataset},
  author       = {Alonso Astroza},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}
```