---
title: Open Asr Leaderboard CL
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
sdk_version: 4.44.0
tags:
- leaderboard
---

# Chilean Spanish ASR Leaderboard

> **Simple Gradio-based leaderboard displaying ASR evaluation results for Chilean Spanish models.**

## Quick Start

This simplified version displays results from a CSV file in a two-tab interface:
- **🏅 Chilean Spanish ASR Leaderboard**: Shows model rankings based on WER and RTFx metrics
- **📝 About**: Detailed information about the evaluation methodology and datasets

### Running the Leaderboard

```bash
# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl

# Install dependencies
pip install gradio pandas

# Run the application
python app.py
```

The application will load results from `results.csv` and display them in a simple, clean interface.

### Results Format

The `results.csv` file should contain the following columns:
- `model_id`: The model identifier (e.g., "openai/whisper-large-v3")
- `wer`: Word Error Rate (lower is better)
- `rtfx`: Inverse Real-Time Factor, i.e. audio duration divided by processing time (higher is better)
- Additional metadata columns (dataset, num_samples, etc.)
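
For readers unfamiliar with the two metric columns, here is a minimal sketch of their standard definitions. This README does not say how the values in `results.csv` were produced, so the function names, the whitespace tokenization, and the absence of text normalization are assumptions for illustration only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words.

    Returns a fraction (multiply by 100 for a percentage).
    Assumption: words are split on whitespace with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds, processing_seconds):
    """RTFx: seconds of audio transcribed per second of compute."""
    return audio_seconds / processing_seconds
```

With these definitions, a transcript with one wrong word out of three has a WER of 1/3, and a model that transcribes 60 seconds of audio in 2 seconds has an RTFx of 30.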

### Configuration

- **Title and Content**: Edit `src/about.py` to modify the title, introduction text, and about section
- **Styling**: Customize appearance in `src/display/css_html_js.py`
- **Data Processing**: Modify the `load_results()` function in `app.py` to change how results are aggregated and displayed
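
As a starting point for changing the data processing, the sketch below shows one way a results file could be parsed and checked for the required columns. It is a hypothetical stand-in, not the actual `load_results()` code in `app.py`; the name `read_results`, the use of the standard-library `csv` module instead of pandas, and the sample rows are all assumptions.

```python
import csv
import io

REQUIRED_COLUMNS = {"model_id", "wer", "rtfx"}


def read_results(csv_text):
    """Parse CSV text into a list of row dicts with numeric metrics."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"results file is missing columns: {sorted(missing)}")

    rows = []
    for row in reader:
        # Convert the metric columns from strings to floats.
        row["wer"] = float(row["wer"])
        row["rtfx"] = float(row["rtfx"])
        rows.append(row)
    return rows


# Made-up values for illustration only; not real leaderboard results.
SAMPLE_CSV = """model_id,dataset,num_samples,wer,rtfx
example/model-a,dataset_x,100,10.0,50.0
example/model-b,dataset_x,100,20.0,80.0
"""

rows = read_results(SAMPLE_CSV)
```

Failing early on a missing column keeps a malformed file from producing a silently incomplete table.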

## About the Evaluation

This leaderboard evaluates ASR models on Chilean Spanish using three datasets:
- **Common Voice** (Chilean Spanish subset)
- **Google Chilean Spanish** 
- **Datarisas**

Models are ranked by average Word Error Rate (WER) across all datasets, with inverse Real-Time Factor (RTFx) as a secondary metric for inference speed.
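
The ranking rule can be sketched as follows. This is an illustration under stated assumptions rather than the leaderboard's actual code: it assumes an unweighted mean over datasets, assumes RTFx is also averaged, and uses higher RTFx only to break ties on WER. The function name `rank_models` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_models(rows):
    """Average WER and RTFx per model, then sort by average WER (ascending)."""
    per_model = defaultdict(lambda: {"wer": [], "rtfx": []})
    for row in rows:
        per_model[row["model_id"]]["wer"].append(row["wer"])
        per_model[row["model_id"]]["rtfx"].append(row["rtfx"])

    table = [
        {
            "model_id": model_id,
            "avg_wer": mean(values["wer"]),
            "avg_rtfx": mean(values["rtfx"]),
        }
        for model_id, values in per_model.items()
    ]
    # Assumption: ties on WER are broken in favor of the faster model.
    return sorted(table, key=lambda r: (r["avg_wer"], -r["avg_rtfx"]))


# Made-up values for illustration only; not real leaderboard results.
rows = [
    {"model_id": "example/model-a", "dataset": "dataset_x", "wer": 10.0, "rtfx": 50.0},
    {"model_id": "example/model-a", "dataset": "dataset_y", "wer": 20.0, "rtfx": 70.0},
    {"model_id": "example/model-b", "dataset": "dataset_x", "wer": 12.0, "rtfx": 90.0},
    {"model_id": "example/model-b", "dataset": "dataset_y", "wer": 14.0, "rtfx": 110.0},
]
ranking = rank_models(rows)
```

An unweighted mean treats each dataset equally regardless of size; weighting by `num_samples` would be an alternative if larger datasets should count for more.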

## Models Evaluated

- openai/whisper-large-v3
- openai/whisper-large-v3-turbo  
- openai/whisper-small
- rcastrovexler/whisper-small-es-cl (Chilean Spanish fine-tuned)
- nvidia/canary-1b-v2
- nvidia/parakeet-tdt-0.6b-v3
- microsoft/Phi-4-multimodal-instruct
- mistralai/Voxtral-Mini-3B-2507
- elevenlabs/scribe_v1

For detailed methodology and complete evaluation framework, see the Modal-based evaluation code in the original repository.

## Citation

```bibtex
@misc{astroza2024chilean,
  title={Chilean Spanish ASR Test Dataset},
  author={Alonso Astroza},
  year={2025},
  howpublished={\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}
```