---
title: Open ASR Leaderboard CL
emoji: 🔥
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
tags:
  - leaderboard
---
# Chilean Spanish ASR Leaderboard

> **Simple Gradio-based leaderboard displaying ASR evaluation results for Chilean Spanish models.**

## Quick Start

This is a simplified version that displays results from a CSV file in two tabs (a minimal layout sketch follows the list):

- **🏆 Chilean Spanish ASR Leaderboard**: Shows model rankings based on WER and RTFx metrics
- **📝 About**: Detailed information about the evaluation methodology and datasets
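A minimal sketch of that two-tab layout, assuming Gradio 4.x and a `results.csv` next to `app.py`; the real `app.py` may be organized differently:

```python
# Minimal two-tab layout sketch (illustrative; not the actual app.py)
import gradio as gr
import pandas as pd

df = pd.read_csv("results.csv")  # columns described under "Results Format" below

with gr.Blocks(title="Chilean Spanish ASR Leaderboard") as demo:
    with gr.Tab("Chilean Spanish ASR Leaderboard"):
        gr.Dataframe(value=df)  # the rankings table
    with gr.Tab("About"):
        gr.Markdown("Evaluation methodology and dataset details go here.")

demo.launch()
```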
### Running the Leaderboard

```bash
# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl

# Install dependencies (Gradio pinned to match the Space's sdk_version)
pip install "gradio==4.44.0" pandas

# Run the application
python app.py
```
The application will load results from `results.csv` and display them in a simple, clean interface.
### Results Format

The `results.csv` file should contain the following columns:

- `model_id`: The model identifier (e.g., "openai/whisper-large-v3")
- `wer`: Word Error Rate (lower is better)
- `rtfx`: Inverse Real-Time Factor, i.e., seconds of audio transcribed per second of compute (higher is better)
- Additional metadata columns (dataset, num_samples, etc.)
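An illustrative `results.csv` with placeholder rows; the numbers below are made up to show the format only, and the dataset labels are hypothetical:

```csv
model_id,dataset,wer,rtfx,num_samples
openai/whisper-large-v3,common_voice_es_cl,12.3,85.0,1000
openai/whisper-large-v3,google_chilean_spanish,11.8,85.0,800
openai/whisper-small,common_voice_es_cl,18.9,210.0,1000
```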
### Configuration

- **Title and Content**: Edit `src/about.py` to modify the title, introduction text, and about section
- **Styling**: Customize the appearance in `src/display/css_html_js.py`
- **Data Processing**: Modify the `load_results()` function in `app.py` to change how results are aggregated and displayed (a sketch follows this list)
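A sketch of what `load_results()` might do, assuming one row per (model, dataset) pair as in the format above; this is not the actual implementation in `app.py`:

```python
import pandas as pd

def load_results(path: str = "results.csv") -> pd.DataFrame:
    """Aggregate per-dataset rows into one leaderboard row per model."""
    df = pd.read_csv(path)
    return (
        df.groupby("model_id", as_index=False)
          .agg(avg_wer=("wer", "mean"), avg_rtfx=("rtfx", "mean"))
          .sort_values("avg_wer")   # lower WER ranks higher
          .reset_index(drop=True)
    )
```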
## About the Evaluation

This leaderboard evaluates ASR models on Chilean Spanish using three datasets:

- **Common Voice** (Chilean Spanish subset)
- **Google Chilean Spanish**
- **Datarisas**

Models are ranked by average Word Error Rate (WER) across all datasets, with inverse Real-Time Factor (RTFx) as a secondary metric for inference speed.
## Models Evaluated

- openai/whisper-large-v3
- openai/whisper-large-v3-turbo
- openai/whisper-small
- rcastrovexler/whisper-small-es-cl (fine-tuned on Chilean Spanish)
- nvidia/canary-1b-v2
- nvidia/parakeet-tdt-0.6b-v3
- microsoft/Phi-4-multimodal-instruct
- mistralai/Voxtral-Mini-3B-2507
- elevenlabs/scribe_v1

For the detailed methodology and the complete evaluation framework, see the Modal-based evaluation code in the original repository.
## Citation

```bibtex
@misc{astroza2025chilean,
  title        = {Chilean Spanish ASR Test Dataset},
  author       = {Alonso Astroza},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}
```