---
language:
- en
license: mit
tags:
- handwriting-recognition
- ocr
- computer-vision
- pytorch
- crnn
- ctc
- iam-dataset
library_name: pytorch
datasets:
- Teklia/IAM-line
metrics:
- cer
- wer
model-index:
- name: handwriting-recognition-iam
  results:
  - task:
      type: image-to-text
      name: Handwriting Recognition
    dataset:
      name: IAM Handwriting Database
      type: Teklia/IAM-line
    metrics:
    - type: cer
      value: 0.1295
      name: Character Error Rate
    - type: wer
      value: 0.4247
      name: Word Error Rate
---
# Handwriting Recognition
A complete handwriting recognition system: a CNN-BiLSTM model trained with CTC loss on the IAM dataset.
## 📁 Files
### 1. **analysis.ipynb** - Dataset Analysis
- Exploratory Data Analysis (EDA)
- 5 detailed charts saved to `charts/` folder
- Run locally or on Colab (no GPU needed)
### 2. **train_colab.ipynb** - Model Training (GPU)
- **⚡ Google Colab GPU compatible**
- Full training pipeline
- CNN-BiLSTM-CTC model (~9.1M parameters)
- Automatic model saving
- Download trained model for deployment
## 🚀 Quick Start
### Option 1: Analyze Dataset (Local/Colab)
```bash
jupyter notebook analysis.ipynb
```
- No GPU needed
- Generates 5 EDA charts
- Fast (~2 minutes)
### Option 2: Train Model (Google Colab GPU)
1. **Upload `train_colab.ipynb` to Google Colab**
2. **Change runtime to GPU:**
   - Runtime → Change runtime type → GPU (T4 recommended)
3. **Run all cells**
4. **Download trained model** (last cell)
**Training Time:** ~1-2 hours for 20 epochs on T4 GPU
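Before running all cells, it can help to confirm that the GPU runtime from step 2 is actually active. A minimal check, assuming only that `torch` is installed (the `device` name is illustrative and not taken from the notebook):

```python
import torch

# Pick the GPU when the Colab runtime exposes one, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

if device.type == "cuda":
    # Reports the attached accelerator's name.
    print(torch.cuda.get_device_name(0))
```

If this prints `cpu`, the runtime type was probably not switched, and training will run at CPU speed.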
## 📊 Charts Generated
From `analysis.ipynb`:
1. `charts/01_sample_images.png` - 10 sample handwritten texts
2. `charts/02_text_length_distribution.png` - Text statistics
3. `charts/03_image_dimensions.png` - Image analysis
4. `charts/04_character_frequency.png` - Character distribution
5. `charts/05_summary_statistics.png` - Summary table
## 🎯 Model Details
**Architecture:**
- **CNN**: 7 convolutional blocks (feature extraction)
- **BiLSTM**: 2 layers, 256 hidden units (sequence modeling)
- **CTC Loss**: Alignment-free training
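The three components above can be sketched roughly as follows. This is not the class from `train_colab.ipynb`: the name `SketchCRNN`, the channel widths, the pooling layout, the input size, the 79-character alphabet, and the choice of index 0 for the CTC blank are all assumptions made for illustration. Only the 7 convolutional blocks and the 2-layer, 256-unit BiLSTM come from the description.

```python
import torch
import torch.nn as nn


class SketchCRNN(nn.Module):
    """Hypothetical CNN-BiLSTM model for CTC training (not the notebook's class)."""

    def __init__(self, num_chars, hidden=256):
        super().__init__()
        # Assumed channel widths and pooling layout for the 7 conv blocks.
        channels = [1, 64, 128, 256, 256, 512, 512, 512]
        pools = {0: (2, 2), 1: (2, 2), 3: (2, 1), 5: (2, 1)}
        layers = []
        for i in range(7):
            layers += [
                nn.Conv2d(channels[i], channels[i + 1], kernel_size=3, padding=1),
                nn.BatchNorm2d(channels[i + 1]),
                nn.ReLU(inplace=True),
            ]
            if i in pools:
                layers.append(nn.MaxPool2d(pools[i]))
        self.cnn = nn.Sequential(*layers)
        self.rnn = nn.LSTM(channels[-1], hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        # One extra output class for the CTC blank (assumed to be index 0).
        self.fc = nn.Linear(2 * hidden, num_chars + 1)

    def forward(self, images):
        feats = self.cnn(images)          # (B, C, H', W')
        feats = feats.mean(dim=2)         # collapse height -> (B, C, W')
        feats = feats.permute(0, 2, 1)    # (B, W', C): one step per image column
        seq, _ = self.rnn(feats)          # (B, W', 2 * hidden)
        return self.fc(seq)               # (B, W', num_chars + 1)


model = SketchCRNN(num_chars=79)          # 79 is an arbitrary alphabet size
images = torch.randn(2, 1, 64, 256)       # dummy grayscale line images
logits = model(images)                    # (2, 64, 80)

# CTC loss expects log-probabilities shaped (T, B, classes).
log_probs = logits.log_softmax(dim=2).permute(1, 0, 2)
targets = torch.randint(1, 80, (2, 10))   # dummy label ids, 0 reserved for blank
input_lengths = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((2,), 10, dtype=torch.long)
loss = nn.CTCLoss(blank=0, zero_infinity=True)(
    log_probs, targets, input_lengths, target_lengths)
```

CTC only needs each sample's frame count and label length, not a per-character position in the image, which is what "alignment-free" refers to.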
**Dataset:** Teklia/IAM-line (Hugging Face)
- Train: 6,482 samples
- Validation: 976 samples
- Test: 2,915 samples
**Metrics:**
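- **CER** (Character Error Rate): 0.1295 (12.95%), as reported in the model card metadata
- **WER** (Word Error Rate): 0.4247 (42.47%), as reported in the model card metadata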
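Both metrics are edit distances normalized by the reference length, counted over characters for CER and over whitespace-separated words for WER. The sketch below uses only the standard library to show the definition; the function names are illustrative, and the notebook itself may compute these differently (for example with `jiwer`, which is listed in the requirements).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character edits divided by the number of reference characters."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference, hypothesis):
    """Word edits divided by the number of reference words."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)


reference = "the quick brown fox"
hypothesis = "the quick brwn fox"
print(f"CER = {cer(reference, hypothesis):.4f}")  # 1 edit over 19 characters
print(f"WER = {wer(reference, hypothesis):.4f}")  # 1 wrong word out of 4
```

A single wrong character makes the whole word count as an error, which is why WER is typically much higher than CER for the same predictions.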
## 💾 Model Files
After training in Colab:
- `best_model.pth` - Trained model weights
- `training_history.png` - Loss/CER/WER plots
- `predictions.png` - Sample predictions
## 📦 Requirements
```
torch>=2.0.0
datasets>=2.14.0
pillow>=9.5.0
numpy>=1.24.0
matplotlib>=3.7.0
seaborn>=0.13.0
jupyter>=1.0.0
jiwer>=3.0.0
```
## 🔧 Usage
### Load Trained Model
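```python
import torch

# A notebook cannot be imported directly: this assumes the CRNN class (and the
# character mapper class) were copied from train_colab.ipynb into train_colab.py.
from train_colab import CRNN

# Load checkpoint. weights_only=False is needed because the checkpoint pickles
# the character mapper object alongside the weights; only load trusted files.
checkpoint = torch.load('best_model.pth', map_location='cpu', weights_only=False)
char_mapper = checkpoint['char_mapper']

# Create model and restore the trained weights
model = CRNN(num_chars=len(char_mapper.chars))
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()  # inference mode: disables dropout, uses BatchNorm running stats

# Predict
# ... (preprocessing + inference)
```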
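The elided inference step usually ends with CTC decoding. A minimal greedy-decoding sketch in plain Python follows; it assumes the per-frame class ids come from something like `logits.argmax(dim=-1)` and that index 0 is the CTC blank. The function name and the `idx_to_char` mapping are hypothetical and are not the checkpoint's `char_mapper` API.

```python
def ctc_greedy_decode(frame_ids, idx_to_char, blank=0):
    """Collapse repeated frame labels, then drop blanks (greedy CTC decoding)."""
    chars = []
    prev = None
    for idx in frame_ids:
        # Emit a character only when the label changes and is not the blank.
        if idx != prev and idx != blank:
            chars.append(idx_to_char[idx])
        prev = idx
    return "".join(chars)


# Toy alphabet and per-frame predictions (made up for illustration).
idx_to_char = {1: "h", 2: "e", 3: "l", 4: "o"}
frame_ids = [1, 1, 0, 2, 2, 0, 3, 0, 3, 3, 4, 0]
print(ctc_greedy_decode(frame_ids, idx_to_char))  # hello
```

The blank between the two runs of `3` is what lets the decoder keep a doubled letter instead of merging it into one.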
## 📝 Notes
- **GPU strongly recommended** for training (use Colab T4)
- Training on CPU will be extremely slow (~20x slower)
- Colab free tier: sessions are time-limited (about 12 hours at most, not guaranteed), which is enough for 20 epochs at ~1-2 hours
- Model checkpoint includes character mapper for deployment
## 🎓 Training Tips
1. **Start with fewer epochs** (5-10) to test
2. **Monitor CER/WER** - stop if not improving
3. **Increase epochs** if still improving (up to 50)
4. **Save checkpoint** before Colab disconnects
5. **Download model immediately** after training
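Tips 2 to 4 can be combined into a simple early-stopping rule: save whenever validation CER improves, and stop after several epochs without improvement. The sketch below is one possible way to do it; the `EarlyStopping` class, the `patience` value, and the CER numbers are all made up for illustration and are not part of the notebook.

```python
class EarlyStopping:
    """Signal a stop once validation CER has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_cer):
        """Record one epoch; return (improved, should_stop)."""
        improved = val_cer < self.best - self.min_delta
        if improved:
            self.best = val_cer
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return improved, self.bad_epochs >= self.patience


# Made-up validation CER values standing in for a real training loop.
val_cer_history = [0.40, 0.30, 0.25, 0.26, 0.25, 0.27, 0.24]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_cer in enumerate(val_cer_history, start=1):
    improved, should_stop = stopper.step(val_cer)
    if improved:
        pass  # save the checkpoint here so a Colab disconnect loses little work
    if should_stop:
        stopped_at = epoch
        break
print(f"Stopped at epoch {stopped_at}, best CER {stopper.best:.2f}")
```

With these numbers the loop stops at epoch 6 and never sees the later 0.24, so a larger `patience` trades extra training time for a chance to catch late improvements.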
## 📄 License
Code and model: MIT (as declared in the model card metadata)
Dataset: IAM Database (research use)