sato2ru/wordle-solver

sato2ru committed on Mar 8

Commit 0028132

1 Parent(s): be40c77

Upload README.md with huggingface_hub

README.md +191 -0

	@@ -0,0 +1,191 @@

+---
+language: en
+tags:
+  - wordle
+  - pytorch
+  - reinforcement-learning
+  - supervised-learning
+  - game-ai
+  - nlp
+license: mit
+---
+# 🟩 Wordle AI Solver
+Neural network models for solving Wordle puzzles. This repo contains two models — a supervised baseline and a reinforcement learning variant — both deployable via the [live app](https://wordle-solver-tan.vercel.app).
+---
+## Files
+| File | Description |
+|------|-------------|
+| `model_weights.pt` | Supervised model (WordleNet) |
+| `config.json` | Supervised model config |
+| `rl_model_weights.pt` | RL model (REINFORCE-filtered) |
+| `rl_config.json` | RL model config |
+| `answers.json` | 2,315 valid Wordle answers |
+| `allowed.json` | 12,972 valid guess words |
+---
+## Model Comparison
+| | 🧠 Supervised | 🤖 Reinforcement |
+|---|---|---|
+| **Training method** | CrossEntropy on entropy-optimal games | REINFORCE with elite game filtering |
+| **Win rate** | 100% | 98.2% |
+| **Avg guesses (solved games)** | 3.46 | 3.75 |
+| **Opener** | CRANE | CRANE |
+| **Parameters** | ~3.9M | ~3.9M |
+---
+## Architecture
+Both models share the same encoder:
+```
+Input:  390-dim binary vector
+        (26 letters × 5 positions × 3 states: grey/yellow/green)
+Hidden: Linear(390 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
+        Linear(512 → 512) → BatchNorm1d → ReLU → Dropout(0.3)
+        Linear(512 → 256) → BatchNorm1d → ReLU
+Output: Linear(256 → 12972)
+        logits over all 12,972 allowed guess words
+```
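As a sanity check on the layer sizes listed above, the trainable parameter count can be worked out by hand. This is a minimal sketch that assumes the PyTorch defaults (a weight matrix plus a bias per `Linear`, a learnable scale and shift per `BatchNorm1d` feature); the helper names are illustrative and not part of the repo.

```python
# Count trainable parameters for the layer sizes in the architecture block.
# Assumption: bias=True on every Linear, affine=True on every BatchNorm1d.

def linear_params(n_in: int, n_out: int) -> int:
    """Weight matrix (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


def batchnorm_params(n_features: int) -> int:
    """One learnable scale and one learnable shift per feature."""
    return 2 * n_features


# (inputs, outputs) of each Linear layer, in order.
LAYERS = [(390, 512), (512, 512), (512, 256), (256, 12972)]


def total_params(layers=LAYERS) -> int:
    total = 0
    for i, (n_in, n_out) in enumerate(layers):
        total += linear_params(n_in, n_out)
        if i < len(layers) - 1:  # every hidden Linear is followed by BatchNorm1d
            total += batchnorm_params(n_out)
    return total


if __name__ == "__main__":
    print(total_params())
```

With these widths the output layer dominates: 256 × 12,972 weights plus 12,972 biases account for 3,333,804 of the total.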
+Board encoding:
+```python
+vec[letter_index * 15 + position * 3 + state] = 1.0
+# letter_index: 0-25 (a-z)
+# position:     0-4
+# state:        0=grey, 1=yellow, 2=green
+```
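The indexing rule above can be turned into a small encoder. The sketch below is one possible reading, not the backend's actual `encode_board`: the history format (a list of `(guess, states)` pairs with one 0/1/2 state per letter) is an assumption made for illustration, and a plain Python list stands in for the tensor.

```python
from typing import List, Sequence, Tuple

N_LETTERS, N_POSITIONS, N_STATES = 26, 5, 3
VEC_SIZE = N_LETTERS * N_POSITIONS * N_STATES  # 390

# Hypothetical history format: one (guess, states) pair per turn,
# where states[i] is 0=grey, 1=yellow, 2=green for guess[i].
History = Sequence[Tuple[str, Sequence[int]]]


def encode_board(history: History) -> List[float]:
    """Set one bit per (letter, position, state) observed so far."""
    vec = [0.0] * VEC_SIZE
    for guess, states in history:
        for position, (letter, state) in enumerate(zip(guess.lower(), states)):
            letter_index = ord(letter) - ord("a")
            vec[letter_index * 15 + position * 3 + state] = 1.0
    return vec


if __name__ == "__main__":
    board = encode_board([("crane", [0, 1, 0, 0, 2])])
    print(len(board), sum(board))
```

An empty history encodes to the all-zero vector, which is the input the model sees before the first guess.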
+---
+## Training
+### Supervised Model
+Trained on ~10,000 (board_state, best_guess) pairs generated by an entropy-optimal solver that plays all 2,315 Wordle games. The solver picks the guess maximising expected information gain at each step:
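+$$E[\text{Info}] = \sum_{p} P(p) \cdot \log_2\left(\frac{1}{P(p)}\right)$$
+Here $p$ ranges over the feedback patterns a guess can produce, and $P(p)$ is the fraction of the remaining candidate answers that would produce pattern $p$.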
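The expected-information formula can be sketched in plain Python. This is an illustrative version, not the solver shipped with the repo: `feedback` and `entropy_score` are written here from the standard Wordle rules, and the feedback pattern is assumed to be a five-character string of `0` (grey), `1` (yellow) and `2` (green).

```python
import math
from collections import Counter
from typing import Iterable


def feedback(guess: str, answer: str) -> str:
    """Wordle feedback as a 5-char string: 0=grey, 1=yellow, 2=green."""
    result = ["0"] * 5
    unmatched = Counter()
    # First pass: mark greens and count answer letters left unmatched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "2"
        else:
            unmatched[a] += 1
    # Second pass: hand out yellows while unmatched copies remain.
    for i, g in enumerate(guess):
        if result[i] == "0" and unmatched[g] > 0:
            result[i] = "1"
            unmatched[g] -= 1
    return "".join(result)


def entropy_score(guess: str, possible: Iterable[str]) -> float:
    """Expected information (in bits) of `guess` over the remaining answers."""
    possible = list(possible)
    counts = Counter(feedback(guess, answer) for answer in possible)
    total = len(possible)
    # P(p) = n / total, so log2(1 / P(p)) = log2(total / n).
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    print(entropy_score("crane", ["crane", "crank", "slate", "moist"]))
```

A guess that splits four remaining answers into four distinct patterns scores 2 bits, the maximum for that set; a guess that produces the same pattern for every remaining answer scores 0.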
+### RL Model
+1. **Warm start** from supervised weights
+2. **Elite game collection** — greedy rollouts with constraint-filtered action masking, keeping only games solved in ≤3 guesses (~11% hit rate)
+3. **REINFORCE training** — supervised loss on elite (state, action) pairs
+4. **Benchmark** against all 2,315 answers using constraint-filtered suggestion logic
+After the warm start, RL training uses only the reward signal (win or loss, number of guesses) to decide which games to learn from; it does not query the entropy oracle that generated the supervised training data.
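Steps 2 and 3 can be read as REINFORCE with a binary reward: games solved within the guess limit get reward 1, all others get 0, so the policy-gradient update reduces to a cross-entropy loss on the elite pairs. The sketch below shows only the filtering step under that reading; the `Game` container and function names are hypothetical and not taken from the training code.

```python
from typing import Any, List, NamedTuple, Sequence, Tuple


class Game(NamedTuple):
    # One (encoded board state, index of the guessed word) pair per turn.
    steps: List[Tuple[Any, int]]
    solved: bool


def is_elite(game: Game, max_guesses: int = 3) -> bool:
    """A game is elite if it was solved within `max_guesses` turns."""
    return game.solved and len(game.steps) <= max_guesses


def collect_elite_pairs(games: Sequence[Game], max_guesses: int = 3):
    """Flatten the elite games into (state, action) training pairs."""
    pairs = []
    for game in games:
        if is_elite(game, max_guesses):
            pairs.extend(game.steps)
    return pairs


def elite_hit_rate(games: Sequence[Game], max_guesses: int = 3) -> float:
    """Fraction of rollouts that qualify as elite."""
    if not games:
        return 0.0
    return sum(is_elite(g, max_guesses) for g in games) / len(games)
```

The resulting pairs can be fed to the same cross-entropy training loop used for the supervised model, with the guessed word index as the target class.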
+---
+## Inference
+The models are not used as raw classifiers — the backend combines model logits with constraint filtering:
+```python
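+import torch
+# encode_board, filter_words and entropy_score are backend helpers (not defined here)
+# 1. Get the model's top-20 words; squeeze(0) drops a leading batch dimension if present
+with torch.no_grad():
+    logits = model(encode_board(history))
+model_words = [ALLOWED[i] for i in logits.squeeze(0).topk(20).indices.tolist()]
+# 2. Keep only the answers consistent with all previous guesses
+possible = filter_words(ANSWERS, history)
+# 3. Score every candidate by entropy against the remaining possible set
+candidates = model_words + possible
+best = max(candidates, key=lambda w: entropy_score(w, possible))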
+```
+This hybrid approach is why the supervised model reaches a 100% win rate: the neural net proposes candidate guesses, and entropy scoring picks the final move from the model's shortlist plus the remaining possible answers.
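The constraint-filtering step can be sketched as follows: a word stays in the possible set only if, had it been the answer, every past guess would have produced the feedback that was actually observed. This is an illustrative version rather than the backend's `filter_words`; the history format (a list of `(guess, pattern)` pairs with patterns written as `0`/`1`/`2` strings) is an assumption.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def feedback(guess: str, answer: str) -> str:
    """Wordle feedback as a 5-char string: 0=grey, 1=yellow, 2=green."""
    result = ["0"] * 5
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "2"       # right letter, right position
        else:
            unmatched[a] += 1     # answer letter still available for yellows
    for i, g in enumerate(guess):
        if result[i] == "0" and unmatched[g] > 0:
            result[i] = "1"       # right letter, wrong position
            unmatched[g] -= 1
    return "".join(result)


def filter_words(words: Sequence[str], history: Sequence[Tuple[str, str]]) -> List[str]:
    """Keep the words that reproduce every observed (guess, pattern) pair."""
    return [
        w for w in words
        if all(feedback(guess, w) == pattern for guess, pattern in history)
    ]


if __name__ == "__main__":
    words = ["crane", "crank", "slate", "moist"]
    print(filter_words(words, [("crane", "00202")]))
```

With an empty history nothing is ruled out, so the function returns the full word list.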
+---
+## Usage
+```python
+import torch
+import torch.nn as nn
+from huggingface_hub import hf_hub_download
+import json
+REPO_ID = "sato2ru/wordle-solver"
+config  = json.load(open(hf_hub_download(REPO_ID, "config.json")))
+ALLOWED = json.load(open(hf_hub_download(REPO_ID, "allowed.json")))
+class WordleNet(nn.Module):
+    def __init__(self):
+        super().__init__()
+        h = config["hidden"]
+        self.net = nn.Sequential(
+            nn.Linear(390, h), nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
+            nn.Linear(h, h),   nn.BatchNorm1d(h), nn.ReLU(), nn.Dropout(0.3),
+            nn.Linear(h, 256), nn.BatchNorm1d(256), nn.ReLU(),
+            nn.Linear(256, 12972)
+        )
+    def forward(self, x): return self.net(x)
+# Load supervised model
+model = WordleNet()
+model.load_state_dict(
+    torch.load(hf_hub_download(REPO_ID, "model_weights.pt"), map_location="cpu")
+)
+model.eval()
+```
+Or use the live API directly:
+```bash
+curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=supervised" \
+  -H "Content-Type: application/json" \
+  -d '{"history": []}'
+curl -X POST "https://web-production-ea1d.up.railway.app/suggest?model=rl" \
+  -H "Content-Type: application/json" \
+  -d '{"history": []}'
+```
+---
+## Results
+### Supervised — all 2,315 answers (greedy + entropy filter)
+```
+(each █ ≈ 25 games)
+1 guess  :    1
+2 guesses:   59  ██
+3 guesses: 1188  ████████████████████████████████████████████████
+4 guesses: 1010  ████████████████████████████████████████
+5 guesses:   56  ██
+6 guesses:    1
+FAILED   :    0  ✅ 100% win rate
+```
+### RL — all 2,315 answers (greedy + entropy filter)
+```
+(each █ ≈ 25 games)
+1 guess  :    1
+2 guesses:  141  ██████
+3 guesses:  810  ████████████████████████████████
+4 guesses:  893  ████████████████████████████████████
+5 guesses:  343  ██████████████
+6 guesses:   86  ███
+FAILED   :   41  ✅ 98.2% win rate
+```
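The headline numbers can be recomputed from the two histograms above. The sketch below averages guesses over solved games only, which is an assumption about how the table was produced; the function and variable names are illustrative.

```python
# Guess-count histograms copied from the results above.
SUPERVISED = {1: 1, 2: 59, 3: 1188, 4: 1010, 5: 56, 6: 1}
RL = {1: 1, 2: 141, 3: 810, 4: 893, 5: 343, 6: 86}
RL_FAILED = 41


def summarize(hist, failed=0):
    """Return (total games, win rate in %, average guesses over solved games)."""
    solved = sum(hist.values())
    total = solved + failed
    win_rate = 100.0 * solved / total
    avg_guesses = sum(k * n for k, n in hist.items()) / solved
    return total, win_rate, avg_guesses


if __name__ == "__main__":
    for name, hist, failed in [("supervised", SUPERVISED, 0), ("rl", RL, RL_FAILED)]:
        total, win_rate, avg = summarize(hist, failed)
        print(f"{name}: {total} games, {win_rate:.1f}% wins, {avg:.2f} avg guesses")
```

Both histograms sum to 2,315 games. The supervised histogram reproduces the 3.46 average, and the RL histogram gives roughly 3.74 over solved games, within rounding of the 3.75 in the comparison table.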
+---
+## Links
+- **Live App:** [wordle-solver-tan.vercel.app](https://wordle-solver-tan.vercel.app)
+- **GitHub:** [github.com/Jeanwrld/wordle-solver](https://github.com/Jeanwrld/wordle-solver)
+- **Backend:** [github.com/Jeanwrld/wordle-api](https://github.com/Jeanwrld/wordle-api)
+- **Gradio Demo:** [huggingface.co/spaces/sato2ru/wordle](https://huggingface.co/spaces/sato2ru/wordle)
+---
+## License
+MIT