MetaOthello: Pretrained Models & Board Probes

Pretrained GPT checkpoints and linear board probes for MetaOthello, a controlled suite of Othello game variants used to study how transformers organize multiple world models internally.

Paper: MetaOthello: A Controlled Study of Multiple World Models in Transformers

Code: github.com/aviralchawla/metaothello

Training Data: datasets/aviralchawla/metaothello

Repository Contents

GPT Model Checkpoints

Seven trained GPT models, four single-game and three mixed-game, each with checkpoints at epochs 1, 5, 50, 150, and 250:

| Directory | Training Data | Description |
| --- | --- | --- |
| classic/ | Classic (20M) | Standard Othello |
| nomidflip/ | NoMidFlip (20M) | Flip endpoints only |
| delflank/ | DelFlank (20M) | Delete flanked pieces |
| iago/ | Iago (20M) | Classic with scrambled token vocabulary |
| classic_nomidflip/ | Classic + NoMidFlip (20M each) | Mixed: high rule overlap |
| classic_delflank/ | Classic + DelFlank (20M each) | Mixed: low rule overlap |
| classic_iago/ | Classic + Iago (20M each) | Mixed: isomorphic control |

Architecture (all models): 8 layers, d_model=512, 8 attention heads, vocabulary size 66, context window 59 tokens.
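As a sanity check, the architecture above implies a parameter count consistent with the ~101 MB checkpoint size noted in the file layout below. The sketch assumes a standard GPT-2-style block (4×d_model MLP with biases, learned positional embeddings, untied unembedding) stored in fp32; these layout details are assumptions about the implementation, not statements from the paper.

```python
# Rough parameter count for the stated architecture:
# 8 layers, d_model=512, 8 heads, vocab 66, context 59.
# Assumes a GPT-2-style block (4*d_model MLP, biases, learned
# positional embeddings, untied unembedding), fp32 weights.

d_model, n_layers, vocab, ctx = 512, 8, 66, 59
d_mlp = 4 * d_model

attn = 4 * (d_model * d_model + d_model)        # W_Q, W_K, W_V, W_O + biases
mlp = (d_model * d_mlp + d_mlp) + (d_mlp * d_model + d_model)
lns = 2 * 2 * d_model                           # two LayerNorms per block
per_layer = attn + mlp + lns

embeds = vocab * d_model + ctx * d_model        # token + positional embeddings
head = 2 * d_model + d_model * vocab            # final LayerNorm + unembedding

total = n_layers * per_layer + embeds + head
print(f"{total:,} params, about {total * 4 / 1e6:.0f} MB in fp32")
```

Under these assumptions the count comes to roughly 25M parameters, i.e. about 101 MB in fp32, matching the per-checkpoint size listed in the file layout.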

Board Probe Checkpoints

Linear probes trained to predict tile state (Mine / Opponent / Empty) from residual-stream activations at each layer. Located under board_probes/{run_name}/:

| Probe directory | Probes | Description |
| --- | --- | --- |
| board_probes/classic/ | 8 | One per layer, trained on Classic data |
| board_probes/nomidflip/ | 8 | One per layer, trained on NoMidFlip data |
| board_probes/delflank/ | 8 | One per layer, trained on DelFlank data |
| board_probes/iago/ | 8 | One per layer, trained on Iago data |
| board_probes/classic_nomidflip/ | 16 | 8 Classic + 8 NoMidFlip probes |
| board_probes/classic_delflank/ | 16 | 8 Classic + 8 DelFlank probes |
| board_probes/classic_iago/ | 16 | 8 Classic + 8 Iago probes |

Naming convention: {game}_board_L{layer}.ckpt (layers 1–8).
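Each probe is a linear map from a residual-stream vector to per-tile class logits. Below is a minimal sketch of applying such a probe, with a random stand-in weight matrix; the weight shape, checkpoint keys, and class ordering here are assumptions for illustration, so consult the repository code for the actual format.

```python
import numpy as np

# Hypothetical sketch: apply a linear board probe to one residual-stream
# activation vector. Real checkpoints would be loaded with torch.load;
# the (64 tiles x 3 classes, d_model) weight shape is an assumption.

D_MODEL, TILES, CLASSES = 512, 64, 3  # classes: Mine / Opponent / Empty

rng = np.random.default_rng(0)
W = rng.normal(size=(TILES * CLASSES, D_MODEL))  # stand-in for probe weights
b = np.zeros(TILES * CLASSES)                    # stand-in for probe bias

def probe_board(resid):
    """Map one residual-stream vector (d_model,) to a 64-tile prediction."""
    logits = (W @ resid + b).reshape(TILES, CLASSES)
    return logits.argmax(axis=-1)  # one class index (0/1/2) per tile

board = probe_board(rng.normal(size=D_MODEL))
print(board.shape)  # (64,): one predicted class per tile
```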

Coming soon: Game ID probes (linear classifiers that predict which game variant is being played), along with training and plotting scripts, will be uploaded in a future update.

File Layout

aviralchawla/metaothello/
├── classic/
│   ├── epoch_1.ckpt          # ~101 MB each
│   ├── epoch_5.ckpt
│   ├── epoch_50.ckpt
│   ├── epoch_150.ckpt
│   └── epoch_250.ckpt
├── nomidflip/                # same structure
├── delflank/                 # same structure
├── iago/                     # same structure
├── classic_nomidflip/        # same structure
├── classic_delflank/         # same structure
├── classic_iago/             # same structure
└── board_probes/
    ├── classic/
    │   ├── classic_board_L1.ckpt    # ~396 KB each
    │   └── ...through L8
    ├── classic_nomidflip/
    │   ├── classic_board_L1.ckpt    # ...through L8
    │   └── nomidflip_board_L1.ckpt  # ...through L8
    └── ...                          # other runs
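For scripted downloads or integrity checks, the layout above can be enumerated programmatically. The run names, epoch numbers, and probe naming convention below come from this card; the helper itself is only an illustrative sketch.

```python
# Enumerate the expected checkpoint paths in this repository,
# following the file layout and naming conventions described above.
RUNS = [
    "classic", "nomidflip", "delflank", "iago",
    "classic_nomidflip", "classic_delflank", "classic_iago",
]
EPOCHS = [1, 5, 50, 150, 250]

# GPT checkpoints: 7 runs x 5 epochs = 35 files
model_ckpts = [f"{run}/epoch_{e}.ckpt" for run in RUNS for e in EPOCHS]

# Board probes: one probe per game per layer; mixed runs split into
# their two constituent games (e.g. "classic_iago" -> classic + iago),
# giving 4*8 + 3*16 = 80 files
probe_ckpts = [
    f"board_probes/{run}/{game}_board_L{layer}.ckpt"
    for run in RUNS
    for game in run.split("_")
    for layer in range(1, 9)
]

print(len(model_ckpts), len(probe_ckpts))
```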

Usage

Download via the MetaOthello CLI (recommended)

# Clone the repository
git clone https://github.com/aviralchawla/metaothello.git
cd metaothello && pip install -e .

# Download all pretrained assets (models + data + board probes)
make download-all

# Download all GPT models
make download-models

# Download all board probes
make download-probes

# Download selectively
make download-model RUN_NAME=classic              # Single model
make download-board-probe RUN_NAME=classic_iago   # Single run's probes

Models are placed into data/{run_name}/ckpts/ and probes into data/{run_name}/board_probes/.

Download with huggingface_hub

from huggingface_hub import snapshot_download

# Download a single model's final checkpoint
snapshot_download(
    repo_id="aviralchawla/metaothello",
    repo_type="model",
    allow_patterns=["classic/epoch_250.ckpt"],
    local_dir="./data",
)

# Download all board probes for a run
snapshot_download(
    repo_id="aviralchawla/metaothello",
    repo_type="model",
    allow_patterns=["board_probes/classic_nomidflip/*.ckpt"],
    local_dir="./data",
)

Load a model

from metaothello.mingpt.utils import load_model_from_ckpt

# Load as a minGPT model
model = load_model_from_ckpt("data/classic/ckpts/epoch_250.ckpt", vocab_size=66, block_size=59)

# Load as a TransformerLens HookedTransformer (for mechanistic interpretability)
model = load_model_from_ckpt(
    "data/classic/ckpts/epoch_250.ckpt", vocab_size=66, block_size=59, as_tlens=True
)

Training From Scratch

To retrain models or probes, see the full instructions in the GitHub README.

Note: Board probe training requires caching residual-stream activations locally (>1 TB total across all models), which is why only the trained probe checkpoints (~396 KB each) are hosted here rather than the cached activations. Pretrained probes are sufficient for all analysis scripts in the repository.

Citation

@article{metaothello2025,
  title   = {MetaOthello: A Controlled Study of Multiple World Models in Transformers},
  author  = {Aviral Chawla and Galen Hall and Juniper Lovato},
  journal = {arXiv preprint},
  year    = {2025}
}

License

MIT
