TwentyQ — The World's Smallest Chat Model

A 2-bit quantized neural network that plays Twenty Questions. Think of any object, and the model will try to guess it by asking up to 20 yes-or-no questions. It knows 1,200 objects and has 156 questions to choose from.

The architecture is a single-layer associative network trained via Hebbian learning ("neurons that fire together wire together") on millions of human conversations. The approach predates the transformer by about 30 years.

Stats

  • Parameters: 187,200
  • Precision: 2-bit (Q2_0)
  • Model size: 187 KB weights + 27 KB vocab
  • Architecture: single-layer associative network
  • Context window: 20 turns
  • Output classes: 1,200
  • Features: 156

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("david-ar/20q", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("david-ar/20q", trust_remote_code=True)
model.set_vocab(tokenizer.questions, tokenizer.targets)  # attach the question/object vocabulary
model.play()  # interactive CLI game

Pipeline Usage

Works with the standard text-generation pipeline and chat templates, just like the big models:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("david-ar/20q", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("david-ar/20q", trust_remote_code=True)
model.set_vocab(tokenizer.questions, tokenizer.targets)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

messages = [
    {"role": "system", "content": "Think of something and I'll guess it in 20 questions."},
]

while True:
    output = pipe(messages, max_new_tokens=100, return_full_text=True)
    messages = output[0]["generated_text"]  # full conversation, including the new assistant turn
    print(f"AI: {messages[-1]['content']}")

    if "I win" in messages[-1]["content"] or "stumped" in messages[-1]["content"]:
        break

    messages.append({"role": "user", "content": input("You: ")})

Valid responses

  • First question: Animal, Vegetable, Mineral, Other
  • Regular questions: Yes, No, Probably, Doubtful, Maybe, Unknown
  • Guesses: Yes, No, Close

Training Data

Trained on david-ar/20q-dataset, a corpus of 9,600 Twenty Questions conversations covering 1,200 objects across 156 features. Answers include graded confidence levels (Yes, No, Probably, Doubtful) rather than binary labels, giving the model finer-grained signal for learning association strengths.
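As a rough illustration of how graded answers could drive a Hebbian-style update before quantization, consider the sketch below. The numeric confidence scale and the learning-rate parameter are assumptions for illustration; the actual training procedure and any numeric encoding in the dataset are not documented here.

```python
# Illustrative confidence scale for graded answers (assumed, not taken
# from the dataset spec).
CONFIDENCE = {"Yes": 1.0, "Probably": 0.5, "Doubtful": -0.5, "No": -1.0}

def hebbian_update(weights, obj, question, answer, lr=0.1):
    """Hebbian-style update: strengthen the (question, object) association
    in proportion to the answer's confidence ("fire together, wire together")."""
    key = (question, obj)
    weights[key] = weights.get(key, 0.0) + lr * CONFIDENCE[answer]
    return weights

# One "Probably" answer nudges the association weight up by lr * 0.5.
w = hebbian_update({}, "cat", "Is it alive?", "Probably")
```

After training, continuous association weights like these would be quantized down to the model's 2-bit representation.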

How It Works

The model is a weight matrix mapping 156 features (questions) to 1,200 output classes (objects). Each weight is 2 bits encoding polarity and strength. Inference is a scored lookup — no matrix multiplication, no attention, no backprop. Just XOR and addition.
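A minimal sketch of what a 2-bit polarity/strength encoding could look like. The bit layout here (high bit = polarity, low bit = strength) is an assumption for illustration, not the documented packing of the released weights:

```python
# Hypothetical 2-bit weight layout: high bit = polarity (1 = positive
# association), low bit = strength (1 = strong). The released weights
# may pack these differently.

def decode_weight(w2):
    """Decode a 2-bit weight into (polarity, strength)."""
    polarity = +1 if (w2 >> 1) & 1 else -1  # direction of the association
    strength = 2 if w2 & 1 else 1           # strong weights count double
    return polarity, strength

def unpack_byte(b):
    """Unpack the four 2-bit weights stored in one byte, high bits first."""
    return [(b >> shift) & 0b11 for shift in (6, 4, 2, 0)]
```

With four weights per byte, a 156 × 1,200 matrix fits in roughly 47 KB, which is consistent with the model being dominated by its vocabulary and metadata rather than arithmetic.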

Question selection uses an information-theoretic splitting strategy: at each turn, the model picks the question whose answers most evenly divide the remaining candidate objects.
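The splitting strategy can be sketched as a toy even-split heuristic over a yes/no knowledge base. This is an illustration under simplified assumptions (binary answers, a dict-based knowledge base named `KB`); the model's actual selection over graded answers is more nuanced:

```python
def pick_question(candidates, questions, answer_for):
    """Pick the question whose yes/no split of the remaining
    candidates is closest to an even half/half division."""
    best_q, best_imbalance = None, float("inf")
    for q in questions:
        yes = sum(1 for obj in candidates if answer_for(obj, q))
        imbalance = abs(2 * yes - len(candidates))  # 0 means a perfect split
        if imbalance < best_imbalance:
            best_q, best_imbalance = q, imbalance
    return best_q

# Toy knowledge base: which objects answer "yes" to which question.
KB = {
    "Is it alive?": {"cat", "dog"},
    "Does it have wheels?": {"car"},
}
best = pick_question(["cat", "dog", "rock", "car"], list(KB),
                     lambda obj, q: obj in KB[q])
# "Is it alive?" splits the four candidates 2/2; "Does it have wheels?"
# splits them 1/3, so the even split wins.
```

An even split maximizes the information gained per question, which is why 20 binary questions can in principle distinguish up to 2^20 objects.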

Scoring compares the player's answer polarity against each object's stored polarity for that question. Matching polarities add to the score; mismatches subtract. Strong weights count double.
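Under the same assumed 2-bit layout, the polarity-match scoring described above might look like the following. Both the answer-to-polarity mapping and the bit layout are hypothetical sketches, not the shipped implementation:

```python
def answer_polarity(answer):
    """Map a graded answer to a coarse polarity (illustrative mapping)."""
    return +1 if answer in {"Yes", "Probably"} else -1

def score_deltas(polarity, weights_2bit):
    """Per-object score changes for one answered question.
    Matching polarities add, mismatches subtract; strong weights count double."""
    deltas = []
    for w in weights_2bit:
        obj_polarity = +1 if (w >> 1) & 1 else -1  # high bit: stored polarity
        strength = 2 if w & 1 else 1               # low bit: strong flag
        deltas.append(polarity * obj_polarity * strength)
    return deltas

# A "Yes" answer against weights [strong-yes, weak-yes, strong-no, weak-no]
# yields deltas [+2, +1, -2, -1].
```

Accumulating these deltas across turns gives each object a running score; the top-scoring survivor becomes the model's guess.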

Why This Exists

Mostly to see if it could be done. The result is a 214 KB model that plays a conversational guessing game, loads through from_pretrained, and runs through pipeline("text-generation") with chat templates. Every bit of it works the same as models a million times its size.

Also: 2-bit quantization was cool before it was cool.

Safeguards and Harmlessness

Testing found the model will identify weapons and hazardous materials (e.g., "a gun", "a sword", "a bullet", "alcohol", "tobacco") when guided through adversarial questioning. The model does not implement refusal behavior and will engage with all lines of questioning without restriction.

We observed that the model is susceptible to complete knowledge extraction through systematic querying. An adversary can recover all 1,200 objects and their full attribute vectors through repeated gameplay sessions. No rate limiting or query obfuscation is currently implemented.

Bias

Training data reflects the cultural context of English-speaking internet users circa 2005-2008. Object coverage skews toward Western consumer and domestic categories. The model's four-category ontology (Animal, Vegetable, Mineral, Other) imposes a reductive classification framework that may not generalize across cultural contexts.

Mitigations

The model's output space is constrained to a fixed vocabulary of 1,200 objects and 156 questions. It cannot generate free-form text, follow instructions, or synthesize novel information. Informed by these constraints, we have assessed the model's risk profile as low.

Limitations

  • Knows exactly 1,200 things. If you think of something obscure, it will be stumped.
  • The training data is from the mid-2000s. It doesn't know about smartphones, streaming services, or anything invented after ~2008.
  • 2-bit weights mean each association is one of only 4 possible values.
  • Cannot learn or update.