# Steelman-14B-Ada v0.3 -- GGUF
Quantized GGUF of Steelman-14B-Ada v0.3 for use with Ollama, llama.cpp, or any GGUF-compatible runtime.
## Download

| File | Size | Quantization | Notes |
|---|---|---|---|
| steelman-r7-coder-base-q8_0.gguf | ~15 GB | Q8_0 (8-bit) | Highest quality, recommended |
Legacy GGUFs from prior rounds are kept for reproducibility but are superseded by the R7 version.
## Benchmark

- **62.4%** on Steelman Eval v4 (754 prompts, 10 categories, strict GNAT compilation + functional scoring) -- outperforms Claude Opus 4.6 (12.7%), GPT-5.4 (12.9%), and every other frontier model tested.
- **85.4%** compile rate on HumanEval-Ada -- the highest of any model tested, including all frontier models.
See the model card for full benchmark tables, per-category breakdown, and evaluation methodology.
## Usage with Ollama

- Download `steelman-r7-coder-base-q8_0.gguf`.
- Create a `Modelfile`:
```
FROM ./steelman-r7-coder-base-q8_0.gguf
TEMPLATE "Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{{ .Prompt }}

### Response:
{{ .Response }}"
SYSTEM "You are an expert Ada 2022 and SPARK programmer."
PARAMETER stop "### Instruction:"
PARAMETER temperature 0.0
PARAMETER num_ctx 32768
```
- Create and run:

```shell
ollama create steelman -f Modelfile
ollama run steelman "Write an Ada procedure implementing a producer-consumer pattern with protected objects"
```
**Important:** This model uses an Alpaca prompt template, not ChatML. Using the wrong template will severely degrade output quality.
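For programmatic use, a running Ollama instance can be queried over its local REST API (`POST /api/generate`). A minimal sketch; the endpoint, payload shape, and default port follow Ollama's documented API, and the model name `steelman` assumes the `ollama create` step above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(instruction: str) -> dict:
    # Ollama applies the Modelfile's Alpaca TEMPLATE server-side,
    # so we send only the raw instruction text.
    return {
        "model": "steelman",
        "prompt": instruction,
        "stream": False,                  # return one JSON object instead of a stream
        "options": {"temperature": 0.0},  # match the Modelfile's deterministic setting
    }

def generate(instruction: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(instruction)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires `ollama serve` to be running, e.g.:
# generate("Write an Ada function that reverses a String.")
```

Because the template lives in the `Modelfile`, clients never need to reproduce the Alpaca wrapper themselves when going through Ollama.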
## Usage with llama.cpp

```shell
llama-cli -m steelman-r7-coder-base-q8_0.gguf \
  --temp 0 -n 2048 -c 32768 \
  -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Write an Ada 2022 generic package implementing a bounded stack with SPARK Pre/Post contracts.

### Response:
"
```
## Performance
Q8_0 vs full precision: 0.5 percentage point difference on the 754-prompt eval (61.9% Q8_0 vs 62.4% fp16). Quantization has negligible impact on output quality.
Runs on any machine with 24GB+ RAM. GPU optional -- the model runs from system RAM via Ollama. On 64GB systems, the model stays in page cache for sub-second reloads between sessions.
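The ~15 GB file size is consistent with the quantization format: Q8_0 stores each block of 32 weights as 32 one-byte values plus a 2-byte scale, i.e. about 8.5 bits per weight. A back-of-the-envelope check, where the 14.8B parameter count is an approximation for a 14B-class model:

```python
# Rough size estimate for a Q8_0 quantization of a 14B-class model.
params = 14.8e9        # approximate parameter count (assumption)
bits_per_weight = 8.5  # Q8_0: 8-bit weights + a per-block fp16 scale
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB")  # roughly 15.7 GB, in line with the ~15 GB file above
```

The same arithmetic explains the 24 GB RAM floor: the weights alone need ~16 GB, and the KV cache for a 32K context plus OS overhead consumes the rest.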
## License
Apache 2.0 (same as base model Qwen2.5-Coder-14B).