# gpt-oss-20b-reap-0.4-bf16

This repository contains a bfloat16 version of the sandeshrajx/gpt-oss-20b-reap-0.4-mxfp4 model.

## Model Description

This model is a bfloat16 (dequantized) version of the REAP-pruned, MXFP4-quantized openai/gpt-oss-20b model.

- Original Model: openai/gpt-oss-20b
- Pruning Method: REAP expert pruning with a compression ratio of 0.4
- Original Quantization Method: MXFP4 weight-only quantization
- Current Format: bfloat16 (dequantized; see the sketch after this list)
- Model Size: ~14B parameters after pruning
- Calibration Dataset (pruning): theblackcat102/evol-codealpaca-v1
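
For reference, producing the bf16 checkpoint from the MXFP4 repository can be sketched as below. This is a minimal sketch, assuming a recent `transformers` release with MXFP4 support for gpt-oss (`Mxfp4Config(dequantize=True)`); the output directory name is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Mxfp4Config

src = "sandeshrajx/gpt-oss-20b-reap-0.4-mxfp4"

# Dequantize the MXFP4 weights to bf16 while loading
# (assumed API; requires a transformers version with gpt-oss/MXFP4 support).
model = AutoModelForCausalLM.from_pretrained(
    src,
    quantization_config=Mxfp4Config(dequantize=True),
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(src)

out_dir = "gpt-oss-20b-reap-0.4-bf16"  # illustrative local path
model.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```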

## Usage

You can load this model with the `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sandeshrajx/gpt-oss-20b-reap-0.4-bf16"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; drop this argument to load on CPU
)
```
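
A minimal generation sketch (the prompt and sampling settings here are illustrative, not part of the original card):

```python
# Build a chat prompt with the tokenizer's chat template and generate a reply.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that at bf16 precision the ~14B parameters occupy roughly 28 GB for the weights alone, so plan GPU memory (or CPU offloading via `device_map`) accordingly.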

## License

The base openai/gpt-oss-20b model is released under the Apache 2.0 license. This pruned and dequantized derivative is distributed under the same terms unless stated otherwise in the repository.
