Upload ChessLM Qwen3 Neuron model in AWS format structure

Weights Information

The weights for this model are bundled inside model.pt (17 GB).

In the AWS Neuron reference format, weights are typically stored separately as:

  • weights/tp0_sharded_checkpoint.safetensors
  • weights/tp1_sharded_checkpoint.safetensors
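The naming pattern above suggests one file per tensor-parallel rank. As a minimal sketch, assuming the pattern extends to higher ranks in the same way (the helper name `shard_paths` and its parameters are hypothetical, not part of any Neuron tooling), the expected file names can be generated like this:

```python
from pathlib import PurePosixPath


def shard_paths(tp_degree: int, weights_dir: str = "weights") -> list[str]:
    """Return the per-rank checkpoint paths for a given tensor-parallel degree.

    Assumption: every rank follows the tp{rank}_sharded_checkpoint.safetensors
    pattern shown for ranks 0 and 1.
    """
    if tp_degree < 1:
        raise ValueError("tp_degree must be at least 1")
    return [
        str(PurePosixPath(weights_dir) / f"tp{rank}_sharded_checkpoint.safetensors")
        for rank in range(tp_degree)
    ]


# The two files listed above correspond to a tensor-parallel degree of 2.
print(shard_paths(2))
```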

To extract weights to safetensors format, you would need to:

  1. Load the model using optimum-neuron
  2. Extract the state_dict
  3. Convert to safetensors format
  4. Shard by tensor parallel rank

This is currently not straightforward for compiled Neuron models, because the weights are embedded in the compiled artifact rather than exposed as a regular state_dict.
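Step 4 can be illustrated on toy data. The sketch below is not the Neuron tooling's sharding logic: it assumes every weight is split evenly along its first axis, whereas real tensor-parallel layouts shard some layers along other axes and replicate others. Nested lists stand in for tensors so the example runs without extra dependencies, and the names `shard_rows`, `shard_state_dict`, and `proj.weight` are hypothetical.

```python
def shard_rows(matrix, tp_degree):
    """Split a 2D weight (a list of rows) into tp_degree equal row blocks."""
    rows = len(matrix)
    if rows % tp_degree != 0:
        raise ValueError(f"{rows} rows cannot be split evenly across {tp_degree} ranks")
    block = rows // tp_degree
    return [matrix[rank * block:(rank + 1) * block] for rank in range(tp_degree)]


def shard_state_dict(state_dict, tp_degree):
    """Return one dict per rank, each holding that rank's slice of every weight."""
    shards = [{} for _ in range(tp_degree)]
    for name, weight in state_dict.items():
        for rank, part in enumerate(shard_rows(weight, tp_degree)):
            shards[rank][name] = part
    return shards


# Toy state_dict: one 4x2 weight, split across two ranks.
toy_state_dict = {"proj.weight": [[1, 2], [3, 4], [5, 6], [7, 8]]}
shards = shard_state_dict(toy_state_dict, tp_degree=2)

for rank, shard in enumerate(shards):
    print(f"tp{rank}:", shard)
```

With real tensors, each per-rank dict would then be written to its own file, for example with `save_file` from `safetensors.torch`. Whether a usable state_dict can be recovered from the compiled model.pt in the first place remains the open problem described above.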

Current Structure

The model.pt file contains:

  • Compiled graphs (NEFF format)
  • Model weights (optimized for Neuron)
  • Runtime metadata

The separate directories contain:

  • context_encoding_model/: NEFF files for context encoding
  • token_generation_model/: NEFF files for token generation
  • layout_opt/: Layout optimization artifacts
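Before loading, it can help to confirm that a local copy matches this layout. The following is a minimal sketch using only the standard library. It assumes model.pt and the three directories sit at the top level of the model folder, and `missing_entries` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# Entries named in the structure described above.
EXPECTED_FILES = ["model.pt"]
EXPECTED_DIRS = ["context_encoding_model", "token_generation_model", "layout_opt"]


def missing_entries(model_dir):
    """Return the expected files and directories that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Demo on a temporary directory that contains only part of the layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.pt").touch()
    (Path(tmp) / "layout_opt").mkdir()
    print(missing_entries(tmp))
```

An empty list means every expected entry is present. It does not verify that the NEFF files inside the directories are valid.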

Usage

Load this model using:

from optimum.neuron import NeuronModelForCausalLM

# "path/to/model" is a placeholder for the directory that contains model.pt
model = NeuronModelForCausalLM.from_pretrained("path/to/model")