ChessLM_Qwen3_Trainium_AWS_Format / WEIGHTS_README.md

Weights Information

This model's weights are bundled inside model.pt (17 GB).

In the AWS Neuron reference format, weights are typically stored separately, one sharded file per tensor-parallel rank:

weights/tp0_sharded_checkpoint.safetensors
weights/tp1_sharded_checkpoint.safetensors
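
The naming pattern above can be generated for any tensor-parallel degree. This is a minimal sketch, assuming the `tp{rank}_sharded_checkpoint.safetensors` pattern continues unchanged for higher ranks (only ranks 0 and 1 are shown here); the helper name `expected_shard_paths` is hypothetical.

```python
from pathlib import Path


def expected_shard_paths(model_dir, tp_degree):
    """Return the per-rank shard paths the reference layout would use.

    Assumption: the naming pattern shown for ranks 0 and 1 extends to
    every rank below tp_degree.
    """
    weights_dir = Path(model_dir) / "weights"
    return [
        weights_dir / f"tp{rank}_sharded_checkpoint.safetensors"
        for rank in range(tp_degree)
    ]


if __name__ == "__main__":
    # "path/to/model" is a placeholder, as in the Usage section.
    for path in expected_shard_paths("path/to/model", tp_degree=2):
        print(path)
```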

Extracting the weights into that safetensors layout would require the following steps:

Load the model using optimum-neuron
Extract the state_dict
Convert to safetensors format
Shard by tensor parallel rank

This is not currently straightforward for compiled Neuron models, because the weights are embedded in the compiled format.
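
The "shard by tensor parallel rank" step amounts to slicing each tensor along one dimension, one slice per rank. Below is a minimal sketch of that bookkeeping in plain Python, working on shapes only. It assumes an even split along a single dimension; which dimension applies to which layer, and whether some tensors are replicated instead of split, depends on the model's parallel layout, which is not specified here. The function name `shard_ranges` and the example shape are hypothetical.

```python
def shard_ranges(shape, split_dim, tp_degree):
    """Return, for each rank, the (start, stop) range along split_dim.

    Assumption: the dimension divides evenly across ranks.
    """
    size = shape[split_dim]
    if size % tp_degree != 0:
        raise ValueError(
            f"dimension {split_dim} of size {size} is not divisible by {tp_degree}"
        )
    step = size // tp_degree
    return [(rank * step, (rank + 1) * step) for rank in range(tp_degree)]


if __name__ == "__main__":
    # Hypothetical weight of shape (4096, 1024), split on dim 0 across 2 ranks.
    for rank, (start, stop) in enumerate(shard_ranges((4096, 1024), 0, 2)):
        print(f"tp{rank}: indices {start} to {stop - 1}")
```

With real tensors, each range would be used to slice the tensor before writing that rank's file.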

Current Structure

The model.pt file contains:

Compiled graphs (NEFF format)
Model weights (optimized for Neuron)
Runtime metadata

The separate directories contain:

context_encoding_model/: NEFF files for context encoding
token_generation_model/: NEFF files for token generation
layout_opt/: Layout optimization artifacts
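
Before loading, it can help to confirm that a local copy has the structure described above. This is a minimal sketch using only the standard library; the file and directory names come from this section, while the helper name `missing_entries` is hypothetical and the check looks only at presence, not at file contents.

```python
from pathlib import Path

# Names taken from the structure described in this section.
EXPECTED_FILES = ["model.pt"]
EXPECTED_DIRS = ["context_encoding_model", "token_generation_model", "layout_opt"]


def missing_entries(model_dir):
    """Return the expected files and directories absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # "path/to/model" is a placeholder, as in the Usage section.
    problems = missing_entries("path/to/model")
    print("layout OK" if not problems else "missing: " + ", ".join(problems))
```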

Usage

Load this model using:

from optimum.neuron import NeuronModelForCausalLM
model = NeuronModelForCausalLM.from_pretrained("path/to/model")