# Weights Information

This model contains weights bundled within `model.pt` (17 GB).

In the AWS Neuron reference format, weights are typically stored separately as:

- `weights/tp0_sharded_checkpoint.safetensors`
- `weights/tp1_sharded_checkpoint.safetensors`

To extract weights to safetensors format, you would need to:

1. Load the model using optimum-neuron
2. Extract the state_dict
3. Convert to safetensors format
4. Shard by tensor parallel rank

This is currently not straightforward for compiled Neuron models as the weights are embedded in the compiled format.

## Current Structure

The `model.pt` file contains:

- Compiled graphs (NEFF format)
- Model weights (optimized for Neuron)
- Runtime metadata

The separate directories contain:

- `context_encoding_model/`: NEFF files for context encoding
- `token_generation_model/`: NEFF files for token generation
- `layout_opt/`: Layout optimization artifacts

## Usage

Load this model using:

```python
from optimum.neuron import NeuronModelForCausalLM

model = NeuronModelForCausalLM.from_pretrained("path/to/model")
```
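
Before loading, it can help to confirm that a local copy matches the layout described under Current Structure. The following is a minimal sketch using only the Python standard library. The function names `inspect_model_dir` and `reference_shard_paths` are hypothetical helpers written for this README, not part of optimum-neuron, and the default tensor parallel degree of 2 is an assumption based on the two shard files (`tp0`, `tp1`) listed above.

```python
from pathlib import Path
from typing import Dict, List

# Names taken from this README. Whether every export contains
# all of them is an assumption.
EXPECTED_FILES = ["model.pt"]
EXPECTED_DIRS = ["context_encoding_model", "token_generation_model", "layout_opt"]


def reference_shard_paths(tp_degree: int) -> List[str]:
    """Relative paths of the per-rank shards in the reference layout."""
    return [
        f"weights/tp{rank}_sharded_checkpoint.safetensors"
        for rank in range(tp_degree)
    ]


def inspect_model_dir(model_dir: str, tp_degree: int = 2) -> Dict[str, bool]:
    """Report which expected files and directories exist under model_dir."""
    root = Path(model_dir)
    report: Dict[str, bool] = {}
    for name in EXPECTED_FILES:
        report[name] = (root / name).is_file()
    for name in EXPECTED_DIRS:
        report[name + "/"] = (root / name).is_dir()
    # Separate shards are only present in the reference layout,
    # so "missing" is expected here for a bundled model.pt export.
    for rel in reference_shard_paths(tp_degree):
        report[rel] = (root / rel).is_file()
    return report


if __name__ == "__main__":
    for entry, present in inspect_model_dir("path/to/model").items():
        status = "found  " if present else "missing"
        print(f"{status}  {entry}")
```

For this model, the shard entries should be reported as missing, since the weights are bundled in `model.pt` rather than stored under `weights/`.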