# Weights Information

This model's weights are bundled inside `model.pt` (17 GB).

In the AWS Neuron reference format, weights are typically stored separately as:
- `weights/tp0_sharded_checkpoint.safetensors`
- `weights/tp1_sharded_checkpoint.safetensors`

To extract weights to safetensors format, you would need to:
1. Load the model using `optimum-neuron`
2. Extract the `state_dict`
3. Convert to safetensors format
4. Shard by tensor parallel rank

This is currently not straightforward for compiled Neuron models, because the
weights are embedded in the compiled format.
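
The sharding step (step 4) can at least be planned without touching the compiled artifact. The following is a minimal sketch, assuming that weights are split evenly along one dimension and that the `tp{rank}_sharded_checkpoint.safetensors` naming shown above extends to any number of ranks. `shard_path` and `shard_bounds` are hypothetical helpers, not part of optimum-neuron or safetensors, and the choice of which dimension to split for each tensor depends on the model and is not covered here.

```python
from pathlib import PurePosixPath


def shard_path(rank: int, weights_dir: str = "weights") -> str:
    """Return the shard file path for one tensor-parallel rank.

    Assumption: the tp{rank}_sharded_checkpoint.safetensors pattern
    extends beyond the two ranks listed in this document.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return str(PurePosixPath(weights_dir) / f"tp{rank}_sharded_checkpoint.safetensors")


def shard_bounds(dim_size: int, tp_degree: int, rank: int) -> tuple[int, int]:
    """Return the [start, stop) slice that `rank` owns along a split dimension.

    Assumption: the dimension divides evenly across ranks.
    """
    if tp_degree <= 0 or not 0 <= rank < tp_degree:
        raise ValueError("rank must satisfy 0 <= rank < tp_degree")
    if dim_size % tp_degree != 0:
        raise ValueError("dim_size must be divisible by tp_degree")
    width = dim_size // tp_degree
    return rank * width, (rank + 1) * width


if __name__ == "__main__":
    # Hypothetical 4096-wide dimension split across two ranks.
    for rank in range(2):
        print(shard_path(rank), shard_bounds(4096, 2, rank))
```

Each rank's slice would then be written to its own file once the weights have been recovered as ordinary tensors, which is the part that the compiled format makes difficult.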

## Current Structure

The `model.pt` file contains:
- Compiled graphs (NEFF format)
- Model weights (optimized for Neuron)
- Runtime metadata

The separate directories contain:
- `context_encoding_model/`: NEFF files for context encoding
- `token_generation_model/`: NEFF files for token generation
- `layout_opt/`: Layout optimization artifacts
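
Before loading, it can help to confirm that these artifacts are actually present. The check below is a minimal sketch using only the standard library. It assumes that `model.pt` and the three directories sit side by side in one model directory, which this document does not state explicitly, and `missing_artifacts` is a hypothetical helper name.

```python
from pathlib import Path

# Names taken from this document; the flat layout is an assumption.
EXPECTED_FILES = ("model.pt",)
EXPECTED_DIRS = ("context_encoding_model", "token_generation_model", "layout_opt")


def missing_artifacts(model_dir: str) -> list[str]:
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [f"{name}/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # "path/to/model" is a placeholder for the real model directory.
    print(missing_artifacts("path/to/model"))
```

An empty list means every entry named above was found. The check says nothing about whether the files themselves are valid.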

## Usage

Load this model using:
```python
from optimum.neuron import NeuronModelForCausalLM

# "path/to/model" is a placeholder; point it at the directory holding model.pt.
model = NeuronModelForCausalLM.from_pretrained("path/to/model")
```