---
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- chess
- neuron
- aws-trainium
- vllm
- optimum-neuron
- continuous-batching
base_model: karanps/ChessLM_Qwen3
---

# ChessLM Qwen3 - Neuron Traced (AWS Format Structure)

This is a Neuron-traced version of [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), optimized for AWS Trainium (trn2) instances using vLLM. The repository follows the AWS Neuron structure, with separate directories for the compiled artifacts.

This model is intended to be used from within the [Neuron Workshop](https://github.com/aws-neuron/neuron-workshops).

## Model Details

- **Base Model**: Qwen3-8B fine-tuned for chess
- **Compilation**: optimum-neuron[vllm]==0.3.0
- **Compiler Version**: neuronxcc 2.21.33363.0
- **Target Hardware**: AWS Trainium2 (trn2)
- **Precision**: BF16
- **Tensor Parallelism**: 2 cores
- **Batch Size**: 4 (continuous batching enabled)
- **Max Sequence Length**: 2048

## Compilation Instructions

```
optimum-cli export neuron \
  --model karanps/ChessLM_Qwen3 \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled
```

### Key Files

- **context_encoding_model/**: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
- **token_generation_model/**: Compiled NEFF files for autoregressive token generation
- **layout_opt/**: Layout optimization artifacts from compilation
- **model.pt**: Main model file containing compiled graphs and embedded weights (17 GB)
- **neuron_config.json**: Neuron compilation configuration

## Model Files

| File | Purpose |
|------|---------|
| model.pt | Main model with embedded weights (17 GB) |
| config.json | Base model configuration |
| neuron_config.json | Neuron compilation settings |
| tokenizer* | Tokenizer files for text processing |

## License

This model inherits its license from the base model, [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3).
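
## Usage

Because the model was compiled with batch size 4, sequence length 2048, and tensor parallelism of 2, the serving configuration must match those values. Below is a minimal offline-inference sketch using vLLM's `LLM` API. It assumes the compiled artifacts sit at the local path produced by the compilation command above, and that `optimum-neuron[vllm]==0.3.0` is installed so vLLM can target Neuron (exact device/plugin discovery may vary across vLLM versions). The chess prompt is a hypothetical example; consult the base model card for the expected move-notation format.

```python
from vllm import LLM, SamplingParams

# Assumed local path: the output directory of the compilation command above.
MODEL_PATH = "/home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled"

# These values must match the compiled configuration: batch size 4,
# max sequence length 2048, tensor parallelism across 2 Neuron cores.
llm = LLM(
    model=MODEL_PATH,
    max_num_seqs=4,
    max_model_len=2048,
    tensor_parallel_size=2,
)

# Hypothetical chess prompt in SAN move-list form.
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=32)
outputs = llm.generate(["1. e4 e5 2. Nf3 Nc6 3."], sampling_params)

for output in outputs:
    print(output.outputs[0].text)
```

Continuous batching is handled by vLLM's scheduler: with `max_num_seqs=4`, up to four sequences are decoded concurrently, matching the compiled batch size.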
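
Alternatively, the compiled model can be loaded directly through optimum-neuron's `NeuronModelForCausalLM`, which should pick up `neuron_config.json` and the precompiled graphs rather than recompiling. This is likewise a sketch under the same path and prompt assumptions as above.

```python
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

# Assumed local path: the output directory of the compilation command above.
MODEL_PATH = "/home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled"

model = NeuronModelForCausalLM.from_pretrained(MODEL_PATH)
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Hypothetical chess prompt; inputs beyond 2048 tokens exceed the compiled length.
inputs = tokenizer("1. e4 e5 2. Nf3 Nc6 3.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```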