How to use donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7 with the libraries and local apps listed below.
- Libraries
- Transformers
How to use donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7")
    model = AutoModelForCausalLM.from_pretrained("donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7")
- Local Apps
- vLLM
How to use donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
    docker model run hf.co/donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7
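The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes a server is already running on the default port shown above and returns the usual OpenAI-style `choices[0].text` field.

```python
import json
import urllib.request

MODEL_ID = "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call (needs a running server):
#   print(complete("Once upon a time,"))
```

Passing `base_url="http://localhost:30000"` points the same helper at a server listening on port 30000 instead.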
- SGLang
How to use donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7 with Docker Model Runner:
docker model run hf.co/donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7
GSM8K-Binary_Llama-3.2-1B-rtd7v6w7
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set (the values match the epoch 6, step 150 row of the training results table):
- Loss: 0.8309
- Model Preparation Time: 0.0055
- Mdl: 2966.8667
- Accumulated Loss: 2056.4753
- Correct Preds: 1984.0
- Total Preds: 2475.0
- Accuracy: 0.8016
- Correct Gen Preds: 1758.0
- Gen Accuracy: 0.7103
- Correct Gen Preds 34192: 869.0
- Correct Preds 34192: 1016.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.8495
- Gen Accuracy 34192: 0.7266
- Correct Gen Preds 41568: 881.0
- Correct Preds 41568: 968.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7640
- Gen Accuracy 41568: 0.6953
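The card does not define these metrics, but the reported numbers are consistent with a few simple relationships: each accuracy is a correct count divided by its total, Accumulated Loss is roughly Loss times Total Preds, and Mdl is roughly Accumulated Loss divided by ln 2 (that is, the loss expressed in bits). The sketch below checks this arithmetic; the interpretation is an inference from the numbers, not something the card states.

```python
import math

# Values copied from the evaluation summary above.
loss = 0.8309
accumulated_loss = 2056.4753
mdl = 2966.8667
total_preds = 2475
correct_preds = 1984
correct_gen_preds = 1758
per_label = {
    34192: {"correct": 1016, "correct_gen": 869, "total": 1196},
    41568: {"correct": 968, "correct_gen": 881, "total": 1267},
}

# Overall accuracies are correct counts over the total number of predictions.
accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds
print(round(accuracy, 4), round(gen_accuracy, 4))  # 0.8016 0.7103

# Per-label accuracies divide by that label's own total.
for label, counts in per_label.items():
    label_acc = counts["correct"] / counts["total"]
    label_gen_acc = counts["correct_gen"] / counts["total"]
    print(label, round(label_acc, 4), round(label_gen_acc, 4))

# Assumed relationships (inferred, not documented in the card):
loss_sum_estimate = loss * total_preds          # close to Accumulated Loss
bits_estimate = accumulated_loss / math.log(2)  # close to Mdl
print(round(loss_sum_estimate, 2), round(bits_estimate, 4))
```

The two per-label totals sum to 2463, which is 12 fewer than the 2475 total predictions; the card does not explain the difference.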
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
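As a rough guide, the hyperparameters above map onto Transformers `TrainingArguments` fields as sketched below. This is a reconstruction, not the author's training script: `output_dir` is a hypothetical path, the mapping of `train_batch_size` to a per-device size assumes a single device, and `build_training_args` is a hypothetical helper. Any setting not listed above is left at the library default.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
TRAINING_KWARGS = {
    "output_dir": "outputs",            # hypothetical path, not from the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes training on a single device
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.01,
    "num_train_epochs": 100,
}


def build_training_args():
    """Create TrainingArguments from the settings above (needs transformers)."""
    # Imported lazily so the settings can be inspected without the library.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```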
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0055 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7154 | 1.0 | 25 | 0.6989 | 0.0055 | 2495.4759 | 1729.7321 | 1514.0 | 2475.0 | 0.6117 | 949.0 | 0.3834 | 661.0 | 1174.0 | 1196.0 | 0.9816 | 0.5527 | 280.0 | 340.0 | 1267.0 | 0.2684 | 0.2210 |
| 0.9619 | 2.0 | 50 | 0.7183 | 0.0055 | 2564.7178 | 1777.7269 | 1557.0 | 2475.0 | 0.6291 | 23.0 | 0.0093 | 0.0 | 338.0 | 1196.0 | 0.2826 | 0.0 | 15.0 | 1219.0 | 1267.0 | 0.9621 | 0.0118 |
| 0.7749 | 3.0 | 75 | 0.5842 | 0.0055 | 2086.0604 | 1445.9469 | 1857.0 | 2475.0 | 0.7503 | 260.0 | 0.1051 | 0.0 | 718.0 | 1196.0 | 0.6003 | 0.0 | 252.0 | 1139.0 | 1267.0 | 0.8990 | 0.1989 |
| 0.3418 | 4.0 | 100 | 1.0916 | 0.0055 | 3897.6253 | 2701.6280 | 1690.0 | 2475.0 | 0.6828 | 379.0 | 0.1531 | 272.0 | 1174.0 | 1196.0 | 0.9816 | 0.2274 | 99.0 | 516.0 | 1267.0 | 0.4073 | 0.0781 |
| 0.5389 | 5.0 | 125 | 0.7000 | 0.0055 | 2499.5846 | 1732.5800 | 1936.0 | 2475.0 | 0.7822 | 353.0 | 0.1426 | 115.0 | 1040.0 | 1196.0 | 0.8696 | 0.0962 | 231.0 | 896.0 | 1267.0 | 0.7072 | 0.1823 |
| 0.4228 | 6.0 | 150 | 0.8309 | 0.0055 | 2966.8667 | 2056.4753 | 1984.0 | 2475.0 | 0.8016 | 1758.0 | 0.7103 | 869.0 | 1016.0 | 1196.0 | 0.8495 | 0.7266 | 881.0 | 968.0 | 1267.0 | 0.7640 | 0.6953 |
| 0.0038 | 7.0 | 175 | 1.0896 | 0.0055 | 3890.7175 | 2696.8399 | 1907.0 | 2475.0 | 0.7705 | 1087.0 | 0.4392 | 596.0 | 1096.0 | 1196.0 | 0.9164 | 0.4983 | 484.0 | 811.0 | 1267.0 | 0.6401 | 0.3820 |
| 0.0096 | 8.0 | 200 | 1.1357 | 0.0055 | 4055.3565 | 2810.9589 | 1951.0 | 2475.0 | 0.7883 | 1898.0 | 0.7669 | 963.0 | 987.0 | 1196.0 | 0.8253 | 0.8052 | 927.0 | 964.0 | 1267.0 | 0.7609 | 0.7316 |
| 0.3927 | 9.0 | 225 | 1.4010 | 0.0055 | 5002.5226 | 3467.4844 | 1976.0 | 2475.0 | 0.7984 | 1937.0 | 0.7826 | 1017.0 | 1046.0 | 1196.0 | 0.8746 | 0.8503 | 913.0 | 930.0 | 1267.0 | 0.7340 | 0.7206 |
| 0.0001 | 10.0 | 250 | 1.2540 | 0.0055 | 4477.7630 | 3103.7488 | 1983.0 | 2475.0 | 0.8012 | 1946.0 | 0.7863 | 964.0 | 993.0 | 1196.0 | 0.8303 | 0.8060 | 975.0 | 990.0 | 1267.0 | 0.7814 | 0.7695 |
| 0.3922 | 11.0 | 275 | 1.3906 | 0.0055 | 4965.3047 | 3441.6870 | 1964.0 | 2475.0 | 0.7935 | 1932.0 | 0.7806 | 1029.0 | 1047.0 | 1196.0 | 0.8754 | 0.8604 | 896.0 | 917.0 | 1267.0 | 0.7238 | 0.7072 |
| 0.0 | 12.0 | 300 | 1.4206 | 0.0055 | 5072.5476 | 3516.0220 | 1966.0 | 2475.0 | 0.7943 | 1947.0 | 0.7867 | 1030.0 | 1046.0 | 1196.0 | 0.8746 | 0.8612 | 910.0 | 920.0 | 1267.0 | 0.7261 | 0.7182 |
| 0.3921 | 13.0 | 325 | 1.4252 | 0.0055 | 5089.0899 | 3527.4883 | 1967.0 | 2475.0 | 0.7947 | 1954.0 | 0.7895 | 1029.0 | 1041.0 | 1196.0 | 0.8704 | 0.8604 | 918.0 | 926.0 | 1267.0 | 0.7309 | 0.7245 |
| 0.0 | 14.0 | 350 | 1.4296 | 0.0055 | 5104.8061 | 3538.3819 | 1968.0 | 2475.0 | 0.7952 | 1956.0 | 0.7903 | 1027.0 | 1040.0 | 1196.0 | 0.8696 | 0.8587 | 922.0 | 928.0 | 1267.0 | 0.7324 | 0.7277 |
| 0.3921 | 15.0 | 375 | 1.4327 | 0.0055 | 5115.6967 | 3545.9308 | 1973.0 | 2475.0 | 0.7972 | 1964.0 | 0.7935 | 1029.0 | 1040.0 | 1196.0 | 0.8696 | 0.8604 | 928.0 | 933.0 | 1267.0 | 0.7364 | 0.7324 |
| 0.3921 | 16.0 | 400 | 1.4372 | 0.0055 | 5131.6445 | 3556.9849 | 1972.0 | 2475.0 | 0.7968 | 1961.0 | 0.7923 | 1028.0 | 1039.0 | 1196.0 | 0.8687 | 0.8595 | 926.0 | 933.0 | 1267.0 | 0.7364 | 0.7309 |
| 0.0 | 17.0 | 425 | 1.4412 | 0.0055 | 5146.1676 | 3567.0516 | 1972.0 | 2475.0 | 0.7968 | 1961.0 | 0.7923 | 1027.0 | 1038.0 | 1196.0 | 0.8679 | 0.8587 | 927.0 | 934.0 | 1267.0 | 0.7372 | 0.7316 |
| 0.7841 | 18.0 | 450 | 1.4461 | 0.0055 | 5163.7167 | 3579.2156 | 1970.0 | 2475.0 | 0.7960 | 1962.0 | 0.7927 | 1026.0 | 1037.0 | 1196.0 | 0.8671 | 0.8579 | 929.0 | 933.0 | 1267.0 | 0.7364 | 0.7332 |
| 0.0 | 19.0 | 475 | 1.4499 | 0.0055 | 5177.0033 | 3588.4253 | 1972.0 | 2475.0 | 0.7968 | 1966.0 | 0.7943 | 1027.0 | 1036.0 | 1196.0 | 0.8662 | 0.8587 | 932.0 | 936.0 | 1267.0 | 0.7388 | 0.7356 |
| 0.0 | 20.0 | 500 | 1.4508 | 0.0055 | 5180.3062 | 3590.7146 | 1971.0 | 2475.0 | 0.7964 | 1964.0 | 0.7935 | 1025.0 | 1035.0 | 1196.0 | 0.8654 | 0.8570 | 932.0 | 936.0 | 1267.0 | 0.7388 | 0.7356 |
| 0.3921 | 21.0 | 525 | 1.4534 | 0.0055 | 5189.7817 | 3597.2826 | 1974.0 | 2475.0 | 0.7976 | 1969.0 | 0.7956 | 1029.0 | 1038.0 | 1196.0 | 0.8679 | 0.8604 | 933.0 | 936.0 | 1267.0 | 0.7388 | 0.7364 |
| 0.0 | 22.0 | 550 | 1.4580 | 0.0055 | 5206.0889 | 3608.5858 | 1971.0 | 2475.0 | 0.7964 | 1964.0 | 0.7935 | 1028.0 | 1038.0 | 1196.0 | 0.8679 | 0.8595 | 929.0 | 933.0 | 1267.0 | 0.7364 | 0.7332 |
| 0.0 | 23.0 | 575 | 1.4600 | 0.0055 | 5213.0440 | 3613.4067 | 1975.0 | 2475.0 | 0.7980 | 1968.0 | 0.7952 | 1027.0 | 1037.0 | 1196.0 | 0.8671 | 0.8587 | 934.0 | 938.0 | 1267.0 | 0.7403 | 0.7372 |
| 0.0 | 24.0 | 600 | 1.4608 | 0.0055 | 5216.0428 | 3615.4854 | 1975.0 | 2475.0 | 0.7980 | 1970.0 | 0.7960 | 1027.0 | 1036.0 | 1196.0 | 0.8662 | 0.8587 | 936.0 | 939.0 | 1267.0 | 0.7411 | 0.7388 |
| 0.0 | 25.0 | 625 | 1.4642 | 0.0055 | 5228.0668 | 3623.8198 | 1973.0 | 2475.0 | 0.7972 | 1967.0 | 0.7947 | 1028.0 | 1037.0 | 1196.0 | 0.8671 | 0.8595 | 932.0 | 936.0 | 1267.0 | 0.7388 | 0.7356 |
| 0.0 | 26.0 | 650 | 1.4671 | 0.0055 | 5238.4672 | 3631.0288 | 1973.0 | 2475.0 | 0.7972 | 1969.0 | 0.7956 | 1025.0 | 1034.0 | 1196.0 | 0.8645 | 0.8570 | 937.0 | 939.0 | 1267.0 | 0.7411 | 0.7395 |
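One way to read the table is to ask which epoch is best under each metric. The sketch below copies the Epoch, Validation Loss, Accuracy, and Gen Accuracy columns and picks the best row for each; the helper `best_epoch` is hypothetical, and the card does not say which criterion was used to select the published checkpoint.

```python
# (epoch, validation_loss, accuracy, gen_accuracy), copied from the table above.
ROWS = [
    (0, 1.4656, 0.4832, 0.4865), (1, 0.6989, 0.6117, 0.3834),
    (2, 0.7183, 0.6291, 0.0093), (3, 0.5842, 0.7503, 0.1051),
    (4, 1.0916, 0.6828, 0.1531), (5, 0.7000, 0.7822, 0.1426),
    (6, 0.8309, 0.8016, 0.7103), (7, 1.0896, 0.7705, 0.4392),
    (8, 1.1357, 0.7883, 0.7669), (9, 1.4010, 0.7984, 0.7826),
    (10, 1.2540, 0.8012, 0.7863), (11, 1.3906, 0.7935, 0.7806),
    (12, 1.4206, 0.7943, 0.7867), (13, 1.4252, 0.7947, 0.7895),
    (14, 1.4296, 0.7952, 0.7903), (15, 1.4327, 0.7972, 0.7935),
    (16, 1.4372, 0.7968, 0.7923), (17, 1.4412, 0.7968, 0.7923),
    (18, 1.4461, 0.7960, 0.7927), (19, 1.4499, 0.7968, 0.7943),
    (20, 1.4508, 0.7964, 0.7935), (21, 1.4534, 0.7976, 0.7956),
    (22, 1.4580, 0.7964, 0.7935), (23, 1.4600, 0.7980, 0.7952),
    (24, 1.4608, 0.7980, 0.7960), (25, 1.4642, 0.7972, 0.7947),
    (26, 1.4671, 0.7972, 0.7956),
]

COLUMNS = {"validation_loss": 1, "accuracy": 2, "gen_accuracy": 3}


def best_epoch(metric, lower_is_better=False):
    """Return the epoch with the best value in the given column."""
    index = COLUMNS[metric]
    pick = min if lower_is_better else max
    return pick(ROWS, key=lambda row: row[index])[0]


print(best_epoch("accuracy"))                               # 6
print(best_epoch("validation_loss", lower_is_better=True))  # 3
print(best_epoch("gen_accuracy"))                           # 24
```

The three criteria disagree: accuracy peaks at epoch 6 (the row reported in the summary), validation loss is lowest at epoch 3, and generation accuracy is highest at epoch 24, while validation loss keeps rising after epoch 10.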
Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for donoway/GSM8K-Binary_Llama-3.2-1B-rtd7v6w7
- Base model: meta-llama/Llama-3.2-1B