# BoolQ_Llama-3.2-1B-eszatdiq
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on the BoolQ dataset (per the model name; the 3270 evaluation examples match the BoolQ validation split). It achieves the following results on the evaluation set:
- Loss: 2.5571
- Model Preparation Time: 0.0056
- Mdl: 12063.3522
- Accumulated Loss: 8361.6786
- Correct Preds: 2256.0
- Total Preds: 3270.0
- Accuracy: 0.6899
- Correct Gen Preds: 2181.0
- Gen Accuracy: 0.6670
- Correct Gen Preds (label 9642): 1467.0
- Correct Preds (label 9642): 1519.0
- Total Labels (label 9642): 2026.0
- Accuracy (label 9642): 0.7498
- Gen Accuracy (label 9642): 0.7241
- Correct Gen Preds (label 2822): 706.0
- Correct Preds (label 2822): 737.0
- Total Labels (label 2822): 1231.0
- Accuracy (label 2822): 0.5987
- Gen Accuracy (label 2822): 0.5735
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
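With 3 optimizer steps per epoch (see the results table below) and 100 epochs, the cosine schedule spans 300 steps, and a warmup ratio of 0.01 gives 3 warmup steps. A minimal sketch of the implied learning-rate curve (linear warmup then cosine decay to zero, matching the default `cosine` schedule in `transformers`; the step counts are inferred from this card, not taken from the training script):

```python
import math

base_lr = 2e-05
steps_per_epoch = 3                          # from the results table
total_steps = 100 * steps_per_epoch          # num_epochs * steps/epoch = 300
warmup_steps = int(total_steps * 0.01)       # lr_scheduler_warmup_ratio = 0.01 -> 3

def lr_at(step: int) -> float:
    """LR after `step` optimizer steps: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

assert warmup_steps == 3
assert abs(lr_at(warmup_steps) - base_lr) < 1e-12   # peak LR right after warmup
assert lr_at(total_steps) < 1e-12                   # decayed to ~0 at step 300
```

Note that the results table only runs to epoch 36 (step 108), so training appears to have stopped well before the schedule completed.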
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0056 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.8982 | 1.0 | 3 | 0.8052 | 0.0056 | 3798.5080 | 2632.9251 | 1559.0 | 3270.0 | 0.4768 | 1467.0 | 0.4486 | 337.0 | 372.0 | 2026.0 | 0.1836 | 0.1663 | 1121.0 | 1187.0 | 1231.0 | 0.9643 | 0.9106 |
| 0.3133 | 2.0 | 6 | 0.6938 | 0.0056 | 3273.0467 | 2268.7031 | 2128.0 | 3270.0 | 0.6508 | 1815.0 | 0.5550 | 1609.0 | 1865.0 | 2026.0 | 0.9205 | 0.7942 | 197.0 | 263.0 | 1231.0 | 0.2136 | 0.1600 |
| 0.0233 | 3.0 | 9 | 0.7795 | 0.0056 | 3677.5836 | 2549.1067 | 2216.0 | 3270.0 | 0.6777 | 2161.0 | 0.6609 | 1362.0 | 1401.0 | 2026.0 | 0.6915 | 0.6723 | 790.0 | 815.0 | 1231.0 | 0.6621 | 0.6418 |
| 0.0001 | 4.0 | 12 | 2.6272 | 0.0056 | 12394.1502 | 8590.9703 | 2192.0 | 3270.0 | 0.6703 | 2195.0 | 0.6713 | 1973.0 | 1977.0 | 2026.0 | 0.9758 | 0.9738 | 214.0 | 215.0 | 1231.0 | 0.1747 | 0.1738 |
| 0.0025 | 5.0 | 15 | 2.5778 | 0.0056 | 12161.0922 | 8429.4268 | 2237.0 | 3270.0 | 0.6841 | 2227.0 | 0.6810 | 1771.0 | 1782.0 | 2026.0 | 0.8796 | 0.8741 | 448.0 | 455.0 | 1231.0 | 0.3696 | 0.3639 |
| 0.0 | 6.0 | 18 | 2.5571 | 0.0056 | 12063.3522 | 8361.6786 | 2256.0 | 3270.0 | 0.6899 | 2181.0 | 0.6670 | 1467.0 | 1519.0 | 2026.0 | 0.7498 | 0.7241 | 706.0 | 737.0 | 1231.0 | 0.5987 | 0.5735 |
| 0.0 | 7.0 | 21 | 2.6065 | 0.0056 | 12296.3865 | 8523.2057 | 2192.0 | 3270.0 | 0.6703 | 1996.0 | 0.6104 | 1273.0 | 1402.0 | 2026.0 | 0.6920 | 0.6283 | 715.0 | 790.0 | 1231.0 | 0.6418 | 0.5808 |
| 0.0001 | 8.0 | 24 | 2.6148 | 0.0056 | 12335.8294 | 8550.5454 | 2175.0 | 3270.0 | 0.6651 | 1910.0 | 0.5841 | 1210.0 | 1395.0 | 2026.0 | 0.6885 | 0.5972 | 692.0 | 780.0 | 1231.0 | 0.6336 | 0.5621 |
| 0.0 | 9.0 | 27 | 2.6483 | 0.0056 | 12493.8025 | 8660.0440 | 2170.0 | 3270.0 | 0.6636 | 1920.0 | 0.5872 | 1220.0 | 1396.0 | 2026.0 | 0.6890 | 0.6022 | 691.0 | 774.0 | 1231.0 | 0.6288 | 0.5613 |
| 0.0001 | 10.0 | 30 | 2.6828 | 0.0056 | 12656.5201 | 8772.8312 | 2177.0 | 3270.0 | 0.6657 | 1963.0 | 0.6003 | 1255.0 | 1400.0 | 2026.0 | 0.6910 | 0.6194 | 700.0 | 777.0 | 1231.0 | 0.6312 | 0.5686 |
| 0.0001 | 11.0 | 33 | 2.7214 | 0.0056 | 12838.5669 | 8899.0164 | 2171.0 | 3270.0 | 0.6639 | 2013.0 | 0.6156 | 1279.0 | 1393.0 | 2026.0 | 0.6876 | 0.6313 | 725.0 | 778.0 | 1231.0 | 0.6320 | 0.5890 |
| 0.0 | 12.0 | 36 | 2.7415 | 0.0056 | 12933.2785 | 8964.6655 | 2169.0 | 3270.0 | 0.6633 | 2035.0 | 0.6223 | 1301.0 | 1393.0 | 2026.0 | 0.6876 | 0.6422 | 726.0 | 776.0 | 1231.0 | 0.6304 | 0.5898 |
| 0.0 | 13.0 | 39 | 2.7593 | 0.0056 | 13017.3006 | 9022.9052 | 2172.0 | 3270.0 | 0.6642 | 2056.0 | 0.6287 | 1313.0 | 1395.0 | 2026.0 | 0.6885 | 0.6481 | 734.0 | 777.0 | 1231.0 | 0.6312 | 0.5963 |
| 0.0 | 14.0 | 42 | 2.7708 | 0.0056 | 13071.4073 | 9060.4091 | 2167.0 | 3270.0 | 0.6627 | 2066.0 | 0.6318 | 1322.0 | 1393.0 | 2026.0 | 0.6876 | 0.6525 | 736.0 | 774.0 | 1231.0 | 0.6288 | 0.5979 |
| 0.0 | 15.0 | 45 | 2.7767 | 0.0056 | 13099.2616 | 9079.7162 | 2168.0 | 3270.0 | 0.6630 | 2068.0 | 0.6324 | 1320.0 | 1392.0 | 2026.0 | 0.6871 | 0.6515 | 740.0 | 776.0 | 1231.0 | 0.6304 | 0.6011 |
| 0.0 | 16.0 | 48 | 2.7824 | 0.0056 | 13126.3414 | 9098.4865 | 2169.0 | 3270.0 | 0.6633 | 2077.0 | 0.6352 | 1325.0 | 1391.0 | 2026.0 | 0.6866 | 0.6540 | 743.0 | 778.0 | 1231.0 | 0.6320 | 0.6036 |
| 0.0 | 17.0 | 51 | 2.7841 | 0.0056 | 13134.4015 | 9104.0734 | 2165.0 | 3270.0 | 0.6621 | 2078.0 | 0.6355 | 1328.0 | 1392.0 | 2026.0 | 0.6871 | 0.6555 | 742.0 | 773.0 | 1231.0 | 0.6279 | 0.6028 |
| 0.0 | 18.0 | 54 | 2.7872 | 0.0056 | 13148.8380 | 9114.0800 | 2171.0 | 3270.0 | 0.6639 | 2082.0 | 0.6367 | 1331.0 | 1397.0 | 2026.0 | 0.6895 | 0.6570 | 742.0 | 774.0 | 1231.0 | 0.6288 | 0.6028 |
| 0.0 | 19.0 | 57 | 2.7901 | 0.0056 | 13162.4860 | 9123.5401 | 2171.0 | 3270.0 | 0.6639 | 2081.0 | 0.6364 | 1327.0 | 1393.0 | 2026.0 | 0.6876 | 0.6550 | 745.0 | 778.0 | 1231.0 | 0.6320 | 0.6052 |
| 0.0 | 20.0 | 60 | 2.7935 | 0.0056 | 13178.7743 | 9134.8302 | 2172.0 | 3270.0 | 0.6642 | 2083.0 | 0.6370 | 1330.0 | 1398.0 | 2026.0 | 0.6900 | 0.6565 | 745.0 | 774.0 | 1231.0 | 0.6288 | 0.6052 |
| 0.0 | 21.0 | 63 | 2.7929 | 0.0056 | 13175.9740 | 9132.8892 | 2167.0 | 3270.0 | 0.6627 | 2080.0 | 0.6361 | 1328.0 | 1393.0 | 2026.0 | 0.6876 | 0.6555 | 743.0 | 774.0 | 1231.0 | 0.6288 | 0.6036 |
| 0.0 | 22.0 | 66 | 2.7951 | 0.0056 | 13186.0428 | 9139.8684 | 2175.0 | 3270.0 | 0.6651 | 2087.0 | 0.6382 | 1331.0 | 1397.0 | 2026.0 | 0.6895 | 0.6570 | 748.0 | 778.0 | 1231.0 | 0.6320 | 0.6076 |
| 0.0 | 23.0 | 69 | 2.7974 | 0.0056 | 13196.9785 | 9147.4485 | 2171.0 | 3270.0 | 0.6639 | 2089.0 | 0.6388 | 1330.0 | 1394.0 | 2026.0 | 0.6881 | 0.6565 | 751.0 | 777.0 | 1231.0 | 0.6312 | 0.6101 |
| 0.0 | 24.0 | 72 | 2.7988 | 0.0056 | 13203.5576 | 9152.0087 | 2172.0 | 3270.0 | 0.6642 | 2089.0 | 0.6388 | 1333.0 | 1395.0 | 2026.0 | 0.6885 | 0.6579 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 25.0 | 75 | 2.8010 | 0.0056 | 13214.0329 | 9159.2696 | 2172.0 | 3270.0 | 0.6642 | 2093.0 | 0.6401 | 1335.0 | 1396.0 | 2026.0 | 0.6890 | 0.6589 | 749.0 | 776.0 | 1231.0 | 0.6304 | 0.6084 |
| 0.0 | 26.0 | 78 | 2.8012 | 0.0056 | 13214.8892 | 9159.8632 | 2174.0 | 3270.0 | 0.6648 | 2088.0 | 0.6385 | 1332.0 | 1397.0 | 2026.0 | 0.6895 | 0.6575 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 27.0 | 81 | 2.8035 | 0.0056 | 13225.9128 | 9167.5042 | 2172.0 | 3270.0 | 0.6642 | 2092.0 | 0.6398 | 1333.0 | 1394.0 | 2026.0 | 0.6881 | 0.6579 | 751.0 | 778.0 | 1231.0 | 0.6320 | 0.6101 |
| 0.0 | 28.0 | 84 | 2.8045 | 0.0056 | 13230.6764 | 9170.8061 | 2172.0 | 3270.0 | 0.6642 | 2095.0 | 0.6407 | 1337.0 | 1395.0 | 2026.0 | 0.6885 | 0.6599 | 750.0 | 777.0 | 1231.0 | 0.6312 | 0.6093 |
| 0.0 | 29.0 | 87 | 2.8054 | 0.0056 | 13234.8323 | 9173.6867 | 2171.0 | 3270.0 | 0.6639 | 2090.0 | 0.6391 | 1333.0 | 1396.0 | 2026.0 | 0.6890 | 0.6579 | 749.0 | 775.0 | 1231.0 | 0.6296 | 0.6084 |
| 0.0 | 30.0 | 90 | 2.8060 | 0.0056 | 13237.7898 | 9175.7367 | 2175.0 | 3270.0 | 0.6651 | 2094.0 | 0.6404 | 1335.0 | 1396.0 | 2026.0 | 0.6890 | 0.6589 | 751.0 | 779.0 | 1231.0 | 0.6328 | 0.6101 |
| 0.0 | 31.0 | 93 | 2.8078 | 0.0056 | 13246.1557 | 9181.5355 | 2168.0 | 3270.0 | 0.6630 | 2091.0 | 0.6394 | 1335.0 | 1393.0 | 2026.0 | 0.6876 | 0.6589 | 747.0 | 775.0 | 1231.0 | 0.6296 | 0.6068 |
| 0.0 | 32.0 | 96 | 2.8082 | 0.0056 | 13247.9959 | 9182.8110 | 2169.0 | 3270.0 | 0.6633 | 2095.0 | 0.6407 | 1337.0 | 1393.0 | 2026.0 | 0.6876 | 0.6599 | 749.0 | 776.0 | 1231.0 | 0.6304 | 0.6084 |
| 0.0 | 33.0 | 99 | 2.8077 | 0.0056 | 13245.4286 | 9181.0315 | 2173.0 | 3270.0 | 0.6645 | 2100.0 | 0.6422 | 1338.0 | 1396.0 | 2026.0 | 0.6890 | 0.6604 | 753.0 | 777.0 | 1231.0 | 0.6312 | 0.6117 |
| 0.0 | 34.0 | 102 | 2.8115 | 0.0056 | 13263.6309 | 9193.6484 | 2169.0 | 3270.0 | 0.6633 | 2091.0 | 0.6394 | 1333.0 | 1394.0 | 2026.0 | 0.6881 | 0.6579 | 749.0 | 775.0 | 1231.0 | 0.6296 | 0.6084 |
| 0.0 | 35.0 | 105 | 2.8099 | 0.0056 | 13255.9181 | 9188.3022 | 2174.0 | 3270.0 | 0.6648 | 2095.0 | 0.6407 | 1339.0 | 1397.0 | 2026.0 | 0.6895 | 0.6609 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 36.0 | 108 | 2.8103 | 0.0056 | 13258.0305 | 9189.7664 | 2173.0 | 3270.0 | 0.6645 | 2098.0 | 0.6416 | 1339.0 | 1397.0 | 2026.0 | 0.6895 | 0.6609 | 750.0 | 776.0 | 1231.0 | 0.6304 | 0.6093 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
## Model tree for donoway/BoolQ_Llama-3.2-1B-eszatdiq

Base model: meta-llama/Llama-3.2-1B