BoolQ_Llama-3.2-1B-eszatdiq

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5571
  • Model Preparation Time: 0.0056
  • Mdl: 12063.3522
  • Accumulated Loss: 8361.6786
  • Correct Preds: 2256.0
  • Total Preds: 3270.0
  • Accuracy: 0.6899
  • Correct Gen Preds: 2181.0
  • Gen Accuracy: 0.6670
  • Correct Gen Preds 9642: 1467.0
  • Correct Preds 9642: 1519.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.7498
  • Gen Accuracy 9642: 0.7241
  • Correct Gen Preds 2822: 706.0
  • Correct Preds 2822: 737.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.5987
  • Gen Accuracy 2822: 0.5735
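
The metrics suffixed with 9642 and 2822 appear to be broken down per answer-label token id; the card does not say which id maps to which answer. As a minimal usage sketch, the checkpoint can presumably be loaded with `transformers` and scored on a BoolQ-style yes/no question by comparing the logits of the two candidate answer tokens. The prompt template and the " Yes"/" No" answer strings below are illustrative assumptions, not taken from the card:

```python
# Minimal usage sketch. The repo id is the one shown for this card; the prompt
# format and candidate answer strings are assumptions, since the card only
# reports label token ids 9642 and 2822.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-eszatdiq"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

passage = "The Llama models are released by Meta."
question = "were the llama models released by meta"
prompt = f"{passage}\nQuestion: {question}?\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]  # logits for the next token

# Score the two candidate answers by their first-token logits and pick the larger.
candidates = [" Yes", " No"]
candidate_ids = [tokenizer(c, add_special_tokens=False).input_ids[0] for c in candidates]
prediction = candidates[int(torch.argmax(next_token_logits[candidate_ids]))]
print(prediction)
```

The reported `Accuracy` is consistent with `Correct Preds / Total Preds` (2256 / 3270 ≈ 0.6899), and the per-label accuracies with the corresponding `Correct Preds <id> / Total Labels <id>` ratios (e.g. 1519 / 2026 ≈ 0.7498).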

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
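
These settings read like a Hugging Face Trainer configuration. A minimal sketch of how they might map onto `TrainingArguments` is shown below; the output directory is a placeholder, the card does not say whether the batch sizes are per device or total, and the BF16 flag is an assumption based on the published weight dtype:

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# Only the values reported above come from the card; everything else is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-eszatdiq",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,            # card lists train_batch_size: 32
    per_device_eval_batch_size=120,            # card lists eval_batch_size: 120
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,                                 # assumption: checkpoint is stored in BF16
)
```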

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0056 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.8982 | 1.0 | 3 | 0.8052 | 0.0056 | 3798.5080 | 2632.9251 | 1559.0 | 3270.0 | 0.4768 | 1467.0 | 0.4486 | 337.0 | 372.0 | 2026.0 | 0.1836 | 0.1663 | 1121.0 | 1187.0 | 1231.0 | 0.9643 | 0.9106 |
| 0.3133 | 2.0 | 6 | 0.6938 | 0.0056 | 3273.0467 | 2268.7031 | 2128.0 | 3270.0 | 0.6508 | 1815.0 | 0.5550 | 1609.0 | 1865.0 | 2026.0 | 0.9205 | 0.7942 | 197.0 | 263.0 | 1231.0 | 0.2136 | 0.1600 |
| 0.0233 | 3.0 | 9 | 0.7795 | 0.0056 | 3677.5836 | 2549.1067 | 2216.0 | 3270.0 | 0.6777 | 2161.0 | 0.6609 | 1362.0 | 1401.0 | 2026.0 | 0.6915 | 0.6723 | 790.0 | 815.0 | 1231.0 | 0.6621 | 0.6418 |
| 0.0001 | 4.0 | 12 | 2.6272 | 0.0056 | 12394.1502 | 8590.9703 | 2192.0 | 3270.0 | 0.6703 | 2195.0 | 0.6713 | 1973.0 | 1977.0 | 2026.0 | 0.9758 | 0.9738 | 214.0 | 215.0 | 1231.0 | 0.1747 | 0.1738 |
| 0.0025 | 5.0 | 15 | 2.5778 | 0.0056 | 12161.0922 | 8429.4268 | 2237.0 | 3270.0 | 0.6841 | 2227.0 | 0.6810 | 1771.0 | 1782.0 | 2026.0 | 0.8796 | 0.8741 | 448.0 | 455.0 | 1231.0 | 0.3696 | 0.3639 |
| 0.0 | 6.0 | 18 | 2.5571 | 0.0056 | 12063.3522 | 8361.6786 | 2256.0 | 3270.0 | 0.6899 | 2181.0 | 0.6670 | 1467.0 | 1519.0 | 2026.0 | 0.7498 | 0.7241 | 706.0 | 737.0 | 1231.0 | 0.5987 | 0.5735 |
| 0.0 | 7.0 | 21 | 2.6065 | 0.0056 | 12296.3865 | 8523.2057 | 2192.0 | 3270.0 | 0.6703 | 1996.0 | 0.6104 | 1273.0 | 1402.0 | 2026.0 | 0.6920 | 0.6283 | 715.0 | 790.0 | 1231.0 | 0.6418 | 0.5808 |
| 0.0001 | 8.0 | 24 | 2.6148 | 0.0056 | 12335.8294 | 8550.5454 | 2175.0 | 3270.0 | 0.6651 | 1910.0 | 0.5841 | 1210.0 | 1395.0 | 2026.0 | 0.6885 | 0.5972 | 692.0 | 780.0 | 1231.0 | 0.6336 | 0.5621 |
| 0.0 | 9.0 | 27 | 2.6483 | 0.0056 | 12493.8025 | 8660.0440 | 2170.0 | 3270.0 | 0.6636 | 1920.0 | 0.5872 | 1220.0 | 1396.0 | 2026.0 | 0.6890 | 0.6022 | 691.0 | 774.0 | 1231.0 | 0.6288 | 0.5613 |
| 0.0001 | 10.0 | 30 | 2.6828 | 0.0056 | 12656.5201 | 8772.8312 | 2177.0 | 3270.0 | 0.6657 | 1963.0 | 0.6003 | 1255.0 | 1400.0 | 2026.0 | 0.6910 | 0.6194 | 700.0 | 777.0 | 1231.0 | 0.6312 | 0.5686 |
| 0.0001 | 11.0 | 33 | 2.7214 | 0.0056 | 12838.5669 | 8899.0164 | 2171.0 | 3270.0 | 0.6639 | 2013.0 | 0.6156 | 1279.0 | 1393.0 | 2026.0 | 0.6876 | 0.6313 | 725.0 | 778.0 | 1231.0 | 0.6320 | 0.5890 |
| 0.0 | 12.0 | 36 | 2.7415 | 0.0056 | 12933.2785 | 8964.6655 | 2169.0 | 3270.0 | 0.6633 | 2035.0 | 0.6223 | 1301.0 | 1393.0 | 2026.0 | 0.6876 | 0.6422 | 726.0 | 776.0 | 1231.0 | 0.6304 | 0.5898 |
| 0.0 | 13.0 | 39 | 2.7593 | 0.0056 | 13017.3006 | 9022.9052 | 2172.0 | 3270.0 | 0.6642 | 2056.0 | 0.6287 | 1313.0 | 1395.0 | 2026.0 | 0.6885 | 0.6481 | 734.0 | 777.0 | 1231.0 | 0.6312 | 0.5963 |
| 0.0 | 14.0 | 42 | 2.7708 | 0.0056 | 13071.4073 | 9060.4091 | 2167.0 | 3270.0 | 0.6627 | 2066.0 | 0.6318 | 1322.0 | 1393.0 | 2026.0 | 0.6876 | 0.6525 | 736.0 | 774.0 | 1231.0 | 0.6288 | 0.5979 |
| 0.0 | 15.0 | 45 | 2.7767 | 0.0056 | 13099.2616 | 9079.7162 | 2168.0 | 3270.0 | 0.6630 | 2068.0 | 0.6324 | 1320.0 | 1392.0 | 2026.0 | 0.6871 | 0.6515 | 740.0 | 776.0 | 1231.0 | 0.6304 | 0.6011 |
| 0.0 | 16.0 | 48 | 2.7824 | 0.0056 | 13126.3414 | 9098.4865 | 2169.0 | 3270.0 | 0.6633 | 2077.0 | 0.6352 | 1325.0 | 1391.0 | 2026.0 | 0.6866 | 0.6540 | 743.0 | 778.0 | 1231.0 | 0.6320 | 0.6036 |
| 0.0 | 17.0 | 51 | 2.7841 | 0.0056 | 13134.4015 | 9104.0734 | 2165.0 | 3270.0 | 0.6621 | 2078.0 | 0.6355 | 1328.0 | 1392.0 | 2026.0 | 0.6871 | 0.6555 | 742.0 | 773.0 | 1231.0 | 0.6279 | 0.6028 |
| 0.0 | 18.0 | 54 | 2.7872 | 0.0056 | 13148.8380 | 9114.0800 | 2171.0 | 3270.0 | 0.6639 | 2082.0 | 0.6367 | 1331.0 | 1397.0 | 2026.0 | 0.6895 | 0.6570 | 742.0 | 774.0 | 1231.0 | 0.6288 | 0.6028 |
| 0.0 | 19.0 | 57 | 2.7901 | 0.0056 | 13162.4860 | 9123.5401 | 2171.0 | 3270.0 | 0.6639 | 2081.0 | 0.6364 | 1327.0 | 1393.0 | 2026.0 | 0.6876 | 0.6550 | 745.0 | 778.0 | 1231.0 | 0.6320 | 0.6052 |
| 0.0 | 20.0 | 60 | 2.7935 | 0.0056 | 13178.7743 | 9134.8302 | 2172.0 | 3270.0 | 0.6642 | 2083.0 | 0.6370 | 1330.0 | 1398.0 | 2026.0 | 0.6900 | 0.6565 | 745.0 | 774.0 | 1231.0 | 0.6288 | 0.6052 |
| 0.0 | 21.0 | 63 | 2.7929 | 0.0056 | 13175.9740 | 9132.8892 | 2167.0 | 3270.0 | 0.6627 | 2080.0 | 0.6361 | 1328.0 | 1393.0 | 2026.0 | 0.6876 | 0.6555 | 743.0 | 774.0 | 1231.0 | 0.6288 | 0.6036 |
| 0.0 | 22.0 | 66 | 2.7951 | 0.0056 | 13186.0428 | 9139.8684 | 2175.0 | 3270.0 | 0.6651 | 2087.0 | 0.6382 | 1331.0 | 1397.0 | 2026.0 | 0.6895 | 0.6570 | 748.0 | 778.0 | 1231.0 | 0.6320 | 0.6076 |
| 0.0 | 23.0 | 69 | 2.7974 | 0.0056 | 13196.9785 | 9147.4485 | 2171.0 | 3270.0 | 0.6639 | 2089.0 | 0.6388 | 1330.0 | 1394.0 | 2026.0 | 0.6881 | 0.6565 | 751.0 | 777.0 | 1231.0 | 0.6312 | 0.6101 |
| 0.0 | 24.0 | 72 | 2.7988 | 0.0056 | 13203.5576 | 9152.0087 | 2172.0 | 3270.0 | 0.6642 | 2089.0 | 0.6388 | 1333.0 | 1395.0 | 2026.0 | 0.6885 | 0.6579 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 25.0 | 75 | 2.8010 | 0.0056 | 13214.0329 | 9159.2696 | 2172.0 | 3270.0 | 0.6642 | 2093.0 | 0.6401 | 1335.0 | 1396.0 | 2026.0 | 0.6890 | 0.6589 | 749.0 | 776.0 | 1231.0 | 0.6304 | 0.6084 |
| 0.0 | 26.0 | 78 | 2.8012 | 0.0056 | 13214.8892 | 9159.8632 | 2174.0 | 3270.0 | 0.6648 | 2088.0 | 0.6385 | 1332.0 | 1397.0 | 2026.0 | 0.6895 | 0.6575 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 27.0 | 81 | 2.8035 | 0.0056 | 13225.9128 | 9167.5042 | 2172.0 | 3270.0 | 0.6642 | 2092.0 | 0.6398 | 1333.0 | 1394.0 | 2026.0 | 0.6881 | 0.6579 | 751.0 | 778.0 | 1231.0 | 0.6320 | 0.6101 |
| 0.0 | 28.0 | 84 | 2.8045 | 0.0056 | 13230.6764 | 9170.8061 | 2172.0 | 3270.0 | 0.6642 | 2095.0 | 0.6407 | 1337.0 | 1395.0 | 2026.0 | 0.6885 | 0.6599 | 750.0 | 777.0 | 1231.0 | 0.6312 | 0.6093 |
| 0.0 | 29.0 | 87 | 2.8054 | 0.0056 | 13234.8323 | 9173.6867 | 2171.0 | 3270.0 | 0.6639 | 2090.0 | 0.6391 | 1333.0 | 1396.0 | 2026.0 | 0.6890 | 0.6579 | 749.0 | 775.0 | 1231.0 | 0.6296 | 0.6084 |
| 0.0 | 30.0 | 90 | 2.8060 | 0.0056 | 13237.7898 | 9175.7367 | 2175.0 | 3270.0 | 0.6651 | 2094.0 | 0.6404 | 1335.0 | 1396.0 | 2026.0 | 0.6890 | 0.6589 | 751.0 | 779.0 | 1231.0 | 0.6328 | 0.6101 |
| 0.0 | 31.0 | 93 | 2.8078 | 0.0056 | 13246.1557 | 9181.5355 | 2168.0 | 3270.0 | 0.6630 | 2091.0 | 0.6394 | 1335.0 | 1393.0 | 2026.0 | 0.6876 | 0.6589 | 747.0 | 775.0 | 1231.0 | 0.6296 | 0.6068 |
| 0.0 | 32.0 | 96 | 2.8082 | 0.0056 | 13247.9959 | 9182.8110 | 2169.0 | 3270.0 | 0.6633 | 2095.0 | 0.6407 | 1337.0 | 1393.0 | 2026.0 | 0.6876 | 0.6599 | 749.0 | 776.0 | 1231.0 | 0.6304 | 0.6084 |
| 0.0 | 33.0 | 99 | 2.8077 | 0.0056 | 13245.4286 | 9181.0315 | 2173.0 | 3270.0 | 0.6645 | 2100.0 | 0.6422 | 1338.0 | 1396.0 | 2026.0 | 0.6890 | 0.6604 | 753.0 | 777.0 | 1231.0 | 0.6312 | 0.6117 |
| 0.0 | 34.0 | 102 | 2.8115 | 0.0056 | 13263.6309 | 9193.6484 | 2169.0 | 3270.0 | 0.6633 | 2091.0 | 0.6394 | 1333.0 | 1394.0 | 2026.0 | 0.6881 | 0.6579 | 749.0 | 775.0 | 1231.0 | 0.6296 | 0.6084 |
| 0.0 | 35.0 | 105 | 2.8099 | 0.0056 | 13255.9181 | 9188.3022 | 2174.0 | 3270.0 | 0.6648 | 2095.0 | 0.6407 | 1339.0 | 1397.0 | 2026.0 | 0.6895 | 0.6609 | 748.0 | 777.0 | 1231.0 | 0.6312 | 0.6076 |
| 0.0 | 36.0 | 108 | 2.8103 | 0.0056 | 13258.0305 | 9189.7664 | 2173.0 | 3270.0 | 0.6645 | 2098.0 | 0.6416 | 1339.0 | 1397.0 | 2026.0 | 0.6895 | 0.6609 | 750.0 | 776.0 | 1231.0 | 0.6304 | 0.6093 |
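
The headline metrics at the top of the card match the epoch 6 row, which has the lowest validation loss after the initial epochs. `Mdl` and `Accumulated Loss` are not defined in the card, but the reported numbers are consistent with the mean validation loss summed over all 3270 evaluation predictions (in nats) and that same sum converted to bits, a minimum-description-length reading. A quick check of this interpretation against the epoch 6 row (an assumption, not documented in the card):

```python
# Sanity check of the assumed relationship between Loss, Accumulated Loss, and Mdl.
import math

mean_loss = 2.5571        # validation loss at epoch 6 (nats per prediction)
total_preds = 3270

accumulated_loss = mean_loss * total_preds   # ~8361.7 nats  (card: 8361.6786)
mdl_bits = accumulated_loss / math.log(2)    # ~12063.5 bits (card: 12063.3522)
print(accumulated_loss, mdl_bits)            # small gaps come from the rounded mean loss
```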

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1