8057787115cea7a35a21adff8257c2b8

This model is a fine-tuned version of facebook/opt-2.7b on the contemmcm/trec dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
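
The total batch sizes listed above follow from the per-device sizes and the device count. The sketch below reproduces that arithmetic and collects the listed values under the keyword names used by `transformers.TrainingArguments`. It is an illustration, not the original training script: `gradient_accumulation_steps = 1` is an assumption (the card does not list it), and `training_kwargs` is a hypothetical name.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4

# Assumption: no gradient accumulation, since the card does not list it.
gradient_accumulation_steps = 1

# Effective batch size = per-device size * devices * accumulation steps.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Hypothetical dict; keys follow transformers.TrainingArguments keyword names.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

print(total_train_batch_size, total_eval_batch_size)  # prints: 32 32
```

With `transformers` installed, a dict like this could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`. The multi-GPU setup itself comes from the launcher, not from these arguments.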

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Accuracy	F1 Macro
No log	0	0	2.0083	0	2.2461	0.1812	0.1132
No log	1	170	1.0066	0.0078	2.9566	0.625	0.4449
No log	2	340	0.4294	0.0156	5.7032	0.8938	0.7475
No log	3	510	0.9452	0.0312	10.2384	0.6937	0.6915
No log	4	680	1.7359	0.0625	13.9943	0.2958	0.2710
0.0787	5	850	0.6879	0.125	19.7794	0.7604	0.6332
0.0787	6	1020	0.4806	0.25	29.5295	0.8896	0.8602
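
The rows shown above are not monotonic: validation loss rises at epochs 3 and 4 before recovering, and the best epoch depends on the metric. A minimal sketch for picking a checkpoint from these rows follows; the helper `best_by` is a hypothetical name, and only the epochs listed in the table are considered.

```python
# (epoch, step, validation_loss, accuracy, f1_macro), copied from the table above.
rows = [
    (0, 0, 2.0083, 0.1812, 0.1132),
    (1, 170, 1.0066, 0.625, 0.4449),
    (2, 340, 0.4294, 0.8938, 0.7475),
    (3, 510, 0.9452, 0.6937, 0.6915),
    (4, 680, 1.7359, 0.2958, 0.2710),
    (5, 850, 0.6879, 0.7604, 0.6332),
    (6, 1020, 0.4806, 0.8896, 0.8602),
]


def best_by(rows, index, highest=True):
    """Return the row that maximises (or minimises) the column at `index`."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[index])


best_acc = best_by(rows, 3)
best_f1 = best_by(rows, 4)
lowest_loss = best_by(rows, 2, highest=False)

print("best accuracy: epoch", best_acc[0])            # epoch 2 (0.8938)
print("best F1 macro: epoch", best_f1[0])             # epoch 6 (0.8602)
print("lowest validation loss: epoch", lowest_loss[0])  # epoch 2 (0.4294)
```

Accuracy and validation loss both favour epoch 2, while macro F1 favours epoch 6, so the choice of selection metric matters for this run.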

Safetensors

Model size: 0.7B params
Tensor type: F32
