train_sst2_456_1768397598

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0826
  • Num Input Tokens Seen: 30591040
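
Since the framework versions below list PEFT, this checkpoint is presumably a LoRA-style adapter rather than full model weights. The following is a minimal, hedged loading sketch; the adapter repo id is taken from this card, the causal-LM head is an assumption, and the classification prompt is hypothetical because the training template is not documented here.

```python
# A minimal loading sketch, assuming this repo hosts a PEFT (LoRA) adapter
# for the causal-LM head of meta-llama/Meta-Llama-3-8B-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_sst2_456_1768397598"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Hypothetical SST-2-style prompt; the actual template used during
# fine-tuning is not on this card, so adjust to match your training setup.
prompt = (
    "Classify the sentiment of the following sentence as positive or negative.\n"
    "Sentence: the movie was a delight .\n"
    "Sentiment:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```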

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
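
As a reference point, here is a hedged sketch of how these settings map onto transformers.TrainingArguments. The output directory is a placeholder, and the LoRA configuration and SST-2 data pipeline are assumptions not documented on this card.

```python
# Hedged mapping of the listed hyperparameters onto TrainingArguments;
# only the values listed above are taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_sst2_456_1768397598",  # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10,
)
```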

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:------:|:---------------:|:-----------------:|
| 0.0007        | 0.5000 | 15154  | 0.1241          | 1530624           |
| 0.1653        | 1.0000 | 30308  | 0.1001          | 3061264           |
| 0.0001        | 1.5000 | 45462  | 0.1004          | 4592688           |
| 0.002         | 2.0001 | 60616  | 0.0826          | 6121056           |
| 0.0016        | 2.5001 | 75770  | 0.0927          | 7651360           |
| 0.0003        | 3.0001 | 90924  | 0.0885          | 9179584           |
| 0.0005        | 3.5001 | 106078 | 0.0986          | 10707168          |
| 0.1854        | 4.0001 | 121232 | 0.0835          | 12236912          |
| 0.0008        | 4.5001 | 136386 | 0.0984          | 13765568          |
| 0.0013        | 5.0002 | 151540 | 0.1004          | 15295536          |
| 0.0009        | 5.5002 | 166694 | 0.1026          | 16829808          |
| 0.5641        | 6.0002 | 181848 | 0.1018          | 18356448          |
| 0.0017        | 6.5002 | 197002 | 0.1006          | 19882880          |
| 0.0002        | 7.0002 | 212156 | 0.1187          | 21413584          |
| 0.0003        | 7.5002 | 227310 | 0.1153          | 22943200          |
| 0.0002        | 8.0003 | 242464 | 0.1163          | 24473632          |
| 0.3504        | 8.5003 | 257618 | 0.1170          | 26002800          |
| 0.0001        | 9.0003 | 272772 | 0.1145          | 27533936          |
| 0.0001        | 9.5003 | 287926 | 0.1172          | 29061488          |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4