calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0652

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused, `ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
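The linear learning-rate schedule above can be sketched in plain Python. This is a minimal illustration, not the exact Transformers scheduler: it assumes 200 total optimizer steps (40 epochs × 5 steps per epoch, taken from the results table below) and no warmup, neither of which is stated explicitly in the card.

```python
# Sketch of a linear learning-rate decay matching the hyperparameters above.
# Assumptions (not stated in the card): 200 total optimizer steps, no warmup.

def linear_lr(step, base_lr=0.001, total_steps=200):
    """Linearly decay the learning rate from base_lr down to 0 over total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

print(linear_lr(0))    # start of training: full base LR, 0.001
print(linear_lr(100))  # halfway: half the base LR, 0.0005
print(linear_lr(200))  # end of training: fully decayed, 0.0
```

In the actual Trainer, warmup steps (if any) would shift this curve, but the end-to-end shape of a `linear` schedule is the same: a straight line from the base rate down to zero at the final step.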

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9455        | 1.0   | 5    | 2.0822          |
| 1.8943        | 2.0   | 10   | 1.6374          |
| 1.5309        | 3.0   | 15   | 1.3630          |
| 1.2901        | 4.0   | 20   | 1.1525          |
| 1.0707        | 5.0   | 25   | 0.9432          |
| 0.9185        | 6.0   | 30   | 0.8454          |
| 0.8187        | 7.0   | 35   | 0.7501          |
| 0.7215        | 8.0   | 40   | 0.6560          |
| 0.6404        | 9.0   | 45   | 0.5970          |
| 0.5953        | 10.0  | 50   | 0.5479          |
| 0.5476        | 11.0  | 55   | 0.5130          |
| 0.5134        | 12.0  | 60   | 0.4796          |
| 0.4773        | 13.0  | 65   | 0.4578          |
| 0.4509        | 14.0  | 70   | 0.4150          |
| 0.4219        | 15.0  | 75   | 0.3884          |
| 0.3962        | 16.0  | 80   | 0.3658          |
| 0.3700        | 17.0  | 85   | 0.3397          |
| 0.3504        | 18.0  | 90   | 0.3145          |
| 0.3254        | 19.0  | 95   | 0.2937          |
| 0.3058        | 20.0  | 100  | 0.2701          |
| 0.2856        | 21.0  | 105  | 0.2564          |
| 0.2637        | 22.0  | 110  | 0.2273          |
| 0.2426        | 23.0  | 115  | 0.2088          |
| 0.2264        | 24.0  | 120  | 0.1859          |
| 0.2062        | 25.0  | 125  | 0.1618          |
| 0.1867        | 26.0  | 130  | 0.1333          |
| 0.1655        | 27.0  | 135  | 0.1178          |
| 0.1550        | 28.0  | 140  | 0.1166          |
| 0.1431        | 29.0  | 145  | 0.1050          |
| 0.1325        | 30.0  | 150  | 0.0944          |
| 0.1235        | 31.0  | 155  | 0.0880          |
| 0.1156        | 32.0  | 160  | 0.0855          |
| 0.1110        | 33.0  | 165  | 0.0805          |
| 0.1057        | 34.0  | 170  | 0.0745          |
| 0.1042        | 35.0  | 175  | 0.0722          |
| 0.0986        | 36.0  | 180  | 0.0710          |
| 0.0970        | 37.0  | 185  | 0.0672          |
| 0.0952        | 38.0  | 190  | 0.0666          |
| 0.0926        | 39.0  | 195  | 0.0655          |
| 0.0935        | 40.0  | 200  | 0.0652          |
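The table implies 5 optimizer steps per epoch, which together with the batch size bounds the size of the (otherwise undocumented) training set. This is an inference from the table, not a figure stated in the card, and it assumes no gradient accumulation:

```python
# Estimate the training-set size from the results table above.
# Assumption (not stated in the card): no gradient accumulation, so each
# optimizer step consumes exactly one batch of up to 512 examples.

steps_per_epoch = 5        # step column increases by 5 per epoch
train_batch_size = 512     # from the hyperparameters section

max_examples = steps_per_epoch * train_batch_size            # all 5 batches full
min_examples = (steps_per_epoch - 1) * train_batch_size + 1  # last batch nearly empty

print(f"training set size is between {min_examples} and {max_examples} examples")
```

So the training set holds somewhere between 2,049 and 2,560 examples under these assumptions.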

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32