calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0652

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused, `ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
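The linear learning-rate schedule above can be sketched in plain Python. This is a minimal illustration, not the exact Transformers scheduler: it assumes 200 total optimizer steps (40 epochs × 5 steps per epoch, taken from the results table below) and no warmup, neither of which is stated explicitly in the card.

```python
# Sketch of a linear learning-rate decay matching the hyperparameters above.
# Assumptions (not stated in the card): 200 total optimizer steps, no warmup.

def linear_lr(step, base_lr=0.001, total_steps=200):
    """Linearly decay the learning rate from base_lr down to 0 over total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

print(linear_lr(0))    # start of training: full base LR, 0.001
print(linear_lr(100))  # halfway: half the base LR, 0.0005
print(linear_lr(200))  # end of training: fully decayed, 0.0
```

In the actual Trainer, warmup steps (if any) would shift this curve, but the end-to-end shape of a `linear` schedule is the same: a straight line from the base rate down to zero at the final step.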

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9455        | 1.0   | 5    | 2.0822          |
| 1.8943        | 2.0   | 10   | 1.6374          |
| 1.5309        | 3.0   | 15   | 1.3630          |
| 1.2901        | 4.0   | 20   | 1.1525          |
| 1.0707        | 5.0   | 25   | 0.9432          |
| 0.9185        | 6.0   | 30   | 0.8454          |
| 0.8187        | 7.0   | 35   | 0.7501          |
| 0.7215        | 8.0   | 40   | 0.6560          |
| 0.6404        | 9.0   | 45   | 0.5970          |
| 0.5953        | 10.0  | 50   | 0.5479          |
| 0.5476        | 11.0  | 55   | 0.5130          |
| 0.5134        | 12.0  | 60   | 0.4796          |
| 0.4773        | 13.0  | 65   | 0.4578          |
| 0.4509        | 14.0  | 70   | 0.4150          |
| 0.4219        | 15.0  | 75   | 0.3884          |
| 0.3962        | 16.0  | 80   | 0.3658          |
| 0.3700        | 17.0  | 85   | 0.3397          |
| 0.3504        | 18.0  | 90   | 0.3145          |
| 0.3254        | 19.0  | 95   | 0.2937          |
| 0.3058        | 20.0  | 100  | 0.2701          |
| 0.2856        | 21.0  | 105  | 0.2564          |
| 0.2637        | 22.0  | 110  | 0.2273          |
| 0.2426        | 23.0  | 115  | 0.2088          |
| 0.2264        | 24.0  | 120  | 0.1859          |
| 0.2062        | 25.0  | 125  | 0.1618          |
| 0.1867        | 26.0  | 130  | 0.1333          |
| 0.1655        | 27.0  | 135  | 0.1178          |
| 0.1550        | 28.0  | 140  | 0.1166          |
| 0.1431        | 29.0  | 145  | 0.1050          |
| 0.1325        | 30.0  | 150  | 0.0944          |
| 0.1235        | 31.0  | 155  | 0.0880          |
| 0.1156        | 32.0  | 160  | 0.0855          |
| 0.1110        | 33.0  | 165  | 0.0805          |
| 0.1057        | 34.0  | 170  | 0.0745          |
| 0.1042        | 35.0  | 175  | 0.0722          |
| 0.0986        | 36.0  | 180  | 0.0710          |
| 0.0970        | 37.0  | 185  | 0.0672          |
| 0.0952        | 38.0  | 190  | 0.0666          |
| 0.0926        | 39.0  | 195  | 0.0655          |
| 0.0935        | 40.0  | 200  | 0.0652          |
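The table implies 5 optimizer steps per epoch, which together with the batch size bounds the size of the (otherwise undocumented) training set. This is an inference from the table, not a figure stated in the card, and it assumes no gradient accumulation:

```python
# Estimate the training-set size from the results table above.
# Assumption (not stated in the card): no gradient accumulation, so each
# optimizer step consumes exactly one batch of up to 512 examples.

steps_per_epoch = 5        # step column increases by 5 per epoch
train_batch_size = 512     # from the hyperparameters section

max_examples = steps_per_epoch * train_batch_size            # all 5 batches full
min_examples = (steps_per_epoch - 1) * train_batch_size + 1  # last batch nearly empty

print(f"training set size is between {min_examples} and {max_examples} examples")
```

So the training set holds somewhere between 2,049 and 2,560 examples under these assumptions.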

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32