---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---

# Model Details

This is a 1B-parameter Llama 3 architecture model pretrained from scratch with torchtitan on fineweb-edu using the AdamW optimizer. It has seen 20B training tokens, roughly the Chinchilla-optimal budget of ~20 tokens per parameter.

# How to use

```
import torch
from transformers import pipeline

# Build a text-generation pipeline around the pretrained checkpoint.
pipe = pipeline(
    "text-generation",
    model="kz919/llama3_1b_chinchilla_8252025",
)

print(pipe("The key to life is"))
```

# Downstream Eval

## ARC, HellaSwag, LAMBADA (OpenAI), OpenBookQA, PIQA

```
lm_eval --model hf \
  --model_args pretrained=kz919/llama3_1b_chinchilla_8252025,dtype="bfloat16",add_bos_token=True \
  --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,openbookqa \
  --device cuda:7 \
  --batch_size 8
```

| Tasks          | Version | Filter | n-shot | Metric     |   |   Value |   | Stderr |
|----------------|--------:|--------|-------:|------------|---|--------:|---|-------:|
| arc_challenge  |       1 | none   |      0 | acc        | ↑ |  0.2619 | ± | 0.0128 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.2884 | ± | 0.0132 |
| arc_easy       |       1 | none   |      0 | acc        | ↑ |  0.6275 | ± | 0.0099 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.5417 | ± | 0.0102 |
| hellaswag      |       1 | none   |      0 | acc        | ↑ |  0.3601 | ± | 0.0048 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.4397 | ± | 0.0050 |
| lambada_openai |       1 | none   |      0 | acc        | ↑ |  0.3559 | ± | 0.0067 |
|                |         | none   |      0 | perplexity | ↓ | 30.5040 | ± | 1.1781 |
| openbookqa     |       1 | none   |      0 | acc        | ↑ |  0.2420 | ± | 0.0192 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.3660 | ± | 0.0216 |
| piqa           |       1 | none   |      0 | acc        | ↑ |  0.6752 | ± | 0.0109 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.6877 | ± | 0.0108 |

## MMLU

| Groups             | Version | Filter | n-shot | Metric |   |  Value |   | Stderr |
|--------------------|--------:|--------|--------|--------|---|-------:|---|-------:|
| mmlu               |       2 | none   |        | acc    | ↑ | 0.2584 | ± | 0.0037 |
| - humanities       |       2 | none   |        | acc    | ↑ | 0.2480 | ± | 0.0063 |
| - other            |       2 | none   |        | acc    | ↑ | 0.2642 | ± | 0.0079 |
| - social sciences  |       2 | none   |        | acc    | ↑ | 0.2655 | ± | 0.0079 |
| - stem             |       2 | none   |        | acc    | ↑ | 0.2613 | ± | 0.0078 |
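
# Loading without the pipeline helper

As an alternative to the `pipeline` call in the usage section above, the checkpoint can be loaded directly with `AutoModelForCausalLM` and `AutoTokenizer`. The snippet below is a minimal sketch: it assumes the repository ships a compatible tokenizer, uses bfloat16 to match the dtype in the evaluation command, and the decoding parameters (`max_new_tokens`, `temperature`) are illustrative assumptions rather than recommended settings.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kz919/llama3_1b_chinchilla_8252025"

# Load the tokenizer and weights; bfloat16 mirrors the dtype used in the lm_eval run above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()

# Tokenize the same prompt as the pipeline example and generate a continuation.
inputs = tokenizer("The key to life is", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,   # illustrative length, not a tuned value
        do_sample=True,
        temperature=0.7,     # illustrative sampling temperature
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```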