---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---

# Model Details

This is a 1B-parameter Llama 3 architecture model pretrained from scratch with torchtitan on fineweb-edu using the AdamW optimizer. It has seen 20B training tokens, roughly the Chinchilla-optimal budget of ~20 tokens per parameter.

# How to use

```
import torch
from transformers import pipeline

# Build a text-generation pipeline around the pretrained checkpoint.
pipe = pipeline(
    "text-generation",
    model="kz919/llama3_1b_chinchilla_8252025",
)

print(pipe("The key to life is"))
```

# Downstream Eval

## ARC, HellaSwag, LAMBADA (OpenAI), OpenBookQA, PIQA

```
lm_eval --model hf \
  --model_args pretrained=kz919/llama3_1b_chinchilla_8252025,dtype="bfloat16",add_bos_token=True \
  --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,openbookqa \
  --device cuda:7 \
  --batch_size 8
```

| Tasks          | Version | Filter | n-shot | Metric     |   |   Value |   | Stderr |
|----------------|--------:|--------|-------:|------------|---|--------:|---|-------:|
| arc_challenge  |       1 | none   |      0 | acc        | ↑ |  0.2619 | ± | 0.0128 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.2884 | ± | 0.0132 |
| arc_easy       |       1 | none   |      0 | acc        | ↑ |  0.6275 | ± | 0.0099 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.5417 | ± | 0.0102 |
| hellaswag      |       1 | none   |      0 | acc        | ↑ |  0.3601 | ± | 0.0048 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.4397 | ± | 0.0050 |
| lambada_openai |       1 | none   |      0 | acc        | ↑ |  0.3559 | ± | 0.0067 |
|                |         | none   |      0 | perplexity | ↓ | 30.5040 | ± | 1.1781 |
| openbookqa     |       1 | none   |      0 | acc        | ↑ |  0.2420 | ± | 0.0192 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.3660 | ± | 0.0216 |
| piqa           |       1 | none   |      0 | acc        | ↑ |  0.6752 | ± | 0.0109 |
|                |         | none   |      0 | acc_norm   | ↑ |  0.6877 | ± | 0.0108 |

## MMLU

| Groups             | Version | Filter | n-shot | Metric |   |  Value |   | Stderr |
|--------------------|--------:|--------|--------|--------|---|-------:|---|-------:|
| mmlu               |       2 | none   |        | acc    | ↑ | 0.2584 | ± | 0.0037 |
| - humanities       |       2 | none   |        | acc    | ↑ | 0.2480 | ± | 0.0063 |
| - other            |       2 | none   |        | acc    | ↑ | 0.2642 | ± | 0.0079 |
| - social sciences  |       2 | none   |        | acc    | ↑ | 0.2655 | ± | 0.0079 |
| - stem             |       2 | none   |        | acc    | ↑ | 0.2613 | ± | 0.0078 |
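
# Loading without the pipeline helper

As an alternative to the `pipeline` call in the usage section above, the checkpoint can be loaded directly with `AutoModelForCausalLM` and `AutoTokenizer`. The snippet below is a minimal sketch: it assumes the repository ships a compatible tokenizer, uses bfloat16 to match the dtype in the evaluation command, and the decoding parameters (`max_new_tokens`, `temperature`) are illustrative assumptions rather than recommended settings.

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kz919/llama3_1b_chinchilla_8252025"

# Load the tokenizer and weights; bfloat16 mirrors the dtype used in the lm_eval run above.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()

# Tokenize the same prompt as the pipeline example and generate a continuation.
inputs = tokenizer("The key to life is", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,   # illustrative length, not a tuned value
        do_sample=True,
        temperature=0.7,     # illustrative sampling temperature
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```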