codelion committed on
Commit 7ba6110 · verified
1 Parent(s): 39e5c69

Fix citation to reference blog post

Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -127,17 +127,19 @@ print(tokenizer.decode(outputs[0]))
 
  ## Citation
 
- If you use this model, please cite:
+ If you use this model/dataset, please cite:
 
  ```bibtex
- @article{gpt2-70m-optimal-mixing,
- title={Optimal Pre-training Dataset Composition for Language Models: A Systematic Study of Dataset Mixing Strategies},
- author={codelion},
+ @article{sharma2025billion,
+ title={The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix},
+ author={Sharma, Asankhaya},
  year={2025},
- url={https://huggingface.co/codelion/gpt-2-70m}
+ url={https://huggingface.co/blog/codelion/optimal-dataset-mixing/}
  }
  ```
 
+ For more details, see the [blog post](https://huggingface.co/blog/codelion/optimal-dataset-mixing/).
+
  ## Model Card Authors
 
  codelion