Apertus: Democratizing Open and Compliant LLMs for Global Language Environments Paper • 2509.14233 • Published Sep 17 • 12
Benchmarking Optimizers for Large Language Model Pretraining Paper • 2509.01440 • Published Sep 1 • 24
Towards Open Foundation Language Model and Corpus for Macedonian: A Low-Resource Language Paper • 2506.09560 • Published Jun 11
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 75
Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News Paper • 2410.20198 • Published Oct 26, 2024
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ Paper • 2409.16779 • Published Sep 25, 2024
On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial Paper • 2403.14380 • Published Mar 21, 2024 • 1
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 19
Evaluating the Search Phase of Neural Architecture Search Paper • 1902.08142 • Published Feb 21, 2019
Landmark Attention: Random-Access Infinite Context Length for Transformers Paper • 2305.16300 • Published May 25, 2023
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention Paper • 2306.01160 • Published Jun 1, 2023 • 1