Collections
Discover the best community collections!

Collections including paper arxiv:2501.08313

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 626
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- Group Sequence Policy Optimization
  Paper • 2507.18071 • Published • 313
- Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
  Paper • 2509.03867 • Published • 210

- MiniMaxAI/MiniMax-Text-01-hf
  Text Generation • 456B • Updated • 10.2k • 8
- MiniMaxAI/MiniMax-M1-80k-hf
  Text Generation • 456B • Updated • 74 • 6
- MiniMaxAI/MiniMax-M1-40k-hf
  Text Generation • Updated • 73 • 10
- MiniMaxAI/MiniMax-Text-01
  Text Generation • 456B • Updated • 1.48k • 650
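
The checkpoints above are ordinary Hugging Face Hub repositories, so the `-hf` (transformers-native) variants can be loaded with the transformers library. A minimal sketch, assuming enough accelerator memory to shard a 456B-parameter model; the prompt string is arbitrary and purely illustrative:

```python
# Minimal sketch of loading one of the Hub checkpoints listed above with transformers.
# Assumptions: the `-hf` variant loads through the standard transformers API, and
# `device_map="auto"` has enough accelerator memory to shard a 456B-parameter model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MiniMaxAI/MiniMax-Text-01-hf"  # any text-generation repo above works the same way

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native dtype
    device_map="auto",    # shard across available devices
)

prompt = "Summarize the idea behind lightning attention in one sentence."  # illustrative
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```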

- MiniMaxAI/MiniMax-Text-01
  Text Generation • 456B • Updated • 1.48k • 650
- MiniMaxAI/MiniMax-VL-01
  Image-Text-to-Text • 456B • Updated • 87.7k • 280
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- MiniMaxText01
  💬 117 • Generate responses to text and images in a chat interface

- How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
  Paper • 2509.19371 • Published
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
  Paper • 2505.06708 • Published • 6
- Selective Attention: Enhancing Transformer through Principled Context Control
  Paper • 2411.12892 • Published
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 189

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- Agent-Ark/Toucan-1.5M
  Viewer • Updated • 1.65M • 9.6k • 183
- facebook/natural_reasoning
  Viewer • Updated • 1.15M • 2.11k • 543
- Salesforce/Webscale-RL
  Viewer • Updated • 1.11M • 988 • 81
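
The dataset repositories in this collection can be pulled with the datasets library. A minimal sketch, assuming the repository exposes a `train` split (split names vary per repo) and using streaming so the multi-million-row corpora are not downloaded in full:

```python
# Minimal sketch: streaming a Hub dataset from the collection above.
# Assumption: the repository exposes a "train" split; adjust after inspecting the repo.
from datasets import load_dataset

ds = load_dataset("facebook/natural_reasoning", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example)  # inspect the first few records without a full download
    if i == 2:
        break
```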

- Rewnozom/agent-zero-v1-a-01
  Text Generation • 4B • Updated • 4 • 1
- TheBloke/MythoMax-L2-13B-GGUF
  13B • Updated • 123k • 208
- DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
  Text Generation • 18B • Updated • 53.9k • 427
- QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
  Text Generation • 8B • Updated • 15.3k • 125
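
The GGUF repositories above typically ship several quantization files rather than a single checkpoint, so the exact filename has to be discovered first. A minimal sketch using huggingface_hub; the chosen repository and the choice of the first listed file are illustrative assumptions:

```python
# Minimal sketch: locating and downloading a GGUF quantization from a repo listed above.
# Filenames differ per repo and quantization level, so list them before downloading.
from huggingface_hub import hf_hub_download, list_repo_files

repo_id = "TheBloke/MythoMax-L2-13B-GGUF"
gguf_files = [f for f in list_repo_files(repo_id) if f.endswith(".gguf")]
print(gguf_files)  # pick a file from the printed list, e.g. a Q4_K_M quantization

local_path = hf_hub_download(repo_id=repo_id, filename=gguf_files[0])
print(local_path)  # pass this path to a GGUF runtime such as llama.cpp
```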

- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model
  Paper • 2503.24290 • Published • 62
- I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
  Paper • 2503.18878 • Published • 119
- START: Self-taught Reasoner with Tools
  Paper • 2503.04625 • Published • 113
- DAPO: An Open-Source LLM Reinforcement Learning System at Scale
  Paper • 2503.14476 • Published • 142

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 301
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 429
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 376
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259