Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2506.14965

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

s3: You Don't Need That Much Data to Train a Search Agent via RL

Paper • 2505.14146 • Published May 20, 2025 • 19
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26, 2025 • 15
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26, 2025 • 45
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26, 2025 • 45

🤔 Reasoning about Reasoning

papers and articles about reasoning LLMs

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27, 2025 • 42
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31, 2025 • 62
JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published Mar 31, 2025 • 62
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published Apr 8, 2025 • 85

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17, 2025 • 49
LLM360/guru-RL-92k

Viewer • Updated Aug 20, 2025 • 91.9k • 1.7k • 42
LLM360/guru-7B

Text Generation • 8B • Updated Jun 19, 2025 • 288 • • 3
LLM360/guru-32B

Text Generation • 33B • Updated Jun 19, 2025 • 8 • 2

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 120
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31, 2025 • 54
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 187

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24, 2025 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17, 2025 • 49
LLM360/guru-RL-92k

Viewer • Updated Aug 20, 2025 • 91.9k • 1.7k • 42
LLM360/guru-7B

Text Generation • 8B • Updated Jun 19, 2025 • 288 • • 3
LLM360/guru-32B

Text Generation • 33B • Updated Jun 19, 2025 • 8 • 2

s3: You Don't Need That Much Data to Train a Search Agent via RL

Paper • 2505.14146 • Published May 20, 2025 • 19
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26, 2025 • 15
ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26, 2025 • 45
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26, 2025 • 45

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18, 2025 • 139
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 120
What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31, 2025 • 54
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 187

🤔 Reasoning about Reasoning

papers and articles about reasoning LLMs

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27, 2025 • 42
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31, 2025 • 62
JudgeLRM: Large Reasoning Models as a Judge

Paper • 2504.00050 • Published Mar 31, 2025 • 62
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published Apr 8, 2025 • 85

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24, 2025 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27, 2025 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28, 2025 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs