-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 24 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 152 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Discover the best community collections!
Collections including paper arxiv:2509.02544
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 627 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 300 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 211
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper • 2509.08721 • Published • 662 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229 -
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 348
-
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
Paper • 2509.06501 • Published • 81 -
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
Baichuan-M2: Scaling Medical Capability with Large Verifier System
Paper • 2509.02208 • Published • 43
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
Paper • 2509.01215 • Published • 51 -
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
Paper • 2509.00676 • Published • 85
-
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
Paper • 2508.09736 • Published • 58 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 134 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 83 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 24 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 152 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper • 2509.08721 • Published • 662 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229 -
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper • 2508.18106 • Published • 348
-
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
Paper • 2509.06501 • Published • 81 -
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
Baichuan-M2: Scaling Medical Capability with Large Verifier System
Paper • 2509.02208 • Published • 43
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 627 -
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 300 -
Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 211
-
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
Paper • 2509.02544 • Published • 125 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229
-
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229 -
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 84 -
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion
Paper • 2509.01215 • Published • 51 -
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
Paper • 2509.00676 • Published • 85
-
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
Paper • 2508.09736 • Published • 58 -
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 134 -
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 83 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 229