gigastaufan's Collections
Paper
CoRAG: Collaborative Retrieval-Augmented Generation
Paper • 2504.01883 • Published • 9
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Paper • 2504.08600 • Published • 32
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL
Paper • 2503.23157 • Published • 10
AI Agents: Evolution, Architecture, and Real-World Applications
Paper • 2503.12687 • Published • 2
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents
Paper • 2505.03570 • Published • 8
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
Paper • 2505.10320 • Published • 24
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
Paper • 2502.01113 • Published • 6
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
Paper • 2404.16130 • Published • 7
Large Language Models are Locally Linear Mappings
Paper • 2505.24293 • Published • 14
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning
Paper • 2506.07044 • Published • 113
Paper • 2506.10892 • Published • 37
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
Paper • 2506.09049 • Published • 37
OmniGen2: Exploration to Advanced Multimodal Generation
Paper • 2506.18871 • Published • 78
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation
Paper • 2506.18095 • Published • 66
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 249
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published • 79
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 25
Coding Triangle: How Does Large Language Model Understand Code?
Paper • 2507.06138 • Published • 21
KV Cache Steering for Inducing Reasoning in Small Language Models
Paper • 2507.08799 • Published • 40
MUR: Momentum Uncertainty guided Reasoning for Large Language Models
Paper • 2507.14958 • Published • 46
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 238
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper • 2508.03680 • Published • 122
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 195
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
Paper • 2508.08221 • Published • 50
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Paper • 2508.07407 • Published • 98
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Paper • 2508.09834 • Published • 53
Provable Benefits of In-Tool Learning for Large Language Models
Paper • 2508.20755 • Published • 11
AWorld: Orchestrating the Training Recipe for Agentic AI
Paper • 2508.20404 • Published • 38
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models
Paper • 2508.21365 • Published • 29
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 228
LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation
Paper • 2509.05263 • Published • 10
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
Paper • 2509.06949 • Published • 55
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
Paper • 2509.02479 • Published • 83
Lost in Embeddings: Information Loss in Vision-Language Models
Paper • 2509.11986 • Published • 28
Regression Language Models for Code
Paper • 2509.26476 • Published • 16
Multi-Agent Tool-Integrated Policy Optimization
Paper • 2510.04678 • Published • 30
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use
Paper • 2510.05592 • Published • 106
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning
Paper • 2510.13515 • Published • 11
RAG-Anything: All-in-One RAG Framework
Paper • 2510.12323 • Published • 54
LLM-guided Hierarchical Retrieval
Paper • 2510.13217 • Published • 20
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
Paper • 2510.19338 • Published • 114
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper • 2512.02472 • Published • 51
Who's Your Judge? On the Detectability of LLM-Generated Judgments
Paper • 2509.25154 • Published • 29