Lancer

lancer001010

AI & ML interests

None yet

Recent Activity

upvoted a paper 7 days ago

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

upvoted a paper 8 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

upvoted a paper 10 days ago

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Organizations

None yet

upvoted a paper 7 days ago

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 10 days ago • 159

upvoted a paper 8 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 10 days ago • 194

upvoted a paper 10 days ago

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Paper • 2601.03872 • Published 12 days ago • 40

updated a collection 14 days ago

Diffusion

Collection

2 items • Updated 14 days ago

upvoted a paper about 1 month ago

Memory in the Age of AI Agents

Paper • 2512.13564 • Published Dec 15, 2025 • 143

upvoted an article about 2 months ago

Continuous batching from first principles

Article • Nov 25, 2025 • 304

upvoted 2 articles 3 months ago

Supercharge your OCR Pipelines with Open Models

Article • Oct 21, 2025 • 297

mem-agent: Equipping LLM Agents with Memory Using RL

Article • Oct 9, 2025

upvoted an article 4 months ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • Aug 9, 2025

upvoted a paper 4 months ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228

liked a model 5 months ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated Aug 26, 2025 • 12.3k • 1.01k

upvoted a paper 6 months ago

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4, 2025 • 157

published a Space 7 months ago

ChatCat

💬

Interact with a friendly chatbot

upvoted an article 8 months ago

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 586

updated 2 collections 8 months ago

RL

Collection

Reinforcement learning related • 2 items • Updated 15 days ago

KV Cache Optimization

Collection

3 items • Updated May 30, 2025

upvoted 2 articles 9 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7, 2025 • 271

I trained a Language Model to schedule events with GRPO!

Article • Apr 29, 2025

upvoted a paper 9 months ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10, 2025 • 133
