Fang Wu

fangwu97

https://smiles724.github.io/

Recent Activity

liked a Space about 1 month ago

HuggingFaceTB/smol-training-playbook

upvoted a paper 18 days ago

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published 20 days ago • 105

upvoted a paper about 1 month ago

L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks

Paper • 2510.20976 • Published Oct 23 • 2

upvoted 7 papers 2 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 496

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 105

Trading-R1: Financial Trading with LLM Reasoning via Reinforcement Learning

Paper • 2509.11420 • Published Sep 14 • 2

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1 • 18

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

Diagnose, Localize, Align: A Full-Stack Framework for Reliable LLM Multi-Agent Systems under Instruction Conflicts

Paper • 2509.23188 • Published Sep 27 • 3

Multiplayer Nash Preference Optimization

Paper • 2509.23102 • Published Sep 27 • 62

upvoted 2 papers 4 months ago

VeriGUI: Verifiable Long-Chain GUI Dataset

Paper • 2508.04026 • Published Aug 6 • 160

CellForge: Agentic Design of Virtual Cell Models

Paper • 2508.02276 • Published Aug 4 • 39

upvoted a paper 5 months ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 85

upvoted 2 papers 6 months ago

When to Trust Context: Self-Reflective Debates for Context Reliability

Paper • 2506.06020 • Published Jun 6 • 1

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27 • 26

upvoted a paper 9 months ago

MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning

Paper • 2503.07459 • Published Mar 10 • 16