Jianhao Yan

Elliott

ElliottYan

AI & ML interests

None yet

Organizations

None yet

commented a paper 8 months ago

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88

commented a paper 9 months ago

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29, 2025 • 98

New activity in Elliott/Qwen2.5-Math-7B-16k-think 9 months ago

Add library name, pipeline tag, link to Github

#1 opened 9 months ago by

New activity in Elliott/Openr1-Math-46k-8192 9 months ago

Add task category

#2 opened 9 months ago by

New activity in Elliott/LUFFY-Qwen-Math-7B-Zero 9 months ago

Correct pipeline tag and add Github link

#1 opened 9 months ago by

commented 2 papers 9 months ago

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88