Critique to Verify: Accurate and Honest Test-Time Scaling with RL-Trained Verifiers (https://arxiv.org/abs/2509.23152)
Zhicheng YANG
yangzhch6
AI & ML interests
reasoning with LLMs
Recent Activity
updated a model about 9 hours ago
yangzhch6/tool-verl-qwen1.5B-200step-0320
published a model about 9 hours ago
yangzhch6/tool-verl-qwen1.5B-200step-0320
updated a model 6 days ago
yangzhch6/maxrl-qwen3-4b-base-dapo-bs128-n16-stepp400
Organizations
None yet
Models 33
yangzhch6/tool-verl-qwen1.5B-200step-0320
2B • Updated
yangzhch6/maxrl-qwen3-4b-base-dapo-bs128-n16-stepp400
4B • Updated • 14
yangzhch6/Qwen2.5-Math-7B-Think32k
Text Generation • 8B • Updated • 15
yangzhch6/Qwen2.5-Math-7B-Think32k-Openr1ColdStart46k-Syn
333k • Updated • 13
yangzhch6/Qwen2.5-Math-7B-Think32k-Openr1ColdStart46k
333k • Updated • 11
yangzhch6/Qwen2.5-Math-7B-16k-Think-Synthesizer
8B • Updated
yangzhch6/cuda-12.8-tar
Updated
yangzhch6/cuda-12.8
Updated
yangzhch6/Mirror-Verifier-1.5B
2B • Updated
yangzhch6/Mirror-Verifier-7B
8B • Updated
Datasets 16
yangzhch6/Accordion-Thinking-Synthetic-Data
Viewer • Updated • 14.7k • 12
yangzhch6/DeepInformal-DeepTheorem-Synthetic
Viewer • Updated • 404k • 25 • 1
yangzhch6/DeepInformal-Openr1-Math-46K-Synthetic
Viewer • Updated • 165k • 19
yangzhch6/compare-openr1
Viewer • Updated • 45.8k • 8
yangzhch6/Align-Openr1-Math-46k
Viewer • Updated • 45.8k • 10
yangzhch6/DeepInformal-test
Viewer • Updated • 405 • 14
yangzhch6/DeepInformal-Putnam-1995-2024
Viewer • Updated • 356 • 7 • 1
yangzhch6/DeepInformal-DeepTheorem-DeepSeek-84k
Viewer • Updated • 84.1k • 21
yangzhch6/Putnam-Informal-1995-2024
Viewer • Updated • 360 • 11 • 1
yangzhch6/cuda-12.8-tar
Updated • 5