shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-swebench-k5-opus-distill-32k-lr5e6-multiturn Text Generation • 1B • Updated 2 days ago • 233
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-k5-opus-distill-32k-lr5e6-multiturn Text Generation • 1B • Updated 13 days ago • 17
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-instructions-k10-opus-distill-32k-lr5e6-multiturn Updated 16 days ago
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-instructions-k10-opus-distill-32k-lr5e6-flattened Text Generation • 1B • Updated 17 days ago • 18
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn Text Generation • 1B • Updated 19 days ago • 492
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-flattened Text Generation • 1B • Updated 20 days ago • 400
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think Text Generation • 1B • Updated Mar 28 • 23
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean Text Generation • 1B • Updated Mar 27 • 47
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_rejection-sample_think Text Generation • 1B • Updated Mar 27 • 22