The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation
Paper • 2510.23393 • Published • 20
On Pretraining for Project-Level Code Completion
Diff-XYZ: A Benchmark for Evaluating Diff Understanding