SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Paper • 2505.23932 • Published
Fundamental AI Methods; Perception & World Modeling; Reasoning & Generation; Action & Interaction
SIN-Bench: Tracing Native Evidence Chains in Long-Context Multimodal Scientific Interleaved Literature
ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection