Entropy Sentinel: Continuous LLM Accuracy Monitoring from Decoding Entropy Traces in STEM • Paper • 2601.09001 • Published 9 days ago • 16
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation • Paper • 2601.08430 • Published 10 days ago • 53
Confidence Estimation for LLMs in Multi-turn Interactions • Paper • 2601.02179 • Published 18 days ago • 15
Diversity or Precision? A Deep Dive into Next Token Prediction • Paper • 2512.22955 • Published 26 days ago • 8
Nested Learning: The Illusion of Deep Learning Architectures • Paper • 2512.24695 • Published 23 days ago • 38
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents • Paper • 2512.23343 • Published 25 days ago • 28
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models • Paper • 2512.19995 • Published Dec 23, 2025 • 16
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience • Paper • 2512.17260 • Published Dec 19, 2025 • 50
Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives • Paper • 2512.12620 • Published Dec 14, 2025 • 3
State over Tokens: Characterizing the Role of Reasoning Tokens • Paper • 2512.12777 • Published Dec 14, 2025 • 5
Rethinking Expert Trajectory Utilization in LLM Post-training • Paper • 2512.11470 • Published Dec 12, 2025 • 8
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published Dec 15, 2025 • 107
Interpretable Embeddings with Sparse Autoencoders: A Data Analysis Toolkit • Paper • 2512.10092 • Published Dec 10, 2025 • 3
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models • Paper • 2512.07783 • Published Dec 8, 2025 • 38
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward • Paper • 2511.03108 • Published Nov 5, 2025 • 4
From Proof to Program: Characterizing Tool-Induced Reasoning Hallucinations in Large Language Models • Paper • 2511.10899 • Published Nov 14, 2025 • 4
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning • Paper • 2511.06805 • Published Nov 10, 2025 • 13
Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads • Paper • 2511.06209 • Published Nov 9, 2025 • 19
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence • Paper • 2511.07384 • Published Nov 10, 2025 • 18