Paul S PRO
SuperPauly
AI & ML interests
None yet
Recent Activity
liked a model about 7 hours ago
datalab-to/chandra-ocr-2
liked a model about 16 hours ago
prism-ml/Ternary-Bonsai-8B-mlx-2bit
liked a model 3 days ago
OpenMOSS-Team/MOSS-TTS-Nano-100M
Organizations
None yet
Evaluation Methods & Metrics
- RubricBench: Aligning Model-Generated Rubrics with Human Standards
  Paper • 2603.01562 • Published • 63
- T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
  Paper • 2603.03790 • Published • 121
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
  Paper • 2505.20411 • Published • 96
- SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale
  Paper • 2602.23866 • Published • 88
Py
Demixing Models & Datasets
- Moisesdb: A dataset for source separation beyond 4-stems
  Paper • 2307.15913 • Published
- Music Source Separation with Band-Split RoPE Transformer
  Paper • 2309.02612 • Published • 1
- Hybrid Transformers for Music Source Separation
  Paper • 2211.08553 • Published • 1
- nvidia/RE-USE
  Audio-to-Audio • Updated • 6.88k • 61
Agent Loops, Character, Work Ethics & Behavior
- Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing
  Paper • 2512.23611 • Published • 6
- Context as a Tool: Context Management for Long-Horizon SWE-Agents
  Paper • 2512.22087 • Published • 3
- AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
  Paper • 2508.16279 • Published • 61
- Very Large-Scale Multi-Agent Simulation in AgentScope
  Paper • 2407.17789 • Published • 41
Sample Upscaling & Denoising.