Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech Article • ServiceNow-AI • 4 days ago • 42
EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios Article • ServiceNow-AI • 9 days ago • 39
EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents Paper • 2605.13841 • Published May 13 • 75
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 149