Judge Arena: Benchmarking LLMs as Evaluators

Judge Arena benchmarks LLMs as evaluators. Voting has been sunset, and the leaderboard below reflects the final community rankings.