Benchmarking LLMs as evaluators. Voting has been sunset; the leaderboard below reflects the final community rankings.