Open to Collab

Mike Ravkine PRO

mike-ravkine

the-crypt-keeper

AI & ML interests

LLM Research / Development / Evaluation

Recent Activity

liked a model 1 day ago

google/gemma-4-31B-it

Image-Text-to-Text • 33B • Updated 2 days ago • 287k • 758

liked a model 2 days ago

Jackrong/Qwopus3.5-9B-v3

Image-Text-to-Text • 10B • Updated 2 days ago • 2.8k • 58

upvoted an article 4 days ago

Article

How I contributed a new model to the Transformers library using Codex

5 days ago

liked a model 8 days ago

nvidia/gpt-oss-puzzle-88B

Text Generation • 91B • Updated 9 days ago • 16.7k • 90

upvoted a paper 9 days ago

Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck

Paper • 2603.08462 • Published 26 days ago • 21

liked a model 11 days ago

Nanbeige/Nanbeige4-3B-Thinking-2511

Text Generation • 4B • Updated Dec 17, 2025 • 2.21k • 205

liked a model 16 days ago

Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2

Image-Text-to-Text • 10B • Updated 12 days ago • 41.9k • 149

liked a model 18 days ago

mistralai/Mistral-Small-4-119B-2603

119B • Updated 10 days ago • 68.4k • 345

liked a model 23 days ago

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-FP8

Text Generation • 124B • Updated 11 days ago • 1.07M • 216

liked a model 26 days ago

QuantTrio/Qwen3.5-397B-A17B-AWQ

Image-Text-to-Text • Updated Mar 2 • 13.2k • 8

liked a model 27 days ago

LLM360/K2-Think-V2

Text Generation • 73B • Updated Mar 2 • 2.49k • 28

liked a model 28 days ago

inclusionAI/Ring-flash-2.0

Text Generation • 103B • Updated Oct 23, 2025 • 97 • 101

posted an update about 1 month ago

Post

287

gpt-oss-120b has held on to the ReasonScape crown since its release on Aug 5, 2025. Seven months at the top in the LLM space is *impressive*.

With the release of Qwen-3.5, the king has been dethroned by not one but two models: the mid-dense Qwen/Qwen3.5-27B and the large-MoE Qwen/Qwen3.5-122B-A10B-FP8.

The old king is dead - long live the new king 👑

Note that these rankings are based on r12, the third iteration of the ReasonScape evaluation, which covers 27k prompts across 12 task domains. Compared to the previous m12x ranking, this evaluation fixes a slew of test bugs, refines the task set to add table-extraction, and lifts the context ceiling from 8k to 16k, so these rankings differ quite a bit from the previous m12x Leaderboard.