Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2512.21218

Read But Not Implemented

about 21 hours ago

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published 17 days ago • 90
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 220
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published 16 days ago • 199
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published 23 days ago • 24

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published 10 days ago • 63

Representation & Optimization

Understanding about representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1, 2025 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31, 2025 • 1

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Paper • 2312.15715 • Published Dec 25, 2023 • 20
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 68
VideoPrism: A Foundational Visual Encoder for Video Understanding

Paper • 2402.13217 • Published Feb 20, 2024 • 38
Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 159

Multimodal Reasoning

InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

Paper • 2502.11573 • Published Feb 17, 2025 • 9
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking

Paper • 2502.02339 • Published Feb 4, 2025 • 22
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Paper • 2502.11775 • Published Feb 17, 2025 • 9
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 39

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 92
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30, 2025 • 3
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 95

Configuration error

39

EVF-SAM

👀

39
Runtime error

Featured

642

FLUX.1 [Inpainting]

🎨

642
Running

243

PolaroidVL 1.0 Demo

👁

243

a mini vision-language ai model
Paused

37

Good Upscaler

🐠

37

Enhance image resolution to 16 megapixels

Read But Not Implemented

about 21 hours ago

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published 17 days ago • 90
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 220
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published 16 days ago • 199
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published 23 days ago • 24

Multimodal Reasoning

InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

Paper • 2502.11573 • Published Feb 17, 2025 • 9
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking

Paper • 2502.02339 • Published Feb 4, 2025 • 22
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

Paper • 2502.11775 • Published Feb 17, 2025 • 9
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 39

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published 10 days ago • 63

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 92
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30, 2025 • 3
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 95

Representation & Optimization

Understanding about representation sheds light on optimization

Nuclear Norm Regularization for Deep Learning

Paper • 2405.14544 • Published May 23, 2024 • 1
Token embeddings violate the manifold hypothesis

Paper • 2504.01002 • Published Apr 1, 2025 • 1
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers

Paper • 2403.10476 • Published Mar 15, 2024 • 1
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Paper • 2504.00254 • Published Mar 31, 2025 • 1

Configuration error

39

EVF-SAM

👀

39
Runtime error

Featured

642

FLUX.1 [Inpainting]

🎨

642
Running

243

PolaroidVL 1.0 Demo

👁

243

a mini vision-language ai model
Paused

37

Good Upscaler

🐠

37

Enhance image resolution to 16 megapixels

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

Paper • 2312.15715 • Published Dec 25, 2023 • 20
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 68
VideoPrism: A Foundational Visual Encoder for Video Understanding

Paper • 2402.13217 • Published Feb 20, 2024 • 38
Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10, 2025 • 159

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs