Collections

Collections including paper arxiv:2508.01928

LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation

Paper • 2403.05246 • Published Mar 8, 2024
Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images

Paper • 2201.01266 • Published Jan 4, 2022 • 3
IAUNet: Instance-Aware U-Net

Paper • 2508.01928 • Published Aug 3 • 9
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart

Paper • 2407.01530 • Published Jul 1, 2024 • 1

(Former) Instance Segmentation

IAUNet: Instance-Aware U-Net

Paper • 2508.01928 • Published Aug 3 • 9
Mask2Former for Video Instance Segmentation

Paper • 2112.10764 • Published Dec 20, 2021 • 1
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation

Paper • 2206.02777 • Published Jun 6, 2022

LocalMamba: Visual State Space Model with Windowed Selective Scan

Paper • 2403.09338 • Published Mar 14, 2024 • 9
GiT: Towards Generalist Vision Transformer through Universal Language Interface

Paper • 2403.09394 • Published Mar 14, 2024 • 27
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29, 2024 • 35
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published May 16, 2024 • 30

Papers Pertinent or Protuberant

The Cow of Rembrandt - Analyzing Artistic Prompt Interpretation in Text-to-Image Models

Paper • 2507.23313 • Published Jul 31 • 1
SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering

Paper • 2508.03448 • Published Aug 5 • 4
C3D-AD: Toward Continual 3D Anomaly Detection via Kernel Attention with Learnable Advisor

Paper • 2508.01311 • Published Aug 2 • 2
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Model

Paper • 2505.21179 • Published May 27 • 13

AgroBench: Vision-Language Model Benchmark in Agriculture

Paper • 2507.20519 • Published Jul 28 • 7
WisWheat: A Three-Tiered Vision-Language Dataset for Wheat Management

Paper • 2506.06084 • Published Jun 6
AnimalClue: Recognizing Animals by their Traces

Paper • 2507.20240 • Published Jul 27 • 9
Foundations of Large Language Models

Paper • 2501.09223 • Published Jan 16 • 13
