- LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation
  Paper • 2403.05246 • Published
- Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images
  Paper • 2201.01266 • Published • 3
- IAUNet: Instance-Aware U-Net
  Paper • 2508.01928 • Published • 9
- xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart
  Paper • 2407.01530 • Published • 1
Collections
Collections including paper arxiv:2508.01928
- LocalMamba: Visual State Space Model with Windowed Selective Scan
  Paper • 2403.09338 • Published • 9
- GiT: Towards Generalist Vision Transformer through Universal Language Interface
  Paper • 2403.09394 • Published • 27
- Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
  Paper • 2402.19479 • Published • 35
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 30

- The Cow of Rembrandt - Analyzing Artistic Prompt Interpretation in Text-to-Image Models
  Paper • 2507.23313 • Published • 1
- SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
  Paper • 2508.03448 • Published • 4
- C3D-AD: Toward Continual 3D Anomaly Detection via Kernel Attention with Learnable Advisor
  Paper • 2508.01311 • Published • 2
- Normalized Attention Guidance: Universal Negative Guidance for Diffusion Model
  Paper • 2505.21179 • Published • 13

- AgroBench: Vision-Language Model Benchmark in Agriculture
  Paper • 2507.20519 • Published • 7
- WisWheat: A Three-Tiered Vision-Language Dataset for Wheat Management
  Paper • 2506.06084 • Published
- AnimalClue: Recognizing Animals by their Traces
  Paper • 2507.20240 • Published • 9
- Foundations of Large Language Models
  Paper • 2501.09223 • Published • 13