Towards Pixel-Level VLM Perception via Simple Points Prediction • Paper • arXiv:2601.19228 • Published 3 days ago • 14 upvotes
Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models • Paper • arXiv:2601.19834 • Published 3 days ago • 24 upvotes
Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision • Paper • arXiv:2601.19798 • Published 3 days ago • 37 upvotes
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning • Paper • arXiv:2601.09708 • Published 16 days ago • 51 upvotes
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning • Paper • arXiv:2512.20605 • Published Dec 23, 2025 • 62 upvotes
Unified Reinforcement and Imitation Learning for Vision-Language Models • Paper • arXiv:2510.19307 • Published Oct 22, 2025 • 31 upvotes
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search • Paper • arXiv:2509.07969 • Published Sep 9, 2025 • 59 upvotes
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning • Paper • arXiv:2507.12841 • Published Jul 17, 2025 • 42 upvotes
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning • Paper • arXiv:2507.14137 • Published Jul 18, 2025 • 35 upvotes
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning • Paper • arXiv:2507.16815 • Published Jul 22, 2025 • 41 upvotes
Pixels, Patterns, but No Poetry: To See The World like Humans • Paper • arXiv:2507.16863 • Published Jul 21, 2025 • 69 upvotes
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models • Paper • arXiv:2507.07104 • Published Jul 9, 2025 • 46 upvotes
google/siglip-so400m-patch14-384 • Model, Zero-Shot Image Classification • 0.9B params • Updated Sep 26, 2024 • 1.57M downloads • 644 likes (see the usage sketch after this list)
UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation • Paper • arXiv:2506.17202 • Published Jun 20, 2025 • 10 upvotes
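The SigLIP checkpoint in the list above is tagged for zero-shot image classification. As a minimal sketch (not from the listing itself), it can be loaded through the Hugging Face transformers pipeline API; the image URL and candidate labels below are illustrative placeholders.

```python
# Minimal sketch: zero-shot image classification with the SigLIP checkpoint
# from the listing above. Assumes transformers, torch, and Pillow are installed;
# the image URL and candidate labels are illustrative placeholders.
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-image-classification",
    model="google/siglip-so400m-patch14-384",
)

# The pipeline accepts an image URL, a local file path, or a PIL.Image instance.
results = classifier(
    "http://images.cocodataset.org/val2017/000000039769.jpg",
    candidate_labels=["a photo of two cats", "a photo of a dog", "a photo of a car"],
)

# One {"label": ..., "score": ...} dict per candidate label.
print(results)
```

Note that SigLIP scores each candidate label independently with a sigmoid rather than a softmax, so the returned scores need not sum to one.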