Abakar Sylla's picture

20

Abakar Sylla

abakrsylla

AI & ML interests

LLM, zeroth order optimization

Organizations

None yet

upvoted 20 papers 6 months ago

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

Paper • 2507.12841 • Published Jul 17, 2025 • 42

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Paper • 2507.13332 • Published Jul 17, 2025 • 49

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17, 2025 • 59

π^3: Scalable Permutation-Equivariant Visual Geometry Learning

Paper • 2507.13347 • Published Jul 17, 2025 • 66

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Paper • 2507.13348 • Published Jul 17, 2025 • 79

The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

Paper • 2507.11097 • Published Jul 15, 2025 • 64

Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15, 2025 • 15

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published Jul 22, 2025 • 63

Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory

Paper • 2507.16713 • Published Jul 22, 2025 • 21

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published Jul 22, 2025 • 35

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

Paper • 2507.16815 • Published Jul 22, 2025 • 41

Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers

Paper • 2507.08422 • Published Jul 11, 2025 • 36

Step-Audio 2 Technical Report

Paper • 2507.16632 • Published Jul 22, 2025 • 73

TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

Paper • 2507.18192 • Published Jul 24, 2025 • 8

MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents

Paper • 2507.19478 • Published Jul 25, 2025 • 32

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

Paper • 2507.20880 • Published Jul 28, 2025 • 11

Met^2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

Paper • 2507.17189 • Published Jul 23, 2025 • 13

Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning

Paper • 2507.21049 • Published Jul 28, 2025 • 41

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26, 2025 • 158

ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Paper • 2507.20939 • Published Jul 28, 2025 • 57