Peng-Tao Jiang

ptjiang

https://pengtaojiang.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

AlignBench: Benchmarking Fine-Grained Image-Text Alignment with Synthetic Image-Caption Pairs

upvoted a paper 5 days ago

ViDiC: Video Difference Captioning

upvoted a paper 5 days ago

OneThinker: All-in-one Reasoning Model for Image and Video

Organizations

None yet

upvoted 6 papers 5 days ago

AlignBench: Benchmarking Fine-Grained Image-Text Alignment with Synthetic Image-Caption Pairs

Paper • 2511.20515 • Published 13 days ago • 3

ViDiC: Video Difference Captioning

Paper • 2512.03405 • Published 6 days ago • 25

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published 6 days ago • 29

Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Paper • 2512.03746 • Published 6 days ago • 15

Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Paper • 2512.03534 • Published 6 days ago • 18

CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation

Paper • 2512.03540 • Published 6 days ago • 11

upvoted 2 papers 8 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published 11 days ago • 164

REASONEDIT: Towards Reasoning-Enhanced Image Editing Models

Paper • 2511.22625 • Published 11 days ago • 45

authored 10 papers 12 days ago

Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections

Paper • 2304.08706 • Published Apr 18, 2023

Towards Natural Image Matting in the Wild via Real-Scenario Prior

Paper • 2410.06593 • Published Oct 9, 2024 • 4

ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution

Paper • 2410.13807 • Published Oct 17, 2024

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity

Paper • 2410.10105 • Published Oct 14, 2024 • 3

Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning

Paper • 2505.12370 • Published May 18

MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on

Paper • 2505.21325 • Published May 27 • 4

HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions

Paper • 2505.22977 • Published May 29 • 1

A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

Paper • 2508.01548 • Published Aug 3 • 13

VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models

Paper • 2511.11007 • Published 25 days ago • 15

MagicWorld: Interactive Geometry-driven Video World Exploration

Paper • 2511.18886 • Published 15 days ago • 17

upvoted 2 papers 12 days ago

One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

Paper • 2511.18922 • Published 15 days ago • 10

VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models

Paper • 2511.11007 • Published 25 days ago • 15
