SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Jiaming Zhang · Shengming Cao · Rui Li · Xiaotong Zhao · Yutao Cui
Xinglin Hou · Gangshan Wu · Haolan Chen · Yu Xu · Limin Wang · Kai Ma

Multimedia Computing Group, Nanjing University | Platform and Content Group (PCG), Tencent

This repository hosts the checkpoint for the paper "SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation". SteadyDancer is a human image animation framework built on the Image-to-Video paradigm, which ensures robust first-frame preservation. Prior Reference-to-Video approaches often suffer from identity drift caused by the spatio-temporal misalignments that are common in real-world applications. In contrast, SteadyDancer generates high-fidelity, temporally coherent human animations and outperforms existing methods in visual quality and control while requiring significantly fewer training resources.

Notice

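This is a Diffusers GGUF, not a ComfyUI GGUF. Note that this GGUF model was quantized from the Diffusers pipeline. To load it natively in ComfyUI, either use a GGUF produced with city96's quantization method, or rename the keys at load time to match ComfyUI's model structure.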
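The key-renaming route mentioned in the notice can be sketched as a plain dictionary transform. This is a minimal sketch, not ComfyUI's or Diffusers' own conversion code: the `rename_keys` helper and the entries in `EXAMPLE_PREFIX_MAP` are hypothetical placeholders, and the real mapping has to be worked out by comparing the key names in this checkpoint with the ones ComfyUI expects.

```python
from typing import Dict, Mapping, TypeVar

T = TypeVar("T")

# HYPOTHETICAL prefix pairs, for illustration only. These are not the real
# Diffusers or ComfyUI key names; build the actual mapping by comparing the
# two key lists.
EXAMPLE_PREFIX_MAP = {
    "diffusers_prefix.": "comfyui_prefix.",
}


def rename_keys(state_dict: Mapping[str, T], prefix_map: Mapping[str, str]) -> Dict[str, T]:
    """Return a copy of state_dict with matching key prefixes replaced."""
    renamed: Dict[str, T] = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break  # apply at most one rule per key
        if new_key in renamed:
            # Two source keys mapping to one target would silently drop a tensor.
            raise ValueError(f"key collision after renaming: {new_key}")
        renamed[new_key] = value
    return renamed
```

Keys that match no rule are copied unchanged, so a partial mapping can be refined step by step while checking which keys the loader still reports as missing.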

Pipeline

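import torch
from diffusers import GGUFQuantizationConfig, WanTransformer3DModel, WanVideoToVideoPipeline
from diffusers.models import AutoencoderKLWan
from transformers import UMT5EncoderModel

# GGUF checkpoint of the quantized SteadyDancer transformer (Q8_0)
gguf_path = "https://huggingface.co/smthem/SteadyDancer-14B-gguf/blob/main/SteadyDancer-14B-Q8_0.gguf"
# Diffusers-format base repository that supplies the configs, VAE, and text encoder
model_id = "Wan-AI/Wan2.1-I2V-14B-720P-Diffusers"

# Load the quantized transformer; bfloat16 is used as the compute dtype
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    config=model_id,
    subfolder="transformer",  # assumes the standard Diffusers repo layout
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

# Load the remaining components from their subfolders of the base repository
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.bfloat16)
text_encoder = UMT5EncoderModel.from_pretrained(model_id, subfolder="text_encoder", torch_dtype=torch.bfloat16)

# Assemble the pipeline around the quantized transformer
pipe = WanVideoToVideoPipeline.from_pretrained(
    model_id, vae=vae, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch.bfloat16
)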

# run infer
...
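The `gguf_path` above is a Hub web URL. If you would rather download the file once and pass a local path to `from_single_file`, the URL can be split into the arguments that `huggingface_hub.hf_hub_download` takes. The following is a minimal sketch: `split_hub_url` and `download_gguf` are hypothetical helper names, and the parser assumes a simple revision name such as `main` (revisions containing slashes are not handled).

```python
from typing import Tuple
from urllib.parse import urlparse


def split_hub_url(url: str) -> Tuple[str, str, str]:
    """Split a Hub file URL into (repo_id, revision, filename).

    Expected path layout: <user>/<repo>/<blob|resolve>/<revision>/<file path>
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a Hub file URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]  # assumption: the revision contains no slash
    filename = "/".join(parts[4:])
    return repo_id, revision, filename


def download_gguf(url: str) -> str:
    """Download the file into the local Hub cache and return its path."""
    # Imported lazily so the URL parsing above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    repo_id, revision, filename = split_hub_url(url)
    return hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
```

The returned local path can be used in place of the URL, for example `gguf_path = download_gguf(gguf_path)` before the `from_single_file` call.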

📚 Citation

If you find our paper or this codebase useful for your research, please cite us.

@misc{zhang2025steadydancer,
      title={SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation}, 
      author={Jiaming Zhang and Shengming Cao and Rui Li and Xiaotong Zhao and Yutao Cui and Xinglin Hou and Gangshan Wu and Haolan Chen and Yu Xu and Limin Wang and Kai Ma},
      year={2025},
      eprint={2511.19320},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.19320}, 
}


Format: GGUF (8-bit and 16-bit files)
Model size: 16B params
Architecture: wan

Model tree for smthem/SteadyDancer-14B-gguf

Base model: Wan-AI/Wan2.1-I2V-14B-480P
Quantized: this model (one of 3 quantized versions)
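For a rough idea of how much storage the 8-bit file needs, the listed parameter count can be combined with the Q8_0 block layout, which stores each group of 32 weights as 32 int8 values plus one 16-bit scale (34 bytes per block, or 8.5 bits per weight). This is a back-of-the-envelope sketch: `q8_0_bytes` is a hypothetical helper, and it assumes every parameter is stored as Q8_0, whereas real GGUF files usually keep some tensors (such as norms and biases) in higher precision and add metadata.

```python
def q8_0_bytes(n_params: int) -> int:
    """Estimate the bytes needed to store n_params weights in Q8_0 blocks."""
    block_size = 32            # weights per Q8_0 block
    bytes_per_block = 2 + 32   # one fp16 scale + 32 int8 quantized values
    n_blocks = -(-n_params // block_size)  # ceiling division
    return n_blocks * bytes_per_block


# 16B parameters, as listed in the model metadata
estimate = q8_0_bytes(16_000_000_000)
print(f"approx. {estimate / 1e9:.1f} GB")
```

Under these assumptions the estimate comes to about 17 GB of weights, before any activations, the VAE, or the text encoder are counted.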
