SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
Jiaming Zhang · Shengming Cao · Rui Li · Xiaotong Zhao · Yutao Cui · Xinglin Hou · Gangshan Wu · Haolan Chen · Yu Xu · Limin Wang · Kai Ma
Multimedia Computing Group, Nanjing University | Platform and Content Group (PCG), Tencent
This repository hosts the checkpoint for the paper "SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation". SteadyDancer is a strong animation framework built on the Image-to-Video paradigm that ensures robust first-frame preservation. Prior Reference-to-Video approaches often suffer from identity drift caused by the spatio-temporal misalignments common in real-world applications. In contrast, SteadyDancer generates high-fidelity, temporally coherent human animations, outperforming existing methods in visual quality and control while requiring significantly fewer training resources.
Notice
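- This is a diffusers GGUF, not a ComfyUI GGUF. The model was quantized from the diffusers pipeline, so to load it natively in ComfyUI you need to either requantize it with city96's quantization method or rename the keys at load time to match ComfyUI's model structure.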
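The key-renaming option mentioned above can be sketched roughly as follows. This is only a minimal illustration of substring-based key renaming on a plain dictionary. The helper `rename_keys`, the rule list, and the key names are all hypothetical placeholders; the actual mapping between the diffusers layout and ComfyUI's layout is not given here and would have to be worked out from the two model definitions.

```python
from typing import Dict, List, Tuple, TypeVar

T = TypeVar("T")


def rename_keys(state_dict: Dict[str, T], rules: List[Tuple[str, str]]) -> Dict[str, T]:
    """Return a new dict with every (old, new) substring rule applied to each key, in order."""
    renamed: Dict[str, T] = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in rules:
            new_key = new_key.replace(old, new)
        # Two source keys mapping to the same target would silently drop a tensor.
        if new_key in renamed:
            raise ValueError(f"key collision after renaming: {new_key}")
        renamed[new_key] = value
    return renamed


# Hypothetical rules and keys, for illustration only (not the real ComfyUI mapping).
EXAMPLE_RULES = [("old_block.", "new_block."), (".old_proj", ".new_proj")]
example = {"old_block.0.old_proj.weight": 1, "norm.weight": 2}
converted = rename_keys(example, EXAMPLE_RULES)
```

Keys that match no rule pass through unchanged, so a partial rule list can be extended incrementally while comparing against the key names the target loader expects.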
Pipeline
import torch
from diffusers import GGUFQuantizationConfig, WanTransformer3DModel, WanVideoToVideoPipeline
from diffusers.models import AutoencoderKLWan
from transformers import UMT5EncoderModel

gguf_path = "https://huggingface.co/smthem/SteadyDancer-14B-gguf/blob/main/SteadyDancer-14B-Q8_0.gguf"
model_id = "Wan-AI/Wan2.1-I2V-14B-720P-Diffusers"

# Load the GGUF-quantized transformer; computation runs in bfloat16.
transformer = WanTransformer3DModel.from_single_file(
    gguf_path,
    config=model_id,
    subfolder="transformer",  # assumes the standard Diffusers repo layout
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
# The VAE and text encoder come from the base Diffusers repo (subfolder names assumed from its layout).
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.bfloat16)
text_encoder = UMT5EncoderModel.from_pretrained(model_id, subfolder="text_encoder", torch_dtype=torch.bfloat16)
pipe = WanVideoToVideoPipeline.from_pretrained(
    model_id, vae=vae, transformer=transformer, text_encoder=text_encoder, torch_dtype=torch.bfloat16
)
# run infer
...
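If you would rather download the GGUF file once and load it from a local path, the Hub URL used above can be split into its repository, revision, and filename. The sketch below uses only the standard library. The helper name `split_hf_file_url` is hypothetical, and it assumes the revision is a single path segment such as `main` (branch names containing `/` would need different handling).

```python
from typing import Tuple
from urllib.parse import urlparse


def split_hf_file_url(url: str) -> Tuple[str, str, str]:
    """Split a Hub file URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/<blob|resolve>/<revision>/<path/to/file>
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a recognised Hub file URL: {url}")
    repo_id = "/".join(parts[:2])
    revision = parts[3]
    filename = "/".join(parts[4:])
    return repo_id, revision, filename


repo_id, revision, filename = split_hf_file_url(
    "https://huggingface.co/smthem/SteadyDancer-14B-gguf/blob/main/SteadyDancer-14B-Q8_0.gguf"
)
```

The three values can then be passed to `huggingface_hub.hf_hub_download(repo_id=..., filename=..., revision=...)`, and the local path it returns can be given to `from_single_file` in place of the URL.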
📚 Citation
If you find our paper or this codebase useful for your research, please cite us.
@misc{zhang2025steadydancer,
title={SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation},
author={Jiaming Zhang and Shengming Cao and Rui Li and Xiaotong Zhao and Yutao Cui and Xinglin Hou and Gangshan Wu and Haolan Chen and Yu Xu and Limin Wang and Kai Ma},
year={2025},
eprint={2511.19320},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2511.19320},
}
Base model: Wan-AI/Wan2.1-I2V-14B-480P