Instructions to use doge1516/MS-Diffusion with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Diffusers
How to use doge1516/MS-Diffusion with Diffusers:
```
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained(
    "doge1516/MS-Diffusion",
    dtype=torch.bfloat16,
    device_map="cuda",
)

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
```
- Notebooks
  - Google Colab
  - Kaggle
- Local Apps
  - Draw Things
  - DiffusionBee
Possibility of replacing base pretrained models for inference
Hello!
I was reading the documentation for this model.
Under the hood, it uses https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0 and https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k.
I was wondering: is it possible to replace them with smaller models during inference? For example, https://huggingface.co/segmind/Segmind-Vega and https://huggingface.co/openai/clip-vit-large-patch14.
MS-Diffusion's trainable adapters are built on SDXL and CLIP-G: they transform the CLIP image features into SDXL cross-attention tokens. A distilled SDXL can be used as long as it keeps the same cross-attention layers. However, since CLIP-L and CLIP-G output image features of different shapes, CLIP-G cannot be replaced by CLIP-L.
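One quick way to see why the swap can work on the UNet side but not on the encoder side is to compare the widths reported by the public model configs. The sketch below is hypothetical and not part of MS-Diffusion's API; which tensors the adapter actually consumes depends on its implementation, but the shape mismatch between CLIP-G and CLIP-L shows up either way:

```python
# Rough compatibility check (not MS-Diffusion code): compare the feature
# widths the adapters were trained against with the proposed replacements.
from transformers import AutoConfig
from diffusers import UNet2DConditionModel

# Image-encoder side: CLIP-G and CLIP-L report different feature widths,
# so the adapter's input layers cannot accept CLIP-L features as-is.
clip_g = AutoConfig.from_pretrained("laion/CLIP-ViT-bigG-14-laion2B-39B-b160k")
clip_l = AutoConfig.from_pretrained("openai/clip-vit-large-patch14")
print(clip_g.projection_dim, clip_g.vision_config.hidden_size)  # 1280 1664
print(clip_l.projection_dim, clip_l.vision_config.hidden_size)  # 768 1024

# UNet side: a distilled SDXL is a candidate drop-in only if it keeps
# SDXL-base's cross-attention width (2048).
sdxl = UNet2DConditionModel.load_config(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="unet"
)
vega = UNet2DConditionModel.load_config("segmind/Segmind-Vega", subfolder="unet")
print(sdxl["cross_attention_dim"], vega["cross_attention_dim"])
```

If the cross-attention widths match, the remaining question is whether the distilled UNet still exposes attention layers at the places the adapter hooks into; a distillation that removes blocks can change that even when the widths agree.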