RUNE XX WORKFLOWS are the most amazing thing that has happened to me - check out this video, guys!
RUNE, THANKS TO YOU, I managed to pull off one very difficult experiment:
🎵 Mr. Sawed-Off Leather Face (LiquidMind Mix) | LTX 2.3 AI Music Video & Dual Lip Sync
This project is a massive technical experiment pushing the boundaries of local AI video generation. "Mr. Sawed-Off Leather Face (LiquidMind Mix)" features a seamless 2:30 dual-vocal track, and we successfully achieved accurate phonetic lip-sync for both singers using a single, uncut audio input.
If you are an AI filmmaker or ComfyUI user, you know how hard it is to get an AI video model to switch speakers mid-clip. Here is a deep dive into how this was made!
🛠️ The Dual-Vocal Technical Flex
Most AI music videos separate male and female vocals into different audio tracks to render lip-sync accurately. For this experiment, the entire 2:30 audio track was kept whole. We engineered the prompts in LTX 2.3 to force the model to recognize the mid-clip vocal hand-offs between Anthony and Angie. The AI had to seamlessly transition the active singing animation from one character to the other while keeping the listening character's micro-expressions completely natural.
🧠 Prompt Engineering Secrets: The "Anchor + Action Sequence"
To maintain strict temporal consistency and facial coherence across both subjects, we utilized a highly detailed narrative prompting method:
The Anchor: Every single prompt started with an identical, rock-solid base description of the environment, lighting, and exact clothing/positions of the characters.
The Action Sequence: We injected the specific lyrics directly into the prompt alongside a chronological cue (e.g., "As the audio transitions, the man takes over the vocals..."). This gave LTX a precise roadmap to keep the living room stable while shifting the singing action.
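The "Anchor + Action Sequence" idea can be sketched in a few lines of code. This is purely illustrative: the anchor text, function names, and cue wording below are made-up placeholders, not the actual prompts used in the video.

```python
# Illustrative sketch of "Anchor + Action Sequence" prompting: every
# scene prompt repeats the exact same anchor (environment, lighting,
# character positions) and only the action cue and lyrics change.
ANCHOR = (
    "A cozy living room, warm tungsten lighting, a man in a dark jacket "
    "seated on the left and a woman in a red dress seated on the right."
)

def scene_prompt(action_cue: str, lyrics: str) -> str:
    # The identical anchor keeps the scene stable across generations;
    # only the action sequence shifts which character is singing.
    return f'{ANCHOR} {action_cue} Lyrics sung in this segment: "{lyrics}"'
```

For example, `scene_prompt("As the audio transitions, the man takes over the vocals while the woman listens with natural micro-expressions.", "...")` would give LTX the stable roadmap described above.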
💻 Hardware Constraints & Scene Workflow (Why the Cuts?)
You might notice the scenes cut every 20 seconds. Why not render the whole 2:30 continuous shot?
This video was generated entirely locally on an NVIDIA RTX 5060 Ti 16GB. To avoid frustrating out-of-memory (OOM) errors during VRAM-heavy rendering in ComfyUI, the maximum safe generation length is about 25 seconds. To ensure zero artifacting and play it safe, the lyrics were broken down into strict 18-to-21-second chunks (scenes). It is all about smart resource management on mid-range local hardware!
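The chunking logic described above is easy to compute. A minimal sketch, assuming even chunks under a hard cap (the function name and the 21-second cap below are illustrative, not taken from the actual workflow):

```python
import math

# Hypothetical helper: split a track into evenly sized scene chunks,
# each no longer than max_chunk seconds, so every render stays well
# inside the ~25 s safe VRAM limit mentioned above.
def plan_scenes(total_seconds: float, max_chunk: float = 21.0) -> list[tuple[float, float]]:
    """Return (start, end) times covering the track in <= max_chunk pieces."""
    n_chunks = math.ceil(total_seconds / max_chunk)
    chunk_len = total_seconds / n_chunks  # even chunks, all under the cap
    return [(round(i * chunk_len, 2), round((i + 1) * chunk_len, 2))
            for i in range(n_chunks)]
```

For a 2:30 (150 s) track with a 21 s cap, this yields eight 18.75 s scenes, right in the 18-21 s range the video uses.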
🎵 Music Origins & Copyright
The lyrics for "Mr. Sawed-Off Leather Face" are 100% original, written and generated by Anthony two years ago using Suno AI. This is completely copyright-free and an original LiquidMind Arts creation.
⚙️ Tools Used:
Video Generation: LTX-Video 2.3 (LTX 2.3)
Environment: ComfyUI (Local Workflow from RUNE XX)
Music Generation: Suno
Starring: Anthony & Angie
That looks great ;-) nice video .. well done ;-)
And I recognize the channel, I've seen some of your videos in the past as well ;-)
And happy to hear you got some nice stuff out of it
Hey man, haha that's great!! And BTW, here is my TANGO result from yesterday!!!!!
In the video, something happens with my eyes, not sure how to fix it, and her legs pass through each other on one movement, lol.......
Try the SD pose. SD pose uses a less densely dotted pattern for the eyes, so that might work better.
And you can also try turning off face detection (in both DW pose and SD pose).
I saw this a few times too, but turning off the face pose should give the model the freedom it might need ;-)
Should I enable DEPTH on the BLEND pose workflow, and/or enable DEPTH on the DW pose workflow?
You don't need the depth. If added, it will also grab some of the structure of the input video, which may or may not be what you want (for example, lifting up a ball, some background mountains, or clothing structure that you want to influence the video with in addition to the pose).
So it's more of a creative choice. Pose only is the "safest" though ;-)
Got it...... and the POSE STRENGTH, when and why should I change it? Not the LoRA strength, I mean the POSE strength......
You can always try lower, for example 0.7.
It will give the model a bit more freedom and might produce a more natural-looking result.
Hey man, even with face disabled, my face changes a lot in the end result.... looks weird. Look at my image and the video.... eyes and teeth look funny. Maybe the LoRA is too high? It's at 0.7......
PROMPT: "A man with short dark hair with some gray, well-groomed beard and mustache, wearing black-rimmed glasses, blue and orange varsity-style jacket with stripes over black t-shirt, blue pants, dancing smooth Amapiano groove to the rhythm, sensual body movements, relaxed hip sways, shoulder rolls, precise hand gestures pointing and flowing with the beat, confident expression with slight smile, looking towards camera, natural weight shifts and footwork, fluid rhythmic motion, medium portrait shot from waist up showing full upper body and leg movements, realistic skin texture, sharp details, cinematic lighting, dance studio background"
You got moves ;-) hehe
Looks a bit stretched. What width/height did you set?
I'll try with your input image and check whether the auto-adjust to x32 (which LTX demands for the masking) is set to stretch instead of crop somewhere.
(Unless you deliberately set the image resize to stretch, it looks far more natural with crop ;-)
I set it all at default, as you made it..... Also, a question: when loading a global LoRA like the IC detailer on a workflow with 2 passes, does it load on both pass 1 and pass 2, or just pass 1?
The IC detailer LoRA is really for restoring old videos, if I understood it correctly.
It was made for the workflow LTX had for that purpose: take a grainy, low-quality original video and make it look better with their purpose-built looping sampler.
Not sure it's a good idea to use it outside of that, but I never tried ;-)
That being said, yes, the LoRA runs on both passes. So if you want a LoRA in just one of the samplers, you can add a Lora Loader node and connect it between the main models and the Guider node in the 1st sampler (guessing that's where you'd want it if you don't want it in both). The 2nd pass is just 3-4 steps, so I don't think it matters much.
As for the face (identity), try different seeds as well. LTX isn't really the best at keeping identity consistent.
And if you know the character (yourself, a celebrity, etc.), it's easier to spot. If it's just a random AI-generated image of someone unknown, the small changes can often go unnoticed.
Since you make many videos of yourself and your wife/girlfriend/friend, it might be worth making a character LoRA for each.
It's easier than it sounds, with the Ostris AI Toolkit.
You can train either with video clips: https://www.youtube.com/watch?v=JQIl8DFTL1M
or using just images: https://www.youtube.com/watch?v=oJdT5dzrNEY
(Despite the long "boring" videos, there's not that much you'd need to "learn". It's quite straightforward.)
Hey man, thanks for the links, but for now I'm more concerned about something... in the DW pose workflow, and I guess also in the SD pose one, if I choose a perfect 9:16 resolution like 864x1536, the result is 768x1536... and if I do 720x1280, the result is 640x1280...
How can I avoid this and make it output the resolution I'm actually using? This is why you said my video looked too long or too tall.... it's not me, it's the "divisible by..." thing, and I can't find it in the DW POSE workflow....... How do I fix this, buddy?
Yes, the LTX model only accepts resolutions where height and width are divisible by 32.
So there are a few nodes resizing and then making sure the result is divisible by 32, plus the DW pose accepting 512px and SD pose accepting 1024px... so it gets a bit complicated to avoid x32 problems. (For regular workflows it's no issue, LTX auto-adjusts, but the mask just fails if it's not divisible by 32.)
I'll check if there are tweaks to make the final output closer to the input size, perhaps by using fewer resize nodes and only resizing at the very end, before handing the mask to LTX.
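Just to illustrate the x32 constraint itself, here is a standalone sketch (the function names are made up, and this only covers the divisible-by-32 part, not the extra 512px/1024px pose-detector resizes the workflow also does):

```python
# Hypothetical helper: snap a target resolution down to the nearest
# dimensions LTX will accept for masking (height and width must both
# be divisible by 32, or the mask fails).
def snap_to_multiple(value: int, multiple: int = 32) -> int:
    """Round down to the nearest multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)

def ltx_safe_resolution(width: int, height: int) -> tuple[int, int]:
    return snap_to_multiple(width), snap_to_multiple(height)
```

So 720x1280 snaps to 704x1280 under this rule alone; the larger drops you are seeing (720 to 640) come from the additional pose-detector resize stages stacked on top.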
Hey man, on another topic, all your NEWEST FLF workflows are giving me this error, mate......
Any idea?
Hmmm, that's strange. Will check, perhaps something went wrong when I updated them not long ago.
I checked the first-last-frame workflows and they were all OK.
So something must have happened on your end.
Check under the second input-image resize node, that it's connected to a Set last frame node.
Hey man, sorry, but I downloaded the WF again, the one that says TRANSITION LORA..... and I still get this.....
And I saw your image here, and I don't see anything BROKEN. I did update my Comfy 3 days ago, but everything else works fine... older WFs are fine..... Can you just UPLOAD some of your newest WFs again so I can test them? Doesn't make sense....
Hmm, that is so strange. The Set lastframe node is right there in your screenshot, and yet it still says "no set node found".
Try updating KJNodes, since the Set and Get nodes had a lot of updates recently.
Ah wait, you said transition LoRA. I didn't double-check that one.. will do ;-)
(But it's probably wise to update KJNodes regardless, because I do see the Set node in your screenshot while Comfy complains it's not there.)
Yes, I updated Comfy but not the nodes, lol. OK, will do!!! But also, in the working FLF WF: say I use this image as the last frame. It will actually show another person and add things to my surroundings, not the real New York image I have. I even had to use WAN FLF to make it workable.... Why is the FLF changing the faces so much, like to another person?
This one too: if I use this as the last frame, it shows me with another face... SIMILAR, but not me. Or it changes the shirt, adds buttons, etc.... smiling, open mouth, etc... even with the transition LoRA...
Here is my face in the image, and a video result, LOL, not me......
Using a GGUF Gemma model?
If so, turn off "Prompt Enhancer" at the bottom of the workflow.
No, I don't use GGUF, and Prompt Enhancer was off. Also, on another WF, the VHS video node gave errors when rendering the video. I DELETED Comfy and am using the latest working version from a month ago, lol, not gonna fight errors I didn't have before......
Updated workflows incoming ;-) They were missing an important part in the upscale section.. must have been sleepy when I updated these last time. More coffee, and fixed the obvious error ;-)
All the frames-to-frames workflows will get an update ASAP.
HELL YEAH!!! Let me know!!!!! Please, make it work without making me update Comfy again, since I'm using a rolled-back version, lol.
I see you uploaded one WF already, does that one also have custom audio?
In the process of updating the Frame workflows. Just double-checking that there aren't any errors ;-)
You probably need to update KJNodes.