RUNE XX WORKFLOWS are the most amazing thing that has happened to me - check out this video, guys!
RUNE, THANKS TO YOU, I managed to pull off one very difficult experiment:
🎵 Mr. Sawed-Off Leather Face (LiquidMind Mix) | LTX 2.3 AI Music Video & Dual Lip Sync
This project is a massive technical experiment pushing the boundaries of local AI video generation. "Mr. Sawed-Off Leather Face (LiquidMind Mix)" features a seamless 2:30 dual-vocal track, and we successfully achieved accurate phonetic lip-sync for both singers using a single, uncut audio input.
If you are an AI filmmaker or ComfyUI user, you know how hard it is to get an AI video model to switch speakers mid-clip. Here is a deep dive into how this was made!
🛠️ The Dual-Vocal Technical Flex
Most AI music videos separate male and female vocals into different audio tracks to render lip-sync accurately. For this experiment, the entire 2:30 audio track was kept whole. We engineered the prompts in LTX 2.3 to force the model to recognize the mid-clip vocal hand-offs between Anthony and Angie. The AI had to seamlessly transition the active singing animation from one character to the other while keeping the listening character's micro-expressions completely natural.
🧠 Prompt Engineering Secrets: The "Anchor + Action Sequence"
To maintain strict temporal consistency and facial coherence across both subjects, we utilized a highly detailed narrative prompting method:
The Anchor: Every single prompt started with an identical, rock-solid base description of the environment, lighting, and exact clothing/positions of the characters.
The Action Sequence: We injected the specific lyrics directly into the prompt alongside a chronological cue (e.g., "As the audio transitions, the man takes over the vocals..."). This gave LTX a precise roadmap to keep the living room stable while shifting the singing action.
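The "Anchor + Action Sequence" idea can be sketched in a few lines of code. This is purely illustrative: the anchor text, function names, and cue wording below are made-up placeholders, not the actual prompts used in the video.

```python
# Illustrative sketch of "Anchor + Action Sequence" prompting: every
# scene prompt repeats the exact same anchor (environment, lighting,
# character positions) and only the action cue and lyrics change.
ANCHOR = (
    "A cozy living room, warm tungsten lighting, a man in a dark jacket "
    "seated on the left and a woman in a red dress seated on the right."
)

def scene_prompt(action_cue: str, lyrics: str) -> str:
    # The identical anchor keeps the scene stable across generations;
    # only the action sequence shifts which character is singing.
    return f'{ANCHOR} {action_cue} Lyrics sung in this segment: "{lyrics}"'
```

For example, `scene_prompt("As the audio transitions, the man takes over the vocals while the woman listens with natural micro-expressions.", "...")` would give LTX the stable roadmap described above.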
💻 Hardware Constraints & Scene Workflow (Why the Cuts?)
You might notice the scenes cut every 20 seconds. Why not render the whole 2:30 continuous shot?
This video was generated entirely locally on an NVIDIA RTX 5060 Ti 16GB. To avoid frustrating out-of-memory (OOM) errors during VRAM-heavy rendering in ComfyUI, the maximum safe generation length is about 25 seconds. To ensure zero artifacting and play it safe, the lyrics were broken down into strict 18-to-21-second chunks (scenes). It is all about smart resource management on mid-range local hardware!
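The chunking logic described above is easy to compute. A minimal sketch, assuming even chunks under a hard cap (the function name and the 21-second cap below are illustrative, not taken from the actual workflow):

```python
import math

# Hypothetical helper: split a track into evenly sized scene chunks,
# each no longer than max_chunk seconds, so every render stays well
# inside the ~25 s safe VRAM limit mentioned above.
def plan_scenes(total_seconds: float, max_chunk: float = 21.0) -> list[tuple[float, float]]:
    """Return (start, end) times covering the track in <= max_chunk pieces."""
    n_chunks = math.ceil(total_seconds / max_chunk)
    chunk_len = total_seconds / n_chunks  # even chunks, all under the cap
    return [(round(i * chunk_len, 2), round((i + 1) * chunk_len, 2))
            for i in range(n_chunks)]
```

For a 2:30 (150 s) track with a 21 s cap, this yields eight 18.75 s scenes, right in the 18-21 s range the video uses.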
🎵 Music Origins & Copyright
The lyrics for "Mr. Sawed-Off Leather Face" are 100% original, written and generated by Anthony two years ago using Suno AI. This is completely copyright-free and an original LiquidMind Arts creation.
⚙️ Tools Used:
Video Generation: LTX-Video 2.3 (LTX 2.3)
Environment: ComfyUI (Local Workflow from RUNE XX)
Music Generation: Suno
Starring: Anthony & Angie
That looks great ;-) nice video .. well done ;-)
And I recognize the channel, I've seen some of your videos in the past as well ;-)
And happy to hear you got some nice stuff out of it
Hey man, haha that's great!! And BTW, here is my TANGO result from yesterday!!!!!
In the video, something happens with my eyes, not sure how to fix it, and her legs pass through each other on one movement, lol.......
Try the SD pose. SD pose uses a less densely dotted pattern for the eyes, so that might work better.
And you can also try turning off face detection (in both DW pose and SD pose).
I saw this a few times too, but turning off the face pose should give the model the freedom it might need ;-)
Should I enable DEPTH on the BLEND pose workflow, and/or enable DEPTH on the DW pose workflow?
You don't need the depth. If added, it will also grab some of the structure of the input video, which may or may not be what you want (for example, lifting up a ball, some background mountains, or clothing structure that you want to influence the video with in addition to the pose).
So it's more of a creative choice. Pose only is the "safest" though ;-)
Got it...... and the POSE STRENGTH, when and why should I change it? Not the LoRA strength, I mean the POSE strength......
You can always try lower, for example 0.7.
It will give the model a bit more freedom and might produce a more natural-looking result.
Hey man, even with face disabled, my face changes a lot in the end result.... looks weird. Look at my image and the video.... eyes and teeth look funny. Maybe the LoRA is too high? It's at 0.7......
PROMPT: "A man with short dark hair with some gray, well-groomed beard and mustache, wearing black-rimmed glasses, blue and orange varsity-style jacket with stripes over black t-shirt, blue pants, dancing smooth Amapiano groove to the rhythm, sensual body movements, relaxed hip sways, shoulder rolls, precise hand gestures pointing and flowing with the beat, confident expression with slight smile, looking towards camera, natural weight shifts and footwork, fluid rhythmic motion, medium portrait shot from waist up showing full upper body and leg movements, realistic skin texture, sharp details, cinematic lighting, dance studio background"
You got moves ;-) hehe
Looks a bit stretched. What width/height did you set?
I'll try with your input image and check whether the auto-adjust to x32 (which LTX demands for the masking) is set to stretch instead of crop somewhere.
(Unless you deliberately set the image resize to stretch, it looks far more natural with crop ;-)
I set it all at default, as you made it..... Also, a question: when loading a global LoRA like the IC detailer on a workflow with 2 passes, does it load on both pass 1 and pass 2, or just pass 1?
The IC detailer LoRA is really for restoring old videos, if I understood it correctly.
It was made for the workflow LTX had for that purpose: take a grainy, low-quality original video and make it look better with their purpose-built looping sampler.
Not sure it's a good idea to use it outside of that, but I never tried ;-)
That being said, yes, the LoRA runs on both passes. So if you want a LoRA in just one of the samplers, you can add a Lora Loader node and connect it between the main models and the Guider node in the 1st sampler (guessing that's where you'd want it if you don't want it in both). The 2nd pass is just 3-4 steps, so I don't think it matters much.
As for the face (identity), try different seeds as well. LTX isn't really the best at keeping identity consistent.
And if you know the character (yourself, a celebrity, etc.), it's easier to spot. If it's just a random AI-generated image of someone unknown, the small changes can often go unnoticed.
Since you make many videos of yourself and your wife/girlfriend/friend, it might be worth making a character LoRA for each.
It's easier than it sounds, with the Ostris AI Toolkit.
You can train either with video clips: https://www.youtube.com/watch?v=JQIl8DFTL1M
or using just images: https://www.youtube.com/watch?v=oJdT5dzrNEY
(Despite the long "boring" videos, there's not that much you'd need to "learn". It's quite straightforward.)
Hey man, thanks for the links, but for now I'm more concerned about something... in the DW pose workflow, and I guess also in the SD pose one, if I choose a perfect 9:16 resolution like 864x1536, the result is 768x1536... and if I do 720x1280, the result is 640x1280...
How can I avoid this and make it output the resolution I'm actually using? This is why you said my video looked too long or too tall.... it's not me, it's the "divisible by..." thing, and I can't find it in the DW POSE workflow....... How do I fix this, buddy?
Yes, the LTX model only accepts resolutions where height and width are divisible by 32.
So there are a few nodes resizing and then making sure the result is divisible by 32, plus the DW pose accepting 512px and SD pose accepting 1024px... so it gets a bit complicated to avoid x32 problems. (For regular workflows it's no issue, LTX auto-adjusts, but the mask just fails if it's not divisible by 32.)
I'll check if there are tweaks to make the final output closer to the input size, perhaps by using fewer resize nodes and only resizing at the very end, before handing the mask to LTX.
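Just to illustrate the x32 constraint itself, here is a standalone sketch (the function names are made up, and this only covers the divisible-by-32 part, not the extra 512px/1024px pose-detector resizes the workflow also does):

```python
# Hypothetical helper: snap a target resolution down to the nearest
# dimensions LTX will accept for masking (height and width must both
# be divisible by 32, or the mask fails).
def snap_to_multiple(value: int, multiple: int = 32) -> int:
    """Round down to the nearest multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)

def ltx_safe_resolution(width: int, height: int) -> tuple[int, int]:
    return snap_to_multiple(width), snap_to_multiple(height)
```

So 720x1280 snaps to 704x1280 under this rule alone; the larger drops you are seeing (720 to 640) come from the additional pose-detector resize stages stacked on top.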
Hey man, on another topic, all your NEWEST FLF workflows are giving me this error, mate......
Any idea?
Hmmm, that's strange. Will check, perhaps something went wrong when I updated them not long ago.
I checked the first-last-frame workflows and they were all OK.
So something must have happened on your end.
Check under the second input-image resize node, that it's connected to a Set last frame node.
Hey man, sorry, but I downloaded the WF again, the one that says TRANSITION LORA..... and I still get this.....
And I saw your image here, and I don't see anything BROKEN. I did update my Comfy 3 days ago, but everything else works fine... older WFs are fine..... Can you just UPLOAD some of your newest WFs again so I can test them? Doesn't make sense....
Hmm, that is so strange. The Set lastframe node is right there in your screenshot, and yet it still says "no set node found".
Try updating KJNodes, since the Set and Get nodes had a lot of updates recently.
Ah wait, you said transition LoRA. I didn't double-check that one.. will do ;-)
(But it's probably wise to update KJNodes regardless, because I do see the Set node in your screenshot while Comfy complains it's not there.)
Yes, I updated Comfy but not the nodes, lol. OK, will do!!! But also, in the working FLF WF: say I use this image as the last frame. It will actually show another person and add things to my surroundings, not the real New York image I have. I even had to use WAN FLF to make it workable.... Why is the FLF changing the faces so much, like to another person?
This one too: if I use this as the last frame, it shows me with another face... SIMILAR, but not me. Or it changes the shirt, adds buttons, etc.... smiling, open mouth, etc... even with the transition LoRA...
Here is my face in the image, and a video result, LOL, not me......
Using a GGUF Gemma model?
If so, turn off "Prompt Enhancer" at the bottom of the workflow.
No, I don't use GGUF, and Prompt Enhancer was off. Also, on another WF, the VHS video node gave errors when rendering the video. I DELETED Comfy and am using the latest working version from a month ago, lol, not gonna fight errors I didn't have before......
Updated workflows incoming ;-) They were missing an important part in the upscale section.. must have been sleepy when I updated these last time. More coffee, and fixed the obvious error ;-)
All the frames-to-frames workflows will get an update ASAP.
HELL YEAH!!! Let me know!!!!! Please, make it work without making me update Comfy again, since I'm using a rolled-back version, lol.
I see you uploaded one WF already, does that one also have custom audio?
In the process of updating the Frame workflows. Just double-checking that there aren't any errors ;-)
You probably need to update KJNodes.