
This ComfyUI workflow builds seamless, text-guided transitions between two source clips using the LTX 2.3 video diffusion model and the SYSTMS FLW transition LoRA. The MODELS group loads the transformer-only LTX 2.3 backbone via DiffusionModelLoaderKJ, the matching VAE with VAELoaderKJ, and applies the FLW LoRA using LoraLoaderModelOnly to bias the model toward natural, flow-like shot changes. Prompts are encoded with CLIPTextEncode (optionally auto-generated by a Gemini node), then routed through LTXVConditioning and LTXVAddGuide to steer the transition content and style.
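The model and prompt wiring described above can be sketched in ComfyUI's API (JSON) workflow format, written here as a Python dict. This is an illustrative fragment, not the workflow's actual export: the node ids, the LoRA file name, the prompt strings, the frame rate, and the upstream loader ids "1" and "2" are all assumptions. The small helper, also hypothetical, lists which upstream nodes the fragment still expects.

```python
# ComfyUI API-format fragment: node id -> {"class_type", "inputs"}.
# A link to another node is written as [source_node_id, output_index].
# All ids, file names, prompts, and numeric values below are placeholders.
conditioning_fragment = {
    "10": {
        "class_type": "LoraLoaderModelOnly",
        "inputs": {
            "model": ["1", 0],  # assumed id of the DiffusionModelLoaderKJ node
            "lora_name": "flw_transition_lora.safetensors",  # placeholder name
            "strength_model": 1.0,
        },
    },
    "11": {
        "class_type": "CLIPTextEncode",
        "inputs": {
            "text": "the camera glides from the street into the cafe",
            "clip": ["2", 0],  # assumed id of a text-encoder loader node
        },
    },
    "12": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "blurry, jittery, hard cut", "clip": ["2", 0]},
    },
    "13": {
        "class_type": "LTXVConditioning",
        "inputs": {
            "positive": ["11", 0],
            "negative": ["12", 0],
            "frame_rate": 25.0,
        },
    },
}


def external_links(fragment):
    """Return the sorted ids of nodes that are referenced but not defined."""
    missing = set()
    for node in fragment.values():
        for value in node["inputs"].values():
            # Links are two-element lists: [node_id, output_index].
            if isinstance(value, list) and len(value) == 2:
                if value[0] not in fragment:
                    missing.add(value[0])
    return sorted(missing)


unresolved = external_links(conditioning_fragment)
print(unresolved)
```

The positive and negative outputs of the LTXVConditioning node would then feed LTXVAddGuide, and the patched model from LoraLoaderModelOnly would feed the guider used for sampling.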

Under the hood, the workflow allocates the timeline with EmptyLTXVLatentVideo, manages the audio and video latent streams with LTXVSeparateAVLatent and LTXVConcatAVLatent, and confines denoising to the handoff region using SolidMask and SetLatentNoiseMask. Sampling runs through SamplerCustomAdvanced, fed by CFGGuider, KSamplerSelect, BasicScheduler, and a seeded RandomNoise node so that results are reproducible. The guides attached by LTXVAddGuide anchor the transition to the source clips, which helps the model generate cleaner, more believable morphs. After sampling, LTXVCropGuides removes those guide frames so they do not appear in the output, VAEDecode reconstructs the frames, and VHS_VideoCombine assembles them into the final video.
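To make the idea of confining denoising to the handoff region concrete, here is a minimal sketch of a per-frame noise mask. It is not how SolidMask and SetLatentNoiseMask are wired in this workflow; it only illustrates the convention that a mask value of 1 marks frames to regenerate and 0 marks frames to keep. The 8:1 temporal compression (with the first frame encoded on its own) matches ComfyUI's EmptyLTXVLatentVideo for earlier LTX-Video models and is assumed, not confirmed, for LTX 2.3. The function names, clip length, and handoff position are hypothetical.

```python
def latent_frame_count(pixel_frames: int, temporal_stride: int = 8) -> int:
    """Number of latent frames for a clip.

    Assumes the first pixel frame maps to its own latent frame and each
    following group of `temporal_stride` frames shares one latent frame.
    """
    if pixel_frames < 1:
        raise ValueError("pixel_frames must be at least 1")
    return (pixel_frames - 1) // temporal_stride + 1


def handoff_mask(total_latent_frames: int, start: int, length: int) -> list:
    """Per-frame mask: 1.0 = denoise this frame, 0.0 = keep it unchanged."""
    if start < 0 or length < 0 or start + length > total_latent_frames:
        raise ValueError("handoff region must lie inside the timeline")
    return [
        1.0 if start <= i < start + length else 0.0
        for i in range(total_latent_frames)
    ]


# Hypothetical timeline: 97 pixel frames, with a 3-latent-frame handoff
# starting at latent frame 5.
frames = latent_frame_count(97)
mask = handoff_mask(frames, start=5, length=3)
print(frames, mask)
```

Frames outside the masked span keep their source-clip content, so only the handoff itself is synthesized.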