Seedance 2.0: Text to Video turns plain-language prompts into cinematic video clips inside ComfyUI. The workflow is built around ByteDance2TextToVideoNode running the Seedance 2.0 model to synthesize a coherent sequence of frames directly from text. You control duration (frame count), playback speed (fps), resolution, and seed, which together determine the look, timing, and repeatability of the output. SaveVideo then encodes the generated frames into a single video file (e.g., MP4/WebM) at your chosen frame rate.
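To make the two-node graph concrete, here is a minimal sketch of how it might look in ComfyUI's API-style JSON format (the body you would POST to a local server's `/prompt` endpoint). The node class names `ByteDance2TextToVideoNode` and `SaveVideo` come from this workflow, but the input names below (`prompt`, `frames`, `fps`, `width`, `height`, `seed`, `filename_prefix`) are illustrative assumptions, not the nodes' documented schema:

```python
import json

def build_workflow(prompt: str, frames: int = 121, fps: int = 24,
                   width: int = 1280, height: int = 720, seed: int = 42) -> dict:
    """Assemble a two-node graph in ComfyUI's API-style JSON format.

    Input names are assumptions for illustration; check the actual node
    widgets in your ComfyUI install before submitting.
    """
    return {
        "1": {
            "class_type": "ByteDance2TextToVideoNode",
            "inputs": {
                "prompt": prompt,
                "frames": frames,    # clip duration = frames / fps seconds
                "fps": fps,
                "width": width,
                "height": height,
                "seed": seed,        # fix this value for repeatable output
            },
        },
        "2": {
            "class_type": "SaveVideo",
            "inputs": {
                "frames": ["1", 0],  # link: node 1's first output socket
                "fps": fps,
                "filename_prefix": "seedance",
            },
        },
    }

wf = build_workflow("slow dolly in on a misty pine forest at dawn, cinematic")
payload = json.dumps({"prompt": wf})  # request body for POST /prompt
```

The `["1", 0]` value is how ComfyUI's JSON format expresses a link from node 1's first output into SaveVideo's frame input; everything else in the graph is a literal widget value.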

Technically, ByteDance2TextToVideoNode conditions a video diffusion process on your prompt and iteratively produces a temporally consistent stack of images. Fixing the seed and keeping motion cues clear in the prompt helps maintain stable visuals across frames. Describing camera-like behavior in the text (e.g., “slow dolly in,” “static tripod,” “handheld”) guides motion without extra nodes. SaveVideo collects the node’s frame output and writes it to disk; if you need music or sound effects, export your video and sync audio later in a video editor.
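The advice above — fix the seed, vary only the camera cue in the prompt — lends itself to a small sweep when you want to compare motion styles on the same scene. The helper below is a hypothetical sketch (the cue list and prompt template are illustrative, not part of the workflow):

```python
# Sweep camera cues with a fixed seed so differences between clips come
# from the prompt's motion description, not from a different random init.
BASE_SCENE = "a lighthouse on a rocky coast at sunset, cinematic lighting"
CAMERA_CUES = ["slow dolly in", "static tripod shot", "handheld tracking shot"]
FIXED_SEED = 123456  # arbitrary; any fixed value gives repeatable runs

def prompt_variants(scene: str, cues: list[str]) -> list[tuple[str, int]]:
    # Each variant pairs a cue-prefixed prompt with the same seed.
    return [(f"{cue}, {scene}", FIXED_SEED) for cue in cues]

for text, seed in prompt_variants(BASE_SCENE, CAMERA_CUES):
    print(f"seed={seed}  prompt={text}")
```

Feed each (prompt, seed) pair into the text-to-video node one at a time; because the seed is constant, the resulting clips differ mainly in camera behavior, which makes it easy to pick the motion style you want.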