
This workflow demonstrates a minimal, fast path from text prompt to finished image using OmniGen2’s unified 7B multimodal model. A single OmniGen2 node (custom node id 4b17d220-4312-4981-9eae-9a76bf3b6ec9) ingests your prompt, encodes it with the Qwen2.5-VL text encoder, and produces a high-quality image without the complexity of diffusion samplers or long node chains. The output is then passed to SaveImage for consistent file naming and export, while a MarkdownNote provides in-canvas guidance and links to the official docs.
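The prompt-to-image chain described above can also be driven programmatically through ComfyUI's HTTP API by assembling an API-format graph and posting it to the `/prompt` endpoint. The sketch below is a minimal illustration only: the `class_type` name (`OmniGen2`) and its input names (`prompt`, `seed`, `width`, `height`) are assumptions for illustration and may differ from the actual custom node's schema, so check the node's exported API JSON before using it.

```python
import json
import urllib.request

def build_workflow(prompt: str, seed: int = 0,
                   width: int = 1024, height: int = 1024) -> dict:
    """Assemble a minimal API-format graph: OmniGen2 -> SaveImage.

    NOTE: the OmniGen2 class_type and input names here are assumed
    for illustration; verify them against the node's actual schema.
    """
    return {
        "1": {  # single generation node (hypothetical input names)
            "class_type": "OmniGen2",
            "inputs": {"prompt": prompt, "seed": seed,
                       "width": width, "height": height},
        },
        "2": {  # export with a consistent filename prefix
            "class_type": "SaveImage",
            "inputs": {"images": ["1", 0],  # link to node 1, output slot 0
                       "filename_prefix": "omnigen2"},
        },
    }

def queue_prompt(workflow: dict,
                 server: str = "http://127.0.0.1:8188") -> None:
    """POST the graph to a running ComfyUI instance's /prompt endpoint."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{server}/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)  # requires ComfyUI to be running

wf = build_workflow("a lighthouse at dawn, soft pastel sky", seed=42)
```

Because the SaveImage node receives its images via the `["1", 0]` link, the generation node's first output flows straight to disk with no sampler or VAE nodes in between, mirroring the three-node canvas layout.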
Technically, the prompt is embedded via the qwen_2.5_vl_fp16.safetensors text encoder and routed through OmniGen2’s dual-path architecture, which unifies language understanding with image generation in a single model. The node exposes practical controls—resolution for aspect ratio, seed for reproducibility, and batch count for generating multiple candidates per run. The entire setup can be wrapped as a Subgraph for a clean UI: expand it at any time to reveal advanced parameters and adjust defaults. This makes the workflow ideal for quick iteration, predictable outputs, and easy handoff to teammates.
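Because the seed is an explicit input, a common iteration pattern is to fan one graph out into several variants that differ only by seed, keeping every other parameter fixed for side-by-side comparison. The sketch below assumes a hypothetical node schema (node id `"1"`, a `seed` input) purely to show the pattern:

```python
import copy

# Hypothetical API-format graph; node id and input names are
# assumptions for illustration, not the node's confirmed schema.
base = {
    "1": {"class_type": "OmniGen2",
          "inputs": {"prompt": "a lighthouse at dawn", "seed": 0,
                     "width": 1024, "height": 1024}},
}

def seed_variants(workflow: dict, node_id: str, seeds: list[int]) -> list[dict]:
    """Return one deep-copied graph per seed so each run is
    reproducible and the base graph is never mutated."""
    variants = []
    for s in seeds:
        wf = copy.deepcopy(workflow)
        wf[node_id]["inputs"]["seed"] = s
        variants.append(wf)
    return variants

variants = seed_variants(base, "1", [101, 202, 303])
```

Deep-copying each variant matters here: mutating a shared dict would silently change seeds already queued, defeating the reproducibility the seed control provides.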