
This workflow demonstrates how to use Qwen3.5 inside ComfyUI to analyze an image and generate descriptive text that doubles as ready-to-use prompts. It performs both image captioning and reverse prompt engineering: you provide an image via LoadImage, and the TextGenerate node, powered by Qwen3.5, returns structured descriptions or prompt candidates you can paste directly into your image-generation pipelines.
Technically, the CLIPLoader node is pointed at the Qwen3.5 weights (qwen3.5_4b_bf16.safetensors) stored under models/text_encoders/. That model handle feeds into the TextGenerate node, which accepts the loaded image and an instruction prompt (for example, "Produce 3 concise, diffusion-ready prompts"). The node then runs inference and returns text, which you can view with PreviewAny. A MarkdownNote in the graph provides inline guidance and prompt tips, making it easy to iterate on instruction wording, temperature, and token length to dial in results.
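The node wiring described above can be sketched in ComfyUI's API (prompt) JSON format, which represents the graph as a dict of numbered nodes whose inputs either hold literal values or reference another node's output as `[node_id, output_index]`. This is a minimal sketch, not the exact exported workflow: the node IDs, the `TextGenerate` input field names, and the CLIPLoader `type` value are assumptions for illustration.

```python
import json

def build_workflow(image_name: str, instruction: str) -> dict:
    """Assemble the captioning graph in ComfyUI's API JSON format.

    Node class names follow the workflow described above; the exact
    input field names of TextGenerate and the CLIPLoader "type" value
    are assumptions, not verified against the node definitions.
    """
    return {
        "1": {
            "class_type": "CLIPLoader",
            # Points at the Qwen3.5 weights under models/text_encoders/
            "inputs": {
                "clip_name": "qwen3.5_4b_bf16.safetensors",
                "type": "qwen",  # assumed value for this loader
            },
        },
        "2": {
            "class_type": "LoadImage",
            "inputs": {"image": image_name},
        },
        "3": {
            "class_type": "TextGenerate",
            "inputs": {
                "clip": ["1", 0],    # model handle from CLIPLoader
                "image": ["2", 0],   # pixels from LoadImage
                "prompt": instruction,
            },
        },
        "4": {
            "class_type": "PreviewAny",
            "inputs": {"source": ["3", 0]},  # display the generated text
        },
    }

workflow = build_workflow(
    "photo.png",
    "Produce 3 concise, diffusion-ready prompts",
)
payload = json.dumps({"prompt": workflow})
# POST this payload to a running ComfyUI server's /prompt endpoint
# (default http://127.0.0.1:8188/prompt) to queue the graph.
```

Building the graph programmatically like this makes it easy to sweep instruction wording or sampling settings from a script instead of re-editing the canvas by hand.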