
This workflow turns text prompts into music using the ACE-Step 1.5 XL Base (4B) model inside ComfyUI. It pairs UNETLoader with the acestep_v1.5_xl_base_bf16.safetensors diffusion model and VAELoader with the ace_1.5_vae.safetensors decoder. Your prompt is encoded by TextEncodeAceStepAudio1.5, an EmptyAceStep1.5LatentAudio node creates a blank latent audio clip at your chosen duration, and KSampler performs the denoising pass to synthesize music. VAEDecodeAudio reconstructs the waveform from latents, and SaveAudioMP3 writes the final track to disk.
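The node graph described above can be sketched in ComfyUI's API (JSON prompt) format, which maps each node to a `class_type` plus an `inputs` dict whose `["node_id", output_index]` pairs wire nodes together. This is an illustrative sketch only: the node IDs, input field names, and the example prompt/settings below are assumptions, not verified ACE-Step defaults; export your own workflow with "Save (API Format)" to get the exact wiring.

```python
# Hypothetical sketch of the ACE-Step workflow in ComfyUI's API (JSON prompt)
# format. Node IDs and input field names are illustrative assumptions; use
# "Save (API Format)" in ComfyUI to obtain the real field names.

graph = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "acestep_v1.5_xl_base_bf16.safetensors"}},
    "2": {"class_type": "VAELoader",
          "inputs": {"vae_name": "ace_1.5_vae.safetensors"}},
    "3": {"class_type": "TextEncodeAceStepAudio1.5",
          # Example prompt text; the actual input names may differ.
          "inputs": {"prompt": "mellow lo-fi hip hop, warm piano, vinyl crackle"}},
    "4": {"class_type": "EmptyAceStep1.5LatentAudio",
          "inputs": {"seconds": 120, "batch_size": 1}},
    "5": {"class_type": "ModelSamplingAuraFlow",
          "inputs": {"model": ["1", 0], "shift": 3.0}},  # shift value is a guess
    "6": {"class_type": "ConditioningZeroOut",
          "inputs": {"conditioning": ["3", 0]}},  # empty-negative fallback
    "7": {"class_type": "KSampler",
          "inputs": {"model": ["5", 0], "positive": ["3", 0],
                     "negative": ["6", 0], "latent_image": ["4", 0],
                     "seed": 0, "steps": 30, "cfg": 5.0,
                     "sampler_name": "euler", "scheduler": "simple",
                     "denoise": 1.0}},
    "8": {"class_type": "VAEDecodeAudio",
          "inputs": {"samples": ["7", 0], "vae": ["2", 0]}},
    "9": {"class_type": "SaveAudioMP3",
          "inputs": {"audio": ["8", 0], "filename_prefix": "audio/acestep"}},
}

# Sanity check: every node-to-node link must reference an existing node id.
for node in graph.values():
    for value in node["inputs"].values():
        if isinstance(value, list):
            assert value[0] in graph, f"dangling link to node {value[0]}"
```

A dict like this is what ComfyUI accepts when a workflow is submitted programmatically, so keeping the link check handy catches miswired graphs before queuing a generation.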
Under the hood, ConditioningZeroOut supplies a zeroed-out conditioning as a clean fallback when the negative prompt is empty, while ModelSamplingAuraFlow applies the flow-matching shift and schedule the model expects, so KSampler can produce stable, on-style results. PrimitiveNode and PrimitiveInt nodes expose simple controls for duration, steps, guidance (CFG), and seed. The workflow is organized into clear groups (Model, Duration, Prompt), so you can quickly load the right weights, set the clip length, write a prompt, and then iterate by adjusting steps, CFG, and seed.
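ModelSamplingAuraFlow is driven by a single `shift` parameter. A common flow-matching timestep shift used by AuraFlow-style models remaps each noise level sigma as sigma' = shift * sigma / (1 + (shift - 1) * sigma); the sketch below shows that mapping as an assumption about the general technique, not as ACE-Step's verified internals.

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Flow-matching shift (assumed AuraFlow-style form): remaps sigma in
    [0, 1] toward higher noise levels when shift > 1, biasing sampling
    toward the coarse-structure steps of the schedule."""
    return shift * sigma / (1 + (shift - 1) * sigma)

# Endpoints are preserved; mid-range sigmas are pushed upward when shift > 1.
print(shift_sigma(0.0))              # -> 0.0
print(shift_sigma(1.0))              # -> 1.0
print(shift_sigma(0.5))              # -> 0.75 with the default shift=3.0
print(shift_sigma(0.5, shift=1.0))   # -> 0.5 (shift=1.0 is the identity)
```

Raising the shift spends more of the step budget at high noise, which is why this one knob can noticeably change how "on-style" the sampled audio comes out.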