Simple ComfyUI Z Image Turbo Workflow: Generate, Control & Upscale AI Images FAST
This guide walks you through a simple yet powerful ComfyUI workflow using the Z Image Turbo model. You'll learn how to generate high-quality realistic images quickly, add control with ControlNet, and upscale your results while preserving detail. The workflow is friendly to low-end GPUs: Z Turbo is a 6B-parameter model that runs fast even on modest hardware.
What You'll Learn
Prerequisites
Before you start, make sure you have:
- ComfyUI installed – see our ComfyUI Installation Guide for complete setup.
- Z Image Turbo model downloaded and placed in the correct models folder.
- ControlNet models (Canny, Pose, Depth) installed.
- Custom nodes for the workflow (e.g., LoRA loader, ControlNet loader). Our Installation Guide covers custom nodes installation.
- Optional: LoRAs you want to use – filter for Z Turbo LoRAs on Civitai.
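Before loading the workflow, it can help to confirm that the model folders are in place. The sketch below assumes a common ComfyUI layout with the subfolders listed in `EXPECTED_DIRS`; your install may use different names, and the `missing_model_dirs` helper and the `ComfyUI` root path are hypothetical.

```python
from pathlib import Path

# Assumed subfolders of a ComfyUI install; adjust to match your setup.
EXPECTED_DIRS = (
    "models/diffusion_models",
    "models/text_encoders",
    "models/vae",
    "models/controlnet",
    "models/loras",
)


def missing_model_dirs(comfy_root, expected=EXPECTED_DIRS):
    """Return the expected subfolders that do not exist under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for wherever your install lives.
    for rel in missing_model_dirs("ComfyUI"):
        print(f"missing: {rel}")
```

An empty result only means the folders exist; it does not verify that the model files inside them are the right ones.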
If your hardware isn't powerful enough or you want to speed up generations, consider using RunPod's cloud GPU service.
Step-by-Step Process
Step 1: Understand the Workflow Layout
The workflow is divided into three main areas:
- Core Generation (gray/green nodes) – handles model loading, prompt, LoRA, sampling, and image preview.
- ControlNet Section – optionally guides generation with edge, pose, and depth maps.
- Image-to-Image / Upscaling – reuses the same load image node for upscaling from an existing image.
Color Coding Guide
- Gray – Leave these nodes as-is (they rarely change).
- Green – You will usually change these (prompt, seed, etc.).
- Black – Some selection may be needed.
- Brown – Tweak these values (e.g., denoise strength, ControlNet strengths).
Step 2: Set Up the Core Generation (Without ControlNet)
- Load the model – The workflow loads Z Image Turbo, CLIP, and VAE. These are gray nodes; leave them untouched.
- Add LoRAs – Use the LoRA loader node to load multiple LoRAs with individual strength values. (Brown nodes; adjust strengths as needed.)
- Enter your prompt – The CLIP text encode node takes your positive prompt. Z Turbo does not use the negative prompt, so that node can stay minimized.
- Set image dimensions – Use the load image / empty latent node. For new generation, set the width and height directly. For upscaling, this node will be fed from a saved image.
- Configure the KSampler – This is where the magic happens. For a fresh generation, set denoise = 1.0 (full denoise). The sampler will ignore any loaded image and generate from scratch.
- Bypass ControlNet – Initially, bypass the ControlNet group (right-click → Bypass Group) so we test pure prompt generation.
- Generate – Click Queue and watch the fast result (Z Turbo runs in seconds even on modest GPUs).
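For readers who script ComfyUI, the sampler settings from this step can be sketched as a KSampler entry in ComfyUI's API-format (JSON) workflow. Only `denoise = 1.0` and the idea of varying the seed come from the steps above; the node IDs, step count, CFG, sampler, and scheduler are illustrative assumptions, and `ksampler_node` is a hypothetical helper.

```python
import json


def ksampler_node(seed, denoise=1.0, steps=8, cfg=1.0):
    """Build a KSampler entry for an API-format workflow dict.

    A link such as ["1", 0] means "output 0 of node 1". The node IDs
    here are placeholders for the loader, prompt, and latent nodes.
    """
    return {
        "class_type": "KSampler",
        "inputs": {
            "seed": seed,
            "steps": steps,            # assumed value
            "cfg": cfg,                # assumed value
            "sampler_name": "euler",   # assumed choice
            "scheduler": "simple",     # assumed choice
            "denoise": denoise,        # 1.0 = generate from scratch
            "model": ["1", 0],
            "positive": ["2", 0],
            "negative": ["3", 0],
            "latent_image": ["4", 0],
        },
    }


# Same settings, three different seeds: three different images.
variants = [ksampler_node(seed) for seed in (1, 2, 3)]
print(json.dumps(variants[0], indent=2))
```

Keeping everything but the seed fixed is a simple way to compare variations of one prompt.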
Pro Tip
Z Turbo’s 6B parameter size makes it incredibly fast – you can generate multiple images in the time it takes other models to produce one. Experiment with different seeds to get varied results.
Why Negative Prompt is Unused
Z Turbo does not use a negative prompt, so the negative prompt node is kept minimized to keep the workflow clean.
Result without ControlNet: You get a unique image based solely on your prompt and LoRAs. Changing the seed yields a completely different image.
Step 3: Add ControlNet for Guided Generation
- Re-enable the ControlNet group – Set it back to Always (right-click the group; the Bypass Group option toggles it).
- Load a reference image – Use the same Load Image node (or a separate one) as a ControlNet reference.
- Select ControlNet models – The workflow includes three preprocessors:
  - Canny – Edge detection
  - Pose – Human pose skeleton
  - Depth – Depth map (3D structure)
- Adjust per-control strength – Each control has a strength slider (0 to 1). Set to 0 to ignore a control, 1 for full influence. Example: ignore Canny and Depth (set to 0) but use Pose (set to 0.8) to keep the same body position while changing appearance.
- Generate – With ControlNet active, the output will follow the reference image’s structure based on the enabled controls.
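The per-control strengths can be thought of as a small mapping in which a value of 0 switches a control off. A minimal sketch, using the Pose-only example from the list above (the `active_controls` helper is hypothetical and not part of the workflow):

```python
def active_controls(strengths):
    """Validate per-control strengths and return only the enabled ones."""
    for name, value in strengths.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(
                f"{name} strength must be between 0 and 1, got {value}"
            )
    # A strength of 0 means the control is ignored.
    return {name: value for name, value in strengths.items() if value > 0.0}


# Keep the body position, ignore edges and depth.
pose_only = {"canny": 0.0, "pose": 0.8, "depth": 0.0}
print(active_controls(pose_only))  # {'pose': 0.8}
```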
Pro Tip
For best results, choose controls that match your goal. Pose works well for human figures, Canny for preserving edges, and Depth for scene composition. You can combine them with different strengths.
LoRA Compatibility During ControlNet
When using ControlNet, your LoRAs still influence the output, but the controls guide the structure. If you want the structure to dominate, reduce LoRA strengths or disable them.
Step 4: Upscale Your Image with Image-to-Image Workflow
- Save the image you like from the previous step.
- Reload that saved image into the Load Image node (the same node used for the ControlNet reference in Step 3).
- Switch the size node – Instead of the default resolution path, select the upscale path (scale factor 2x).
- Disable LoRAs – For upscaling, LoRAs can impose unwanted styles. It’s best to set their strengths to 0 or bypass them.
- Set denoise to 0.4 – This low denoise keeps the original image mostly intact while adding new details (skin pores, hair strands, textures). Higher denoise (e.g., 0.6) will change the image more.
- Configure ControlNet for upscaling – Use the same image as the ControlNet reference. Enable Canny and Depth controls (set strength around 0.8–1.0) to force the upscaled version to retain the original’s shape and 3D structure. Disable Pose (strength 0) since the pose is already correct.
- Generate – The upscaling step takes a bit longer (because it processes a larger latent). Once done, compare side‑by‑side: the upscaled version is much sharper with visible pores, hair detail, and skin texture.
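The longer run time of the upscaling step follows from simple arithmetic: doubling both dimensions quadruples the number of pixels to process. A quick sketch, assuming a 1024 by 1024 starting image (the base resolution is not stated in this guide) and hypothetical helper names:

```python
def upscaled_size(width, height, scale=2):
    """Return the (width, height) after scaling both dimensions."""
    return width * scale, height * scale


def pixel_ratio(width, height, scale=2):
    """Return how many times more pixels the upscaled image has."""
    new_w, new_h = upscaled_size(width, height, scale)
    return (new_w * new_h) / (width * height)


print(upscaled_size(1024, 1024))  # (2048, 2048)
print(pixel_ratio(1024, 1024))    # 4.0
```

The same arithmetic gives a 16x pixel count at a 4x scale factor, which is why larger factors depend heavily on available GPU memory.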
Why Denoise 0.4 Works for Upscaling
Denoising at 0.4 allows the KSampler to add new high-frequency details while preserving the original image’s identity. The latent upscaling expands the image and then refines it, guided by the ControlNet maps.
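One way to picture a partial denoise is as entering the noise schedule late. The sketch below assumes the sampler builds a longer schedule of about `steps / denoise` entries and runs only its final `steps`; this matches our reading of ComfyUI's sampler behavior, but treat it as a simplified model. The `denoise_plan` helper and the step count of 8 are illustrative.

```python
def denoise_plan(steps, denoise):
    """Describe how much of the noise schedule a partial denoise skips."""
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be greater than 0 and at most 1")
    # Assumed model: a longer schedule is built, and only its tail is run.
    schedule_length = int(steps / denoise)
    return {
        "schedule_length": schedule_length,
        "steps_run": steps,
        "steps_skipped": schedule_length - steps,
    }


for d in (1.0, 0.6, 0.4, 0.2):
    print(d, denoise_plan(8, d))
```

The lower the denoise, the more of the schedule is skipped, so sampling starts from a less noisy version of the input and more of the original image survives.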
Pro Tip
If the output has too many changes (e.g., new objects appear), lower the denoise to 0.3 or 0.2. If you want more creative variation, raise it to 0.5 or 0.6.
ControlNet Preprocessor Settings
For dark images, the Canny preprocessor may detect very few edges. You can adjust the Canny threshold values in the ControlNet node to increase edge detection. For brighter images, the preprocessors work better out of the box.
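Why thresholds matter for dark images can be shown with plain gradient thresholding. This is a toy stand-in, not the Canny preprocessor itself: it marks an edge wherever two neighboring pixel values in a row differ by more than a threshold. The pixel values and thresholds are made up for the illustration.

```python
def count_edges(row, threshold):
    """Count positions where adjacent pixel values differ by more than threshold."""
    return sum(1 for a, b in zip(row, row[1:]) if abs(b - a) > threshold)


bright_row = [20, 20, 200, 200, 40, 40, 180, 180]
# The same content at one fifth of the brightness.
dark_row = [value // 5 for value in bright_row]

print(count_edges(bright_row, 100))  # 3
print(count_edges(dark_row, 100))    # 0
print(count_edges(dark_row, 20))     # 3
```

The darkened row has the same structure but smaller differences between neighbors, so the edges only reappear once the threshold is lowered.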
Step 5: Review Examples and Optimization Tips
Two examples illustrate the upscaling settings:
- Dark image upscale – Low denoise (0.4) with only the Depth control active; the result is sharper, with minimal change to the content.
- Bright portrait upscale – Same settings, but the preprocessors work better on the brighter image; the output has much sharper hair, eyes, and skin texture while retaining the likeness.
Mastering the Workflow
By adjusting denoise, ControlNet strengths, and LoRA presence, you can balance between preserving the original and adding detail. Experiment to find the sweet spot for your images.
Pro Tip
Always bypass ControlNet when doing pure text‑to‑image, then re‑enable for upscaling or when you want structure guidance. This keeps your workflow flexible.
Related Guides
- ComfyUI Installation Guide – Full setup, custom nodes, and workflow loading.
- Running ComfyUI on RunPod – Cloud GPU setup for faster generation and upscaling.
- Deploy Your First RunPod – Basic RunPod account setup and pod management.
Next Steps
Now that you’ve mastered the Z Image Turbo workflow, here are some ideas to try next:
- Experiment with different LoRAs from Civitai to add styles (e.g., anime, realistic textures).
- Combine multiple ControlNet units with varying strengths for complex guidance.
- Upscale further using 4x scale (if your GPU can handle it) with denoise 0.3–0.4.
- Integrate a VAE decode step after upscaling to save the final high‑resolution image.
Happy generating!
