How to Install ComfyUI + Nunchaku FLUX.1-dev - Lightning Fast AI Image Generation

Taehyeong Lee
8 min read

Introduction

  • ComfyUI + Nunchaku FLUX.1-dev represents a breakthrough in AI image generation performance. By combining ComfyUI's node-based workflow interface with MIT Han Lab's SVDQuant 4-bit quantization technology, this setup delivers a 3.0× speedup and a 3.6× memory reduction compared to standard FLUX.1-dev implementations. In my testing on Windows 11 with an RTX 3080 10GB, image generation times dropped from 40+ seconds to around 11-12 seconds while maintaining exceptional quality. This makes Nunchaku FLUX.1-dev one of the most practical solutions for local AI image generation in 2025.

Features

  • Revolutionary Performance: SVDQuant's 4-bit quantization delivers a 3.0× speedup over the NF4 W4A16 baseline while maintaining visual fidelity
  • Memory Efficiency: A 3.6× memory reduction lets the 12B-parameter FLUX.1-dev run comfortably on RTX cards with 8GB+ VRAM, without CPU offloading
  • Easy Installation: Unlike traditional quantization methods that require hours of compilation, Nunchaku ships pre-built wheels for instant deployment
  • Broad GPU Compatibility: Native support for RTX 20xx, 30xx, 40xx, and 50xx series cards through optimized CUDA kernels
  • Professional Workflow Integration: Seamless ComfyUI integration with LoRA, ControlNet, and multi-model support
  • Production-Ready Stability: Backed by an ICLR 2025 Spotlight paper, lending the method academic rigor and reliability

Prerequisites

  • Operating System: Windows 11 (tested) or Windows 10 with latest updates
  • GPU: NVIDIA RTX series with 8GB+ VRAM (10GB+ recommended for FLUX.1-dev)
  • System RAM: 16GB minimum, 32GB recommended
  • Storage: 15GB+ free space for models and dependencies
  • Python: Python 3.12 recommended (ComfyUI Desktop handles this automatically)
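  • Before installing anything, you can confirm that Python can see your GPU and how much VRAM it reports. A minimal check, assuming PyTorch is installed (ComfyUI Desktop bundles it):
# check_gpu.py: confirm CUDA visibility and available VRAM
import torch

assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
props = torch.cuda.get_device_properties(0)
vram_gb = props.total_memory / 1024 ** 3
print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
if vram_gb < 8:
    print("Warning: under 8 GB VRAM; FLUX.1-dev may require CPU offloading")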

Installing ComfyUI Desktop

  • ComfyUI Desktop provides the most streamlined installation experience, eliminating Python environment management complexities. [Download Link]

Essential File Downloads
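  • The workflow in this guide references the model files below. Download them from the Nunchaku Hugging Face collection (linked in References) and the usual FLUX.1 sources, then place them in your ComfyUI model folders. Typical locations are shown; adjust to your install:
svdq-int4_r32-flux.1-dev.safetensors → models\diffusion_models
t5xxl_fp16.safetensors, clip_l.safetensors → models\text_encoders
ae.safetensors (the standard FLUX.1 VAE, if your workflow includes a VAE loader) → models\vae
flux-1.turbo-alpha.safetensors, flux_realism_lora.safetensors → models\loras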

Installing ComfyUI-nunchaku Plugin

  • The Nunchaku plugin provides essential nodes for 4-bit quantized model loading and inference.
Run [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ [ComfyUI-nunchaku] (Check)
→ [Install]
→ Restart [ComfyUI]

Installing Nunchaku Backend

  • This step installs the actual quantization engine that powers the performance improvements.
Run [ComfyUI]
→ [Workflow]
→ [Open]
→ install_wheel.json (Double Click)
→ [Nunchaku Wheel Installer] (Click)
→ version: [v0.3.1] (Select)
→ [Preview Any] (Click)
→ [▷ Execute] (Click)
→ Wait for confirmation: "Successfully installed nunchaku..."
→ Restart [ComfyUI]

[Advanced] Manual Nunchaku Backend Installation

  • For users requiring manual control or troubleshooting installation issues:
# Open PowerShell as Administrator
# Navigate to ComfyUI directory
PS> cd .\ComfyUI\
PS> .\.venv\Scripts\Activate.ps1

# Install Nunchaku dependencies
PS> pip install -r custom_nodes\ComfyUI-nunchaku\requirements.txt
# Install the backend itself; if the PyPI package does not match your
# Python/PyTorch versions, install a pre-built .whl from the project's
# GitHub releases instead
PS> pip install nunchaku --upgrade

# Install additional dependencies if needed
PS> pip install facexlib insightface onnxruntime

# Verify installation
PS> python -c "import nunchaku; print(nunchaku.__version__)"
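  • To smoke-test the backend outside ComfyUI, you can run a minimal generation through Hugging Face diffusers. This follows the usage pattern shown in the Nunchaku README; the repo ID and loader class may vary between Nunchaku versions, so treat it as a sketch:
# smoke_test_nunchaku.py: minimal diffusers generation with the INT4 transformer
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

# Load the SVDQuant INT4 transformer, then drop it into the standard pipeline
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "mit-han-lab/svdq-int4-flux.1-dev"
)
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipeline(
    "a photo of a cat in a sunlit garden",
    num_inference_steps=8,   # low step counts suit a Turbo LoRA; use ~20-50 without one
    guidance_scale=3.5,
).images[0]
image.save("nunchaku_test.png")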

Running Your First Nunchaku FLUX.1-dev Generation

Run [ComfyUI]
→ [Workflow]
→ [Open]
→ nunchaku-flux.1-dev.json (select)
→ Set your prompt in the text input node
→ [▷ Run]
  • I applied the following additional configuration to the example workflow provided by Nunchaku and ran multiple generation tests, which confirmed very fast generation averaging 11-12 seconds per image with high-quality output.
Nunchaku Flux DiT Loader
* model_path: [svdq-int4_r32-flux.1-dev.safetensors] # INT4 quantized model
* cache_threshold: 0
# Performance optimization with FP16 attention
* attention: [nunchaku-fp16]
# Mixed precision computation
* data_type: [bfloat16]

Nunchaku Flux.1 LoRA Loader
# Speed enhancement, high-quality generation with fewer steps
* lora_name: [flux-1.turbo-alpha.safetensors]
* lora_strength: 1.0

Nunchaku Flux.1 LoRA Loader
# Enhanced realistic human representation
* lora_name: [flux_realism_lora.safetensors]
* lora_strength: 0.7

Nunchaku Text Encoder Loader
* text_encoder1: [t5xxl_fp16.safetensors]
* text_encoder2: [clip_l.safetensors]

FluxGuidance
# Balance between prompt adherence and creativity
# Values below [5] cause watercolor effects due to under-guidance artifacts.
* guidance: 5

BasicScheduler
# Stable noise reduction
# [beta] scheduler removes noise more efficiently at beginning/end steps, preserving high-frequency details vs [simple] scheduler
* scheduler: [beta]
# Low-step generation enabled by Turbo LoRA
* steps: 8

Multiply Sigmas
# Fine-tuning sigma values for detail enhancement
* factor: 0.960
* start: 0.950
* end: 0.980

Width
* value: 896

Height
* value: 1152

[Tip] Multiply Sigmas: Maximizing Detail in Mechanical and Portrait Generation

  • Multiply Sigmas functions as an independent node in ComfyUI that significantly enhances detail quality in mechanical objects and portraits, effectively reducing the characteristic AI-generated appearance. [Related Link]
  • The recommended configuration is: Guidance: 4.5 + Scheduler: Beta + Multiply Sigmas: 0.96.
  • This feature becomes available after installing the ComfyUI-Detail-Daemon custom node package in ComfyUI.
# Installing [ComfyUI-Detail-Daemon]
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [ComfyUI-Detail-Daemon]
→ [Install]
→ Restart [ComfyUI]
  • After installation, you can add the Multiply Sigmas node to your workflow as follows:
# [1] Adding [Multiply Sigmas] node to workflow
(Right-click on empty space in workflow canvas)
→ [Add Node]
→ [sampling]
→ [custom_sampling]
→ [sigmas]
→ [Multiply Sigmas (stateless)]
→ factor: 0.96
→ start: 0.95
→ end: 0.98

# [2] Connect [BasicScheduler]'s SIGMAS output to [Multiply Sigmas] input
# [3] Connect [Multiply Sigmas] output to [SamplerCustomAdvanced]'s sigmas input

# Correct Node Connection Sequence
# [BasicScheduler] → [Multiply Sigmas] → [SamplerCustomAdvanced]
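  • Under the hood the node's arithmetic is simple: every sigma whose position in the schedule falls between start and end is scaled by factor. A minimal sketch, paraphrasing the Detail-Daemon node's behavior rather than its exact source:
# multiply_sigmas_sketch.py: what Multiply Sigmas (stateless) does to a schedule
import torch

def multiply_sigmas(sigmas: torch.Tensor, factor: float, start: float, end: float) -> torch.Tensor:
    out = sigmas.clone()              # stateless: never modify the input schedule
    n = len(out)
    for i in range(n):
        progress = i / (n - 1)        # 0.0 = first sigma, 1.0 = last sigma
        if start <= progress <= end:
            out[i] *= factor          # slightly lower sigma -> finer detail retained
    return out

# Example: scale an entire 8-step schedule by 0.96; the start=0.95 / end=0.98
# values above instead confine the scaling to the very end of the schedule
sigmas = torch.linspace(1.0, 0.0, 9)
print(multiply_sigmas(sigmas, factor=0.96, start=0.0, end=1.0))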

[Tip] Face Detailer: Maximizing Facial Detail Enhancement for Characters

  • Face Detailer is a powerful feature that detects and enhances facial details in generated images. It is particularly useful for full-body character shots, where facial detail tends to degrade significantly; Face Detailer restores and sharpens those crucial details (a conceptual sketch of what it automates follows the parameter list below).
  • This feature becomes available after installing both the ComfyUI Impact Pack and ComfyUI Impact Subpack custom node packages in ComfyUI.
# Installing [ComfyUI Impact Pack] and [ComfyUI Impact Subpack]
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [ComfyUI Impact Pack]
→ [Install]
→ Search [ComfyUI Impact Subpack]
→ [Install]
→ Restart [ComfyUI]
  • After installation, you can add the FaceDetailer node to your workflow as follows:
# Adding [FaceDetailer] node to workflow
(Right-click on empty space in workflow canvas)
→ [Add Node]
→ [ImpactPack]
→ [FaceDetailer]

# Recommended parameters for [Nunchaku FLUX.1-dev]
→ guide_size: 512
→ guide_size_for: [crop_region]
→ max_size: 1024
→ steps: 8
→ cfg: 1.0
→ sampler_name: [euler]
→ scheduler: [beta]
→ denoise: 0.50
→ feather: 5
→ drop_size: 10

# Adding [CLIP Text Encode (Negative Prompt)] node to workflow and type below text
low quality, blurry, bad anatomy, worst quality, low resolution, heavy makeup, rough skin, harsh texture, skin imperfections, overly detailed skin, artificial skin, dirty skin, acne, blackheads, wrinkles, aged skin, damaged skin, oily skin, uneven skin tone, harsh skin texture, large pores, visible pores, textured skin, coarse skin, bumpy skin, weathered skin, leathery skin, sun damaged skin, scarred skin, blemished skin, unsmooth skin, grainy skin, patchy skin, peach fuzz, vellus hair
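  • Conceptually, FaceDetailer automates a crop → upscale → re-denoise → paste loop over every detected face. The sketch below illustrates that data flow only; detect_faces and redenoise are hypothetical stand-ins for the Impact Pack's detector and sampling pass:
# facedetailer_sketch.py: the crop/enhance/paste loop that FaceDetailer automates
from PIL import Image

def face_detail(image: Image.Image, detect_faces, redenoise,
                guide_size: int = 512, denoise: float = 0.50) -> Image.Image:
    out = image.copy()
    for box in detect_faces(image):           # hypothetical: yields (left, top, right, bottom)
        w, h = box[2] - box[0], box[3] - box[1]
        crop = image.crop(box).resize((guide_size, guide_size))  # upscale so the model sees detail
        crop = redenoise(crop, denoise=denoise)   # hypothetical low-denoise sampling pass
        out.paste(crop.resize((w, h)), box[:2])   # composite back (the node also feathers edges)
    return out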

[Tip] res_2s + bong_tangent: Superior Image Generation with Advanced Sampling

  • Sampler res_2s combined with Scheduler bong_tangent delivers the highest quality image generation. [Related Link]
  • Technical Details:
    • res_2s: Uses two substeps per sampling step, requiring two model calls per step (slower, but higher quality than single-stage samplers; see the sketch at the end of this tip)
    • bong_tangent: BONGMATH technology enables bidirectional denoising, processing both forward and backward simultaneously for more accurate sampling
  • These features are available by installing the RES4LYF custom node package in ComfyUI.
# Installing [RES4LYF]
Launch [ComfyUI]
→ [Manager]
→ [Custom Nodes Manager]
→ Search [RES4LYF]
→ [Install]
→ Restart [ComfyUI]
  • Once installed, you can configure them in KSamplerSelect and BasicScheduler as follows:
KSamplerSelect
# Performs Multistage Sampling (RES Multistage Exponential Integrator)
* sampler_name: [res_2s]

BasicScheduler
# Performs bidirectional denoising (BONGMATH Technology)
* scheduler: [bong_tangent]
* steps: 8
* denoise: 1.00
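  • To see why a two-stage sampler needs two model calls per step, compare it with a single-stage Euler step. The sketch below is a generic second-order (Heun-type) step in sigma space, for illustration only; RES4LYF's actual res_2s exponential integrator is more sophisticated:
# two_stage_step_sketch.py: generic predictor/corrector step (two model calls)
import torch

def two_stage_step(denoise_model, x: torch.Tensor, sigma: float, sigma_next: float) -> torch.Tensor:
    d1 = (x - denoise_model(x, sigma)) / sigma   # slope from model call #1
    x_euler = x + d1 * (sigma_next - sigma)      # what a single-stage step would return
    if sigma_next == 0:
        return x_euler                           # final step: nothing left to re-evaluate
    d2 = (x_euler - denoise_model(x_euler, sigma_next)) / sigma_next  # slope from model call #2
    return x + 0.5 * (d1 + d2) * (sigma_next - sigma)  # average the slopes: second-order accuracy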

[Tip] FLUX.1-Krea-dev Best Practices & Optimization

  • FLUX.1-Krea-dev is a collaborative model released by Black Forest Labs and Krea AI. It features an opinionated aesthetic that emphasizes natural texture, realistic tone, and enhanced detail rendering, aiming to eliminate the characteristic AI look of FLUX models (plastic-like skin, oversaturation) in pursuit of photorealism.
  • The model demonstrates improved prompt adherence capabilities compared to the base FLUX.1-dev model. Detailed descriptions of temporal context, color grading, composition, and fine details particularly leverage the model's strengths in natural texture and realistic rendering.
  • Maintains 100% architectural compatibility with FLUX.1-dev as a drop-in replacement. Recommended settings:
    • model: svdq-int4_r32-flux.1-krea.dev.safetensors (Nunchaku version)
    • sampler_name: res_2s
    • scheduler: bong_tangent
    • steps: 8
    • denoise: 1.0
    • guidance: 5.0
    • width × height: 864 × 1152
    • loras:
      • lora_name: Flux_Krea_Blaze_Lora-rank32.safetensors, lora_strength: 1.00
      • lora_name: [your-style-lora], lora_strength: 0.50
      • lora_name: [your-character-lora], lora_strength: 0.50
      • lora_name: SameFace_Fix.safetensors, lora_strength: -0.70

[Tip] FLUX.1-Kontext-dev Best Practices & Optimization

  • Preserve Original Image Size: Set the FluxKontextImageScale node to Bypass mode to maintain the input image's original dimensions. This node normally scales images to resolutions that suit FLUX processing (usually under 2.1 MP) and reduces VRAM usage, but bypassing it preserves your desired output size.
  • Minimize Facial Changes: Set the denoise strength parameter to 0.85 or lower in the KSampler or BasicScheduler nodes. The default value of 1.0 completely replaces the input image with noise, while lower values preserve more of the original image. Values between 0.75 and 0.85 give the best balance between edit quality and identity preservation (see the sketch after this list).
  • Use Multiple FLUX.1-dev LoRAs: You can load and combine multiple LoRA models trained on the FLUX.1-dev base model. Connect Nunchaku FLUX LoRA Loader nodes to the output of the Nunchaku FLUX DiT Loader node and specify your desired LoRA files.
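  • The denoise recommendation works because schedulers only run the low-noise tail of the sigma schedule when denoise is below 1.0, so less of the input image is destroyed. A rough paraphrase of how ComfyUI's BasicScheduler handles this, not its exact source:
# denoise_sketch.py: how denoise < 1.0 trims the sigma schedule (paraphrased)
def scheduler_sigmas(full_schedule, steps: int, denoise: float):
    if denoise >= 1.0:
        return full_schedule(steps)        # full schedule: the input is fully re-noised
    total_steps = int(steps / denoise)     # stretch the schedule out...
    sigmas = full_schedule(total_steps)
    return sigmas[-(steps + 1):]           # ...and keep only the low-noise tail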

Personal Note

  • After extensive testing across various hardware configurations, Nunchaku FLUX.1-dev has become my go-to solution for high-quality, fast AI image generation. The combination of academic rigor (ICLR 2025 Spotlight), practical performance gains, and seamless ComfyUI integration makes this the most compelling FLUX.1-dev implementation available in 2025. The 12-20 second generation times on RTX 3080 10GB represent a significant improvement that makes AI image generation genuinely practical for iterative creative workflows.

References

  • https://github.com/mit-han-lab/nunchaku
  • https://hanlab.mit.edu/blog/svdquant
  • https://github.com/mit-han-lab/ComfyUI-nunchaku
  • https://huggingface.co/black-forest-labs/FLUX.1-dev
  • https://docs.comfy.org/
  • https://comfy.icu/extension/mit-han-lab__ComfyUI-nunchaku
  • https://huggingface.co/collections/mit-han-lab/nunchaku-6837e7498f680552f7bbb5ad
  • FLUX.1-Krea & the Rise of Opinionated Models - Drew Breunig