
Local ComfyUI MCP: generate images on your GPU

Tutorial to connect ComfyUI to Claude Code via MCP, generate images locally on your GPU, install Flux Schnell and troubleshoot CUDA/OOM errors fast.


Why connect ComfyUI to Claude Code

You're working on a project that needs images: mockups, illustrations, style variations, game assets. The usual flow looks something like this: leave your editor, open a Midjourney or Stable Diffusion tab, paste a prompt, wait, download, go back to your terminal. Twenty times in a row.

With a local ComfyUI MCP, Claude Code becomes the only interface you need. You describe what you want in plain language, the MCP translates it into a generation workflow, ComfyUI runs it in the background on your GPU, and the image is saved to ComfyUI's output folder, all without leaving your coding session.

This article walks you through the whole setup: understanding what happens inside the model, installing ComfyUI with Flux Schnell on your GPU, connecting the MCP to Claude Code, and handling common errors. The tutorial targets Linux/WSL2 with an NVIDIA GPU, but the concepts apply to all systems.


How a diffusion model works

Before running any commands, a quick look under the hood. No math needed, just the concepts that will help you understand why certain parameters exist and why certain errors happen.

Noise as a starting point

A diffusion model does not "draw" an image. It starts from a pure noise image (random pixels, like static on an old TV screen) and progressively denoises it, step by step, until it produces something coherent. The text of your prompt guides each denoising step.

The closest analogy: imagine a sculptor working on a marble block covered in dense fog. The final statue is not yet visible, but with each chisel stroke more shape is revealed, guided by a written description of the target result.

Latent space: the compressed map

The U-Net: the artist doing the denoising

The sampler: the denoising rhythm

The VAE: the image/latent translator

ControlNet: the imposed template


Why ComfyUI over AUTOMATIC1111

Several interfaces exist for running diffusion models locally. The two most common are AUTOMATIC1111 (also called SD WebUI) and ComfyUI. The right choice depends on your use case.

| Criterion | AUTOMATIC1111 | ComfyUI |
| --- | --- | --- |
| Interface | Forms, tabs | Node graph |
| Learning curve | Gentle | Steeper upfront |
| Flexibility | Limited to extensions | Total (you wire it yourself) |
| Reproducibility | Workflows hard to share | Workflows as exportable JSON |
| Flux support | Partial, via extensions | Native since late 2024 |
| REST API | Limited | Complete (/prompt, /queue, /history) |
| MCP integration | Difficult | Natural via the API |

ComfyUI stands out for two concrete reasons here. First, its node graph architecture maps exactly to what a diffusion workflow does (each node is one operation, the links are the data flow). Second, its local REST API on port 8188 is exactly what the MCP calls to trigger a generation.
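To make "the MCP calls the REST API" concrete, here is a minimal sketch of the HTTP request that queues a job. The `/prompt` path and port 8188 come from the text above. The helper name `buildQueueRequest` and the default `client_id` value are illustrative assumptions, not names taken from ComfyUI or from any particular MCP server.

```typescript
// Minimal sketch: build the request that queues a workflow on a local ComfyUI.
// Assumption: ComfyUI listens on its default address, http://127.0.0.1:8188.
const COMFYUI_QUEUE_BASE = "http://127.0.0.1:8188";

// A workflow in ComfyUI's API format: node id -> { class_type, inputs }.
type ApiWorkflow = Record<
  string,
  { class_type: string; inputs: Record<string, unknown> }
>;

interface QueueRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// buildQueueRequest is a hypothetical helper name, used for illustration only.
function buildQueueRequest(
  workflow: ApiWorkflow,
  clientId = "docs-example",
): QueueRequest {
  return {
    url: `${COMFYUI_QUEUE_BASE}/prompt`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // The node graph travels under the "prompt" key of the JSON body.
      body: JSON.stringify({ prompt: workflow, client_id: clientId }),
    },
  };
}
```

Passing `url` and `init` to `fetch` is what actually queues the job. Keeping the builder free of network calls makes it easy to inspect the payload before anything reaches the GPU.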


Hardware requirements

The model you run directly determines the VRAM you need. This table is based on official model file sizes:

| GPU VRAM | Compatible models | Notes |
| --- | --- | --- |
| 4 GB | SD 1.5 (2 GB), SDXL Turbo fp8 | Slow generation, limited to 512x512 |
| 6 GB | SD 1.5, SDXL fp8, partial Flux Schnell fp8 | Flux Schnell fp8 requires --lowvram |
| 8 GB | SDXL, Flux Schnell fp8 (17.2 GB loaded in chunks) | Flux Schnell fp8 comfortable with CPU offloading |
| 12 GB | Full Flux Schnell fp8, Flux Dev fp8 | Recommended for daily Flux use |
| 24 GB+ | Flux Schnell full (23.8 GB), Flux Dev full | Maximum quality, fast generation |
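The table can also be read as a simple lookup: find the highest tier your card reaches, then take the models and launch flag from that row. The sketch below encodes only what this article states. The names `VRAM_TIERS` and `suggestSetup` are hypothetical, and the `--lowvram` entries follow this tutorial's own recommendation for 6-8 GB cards rather than any official sizing rule.

```typescript
// Sketch of the hardware table as a lookup. Tier data mirrors the article's table.
interface VramTier {
  minVramGb: number;
  models: string[];
  launchFlags: string[]; // extra flags for `python main.py`
}

// Ordered from largest to smallest so the first match is the best tier.
const VRAM_TIERS: VramTier[] = [
  { minVramGb: 24, models: ["Flux Schnell full", "Flux Dev full"], launchFlags: [] },
  { minVramGb: 12, models: ["Flux Schnell fp8", "Flux Dev fp8"], launchFlags: [] },
  { minVramGb: 8, models: ["SDXL", "Flux Schnell fp8"], launchFlags: ["--lowvram"] },
  {
    minVramGb: 6,
    models: ["SD 1.5", "SDXL fp8", "Flux Schnell fp8 (partial)"],
    launchFlags: ["--lowvram"],
  },
  // The article gives no flag for 4 GB cards, so none is assumed here.
  { minVramGb: 4, models: ["SD 1.5", "SDXL Turbo fp8"], launchFlags: [] },
];

// Returns the best matching tier, or undefined below 4 GB.
function suggestSetup(vramGb: number): VramTier | undefined {
  return VRAM_TIERS.find((tier) => vramGb >= tier.minVramGb);
}
```

For example, `suggestSetup(8)` returns the 8 GB row with `--lowvram`, while a 16 GB card falls back to the 12 GB row.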

Other requirements:

  • NVIDIA GPU with recent drivers (CUDA 12.6+ recommended)
  • Python 3.12 or 3.13 (3.13 recommended per the official ComfyUI docs)
  • Git to clone the repository

Installing ComfyUI and Flux Schnell

Step 1: Clone ComfyUI

git clone https://github.com/Comfy-Org/ComfyUI.git
cd ComfyUI

The official repository now lives under the Comfy-Org organization. The old comfyanonymous/ComfyUI URL redirects to this same repo.

Step 2: Create a Python virtual environment

python3 -m venv venv
source venv/bin/activate # Linux/macOS
# On Windows: venv\Scripts\activate

Always use a dedicated virtualenv. Installing ComfyUI into the system Python regularly breaks other tools.

Step 3: Install PyTorch with CUDA support

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

This command installs PyTorch with CUDA 12.6, the stable version recommended at the time of writing. If your GPU requires CUDA 11.8 (Pascal architecture like the GTX 1000 series, or systems with older drivers that do not support CUDA 12):

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Step 4: Install ComfyUI dependencies

pip install -r requirements.txt

This installs all the required Python dependencies (transformers, safetensors, Pillow, aiohttp, etc.). Installation takes a few minutes.

Step 5: Download Flux Schnell (fp8 version)

Flux Schnell is available in two variants on Hugging Face. The Comfy-Org/flux1-schnell repository offers an fp8 version optimized specifically for ComfyUI:

  • flux1-schnell-fp8.safetensors: 17.2 GB, recommended for 8-12 GB GPU
  • flux1-schnell.safetensors: 23.8 GB, full precision for 24 GB+ GPU

Download the fp8 version with huggingface-cli:

# Install huggingface-cli if missing
pip install huggingface_hub
# Download the fp8 model directly into the right folder
huggingface-cli download Comfy-Org/flux1-schnell flux1-schnell-fp8.safetensors \
--local-dir ./models/checkpoints/

If you prefer wget (no authentication needed, Apache 2.0 license):

mkdir -p models/checkpoints
wget -O models/checkpoints/flux1-schnell-fp8.safetensors \
https://huggingface.co/Comfy-Org/flux1-schnell/resolve/main/flux1-schnell-fp8.safetensors

The download takes time: 17 GB to transfer.

Step 6: Launch ComfyUI

python main.py

On a GPU with limited VRAM (6-8 GB), add the --lowvram memory optimization flag:

python main.py --lowvram

ComfyUI starts on http://127.0.0.1:8188. The graphical interface is accessible in your browser, but we do not use it directly: it is the REST API behind it that matters.

Verify everything works by opening http://127.0.0.1:8188/system_stats in your browser: you should see a JSON response with GPU and CUDA info.
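If you would rather check this from a script than eyeball the JSON, the sketch below summarizes the response. It assumes the payload contains a `devices` array whose entries carry `name`, `type`, `vram_total` and `vram_free` (in bytes). That shape is an assumption based on commonly observed ComfyUI responses and may differ between versions. The function names are hypothetical.

```typescript
// Sketch: summarize a /system_stats payload.
// Assumed shape (may vary by ComfyUI version): { devices: [{ name, type, vram_total, vram_free }] }
interface DeviceStats {
  name: string;
  type: string; // e.g. "cuda" when PyTorch sees an NVIDIA GPU
  vram_total: number; // bytes
  vram_free: number; // bytes
}

interface SystemStats {
  devices: DeviceStats[];
}

const BYTES_PER_GIB = 1024 * 1024 * 1024;

// One human-readable line per device.
function describeDevices(stats: SystemStats): string[] {
  return stats.devices.map((d) => {
    const total = (d.vram_total / BYTES_PER_GIB).toFixed(1);
    const free = (d.vram_free / BYTES_PER_GIB).toFixed(1);
    return `${d.name} [${d.type}]: ${free} / ${total} GiB VRAM free`;
  });
}

// True when at least one device is reported as CUDA.
function hasCudaDevice(stats: SystemStats): boolean {
  return stats.devices.some((d) => d.type === "cuda");
}
```

Feed it the parsed body of `http://127.0.0.1:8188/system_stats`. If `hasCudaDevice` returns false, ComfyUI is most likely running on CPU and the PyTorch install from step 3 is worth rechecking.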


Connecting a ComfyUI MCP to Claude Code

Several ComfyUI MCP projects exist on GitHub. The lightest for our use case is comfyui-mcp-server by joenorton, an actively maintained Python server that exposes an MCP API and delegates generation to ComfyUI via its REST API on port 8188.

Installing the MCP server

# In a separate folder (not inside ComfyUI)
git clone https://github.com/joenorton/comfyui-mcp-server.git
cd comfyui-mcp-server
# Activate the same venv or create a dedicated one
pip install -r requirements.txt
# Start the MCP server
python server.py

The MCP server listens on http://127.0.0.1:9000/mcp. It expects ComfyUI to already be running on port 8188.

Configuration in Claude Code

Create or update the .mcp.json file at the root of your project:

{
  "mcpServers": {
    "comfyui": {
      "type": "http",
      "url": "http://127.0.0.1:9000/mcp"
    }
  }
}

What the MCP exposes to Claude

Once connected, Claude Code has access to the following tools:

  • generate_image: generates an image from a text prompt
  • list_models: lists available models in ComfyUI
  • get_queue_status: checks pending or running jobs
  • list_assets: browses previously generated images
  • run_workflow: executes a custom JSON workflow
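Under the hood, each of these tools is invoked with an MCP `tools/call` JSON-RPC request. The sketch below shows the shape of such a message for `generate_image`. The envelope (`jsonrpc`, `method`, `params.name`, `params.arguments`) follows the MCP specification, but the argument names `prompt`, `width` and `height` are assumptions for illustration: the real schema is whatever the server advertises in its `tools/list` response.

```typescript
// Sketch: the JSON-RPC message an MCP client sends to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// buildToolCall is a hypothetical helper name, used for illustration only.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumed argument names; check the server's tools/list output for the real ones.
const exampleCall = buildToolCall(1, "generate_image", {
  prompt: "a robot astronaut looking at Earth from the Moon",
  width: 1024,
  height: 1024,
});
```

You never write this message by hand in practice: Claude Code builds it from the tool schema. Seeing it helps when reading server logs or debugging a rejected call.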

Alternative: minimal TypeScript wrapper

If you prefer a more direct integration without an intermediary server, here is a minimal TypeScript wrapper that POSTs directly to the ComfyUI API. It uses the same McpServer + server.tool() API as the Build a TypeScript MCP tutorial, to stay consistent with the other site examples.

// comfyui-mcp-wrapper.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// ComfyUI's local REST API (default port).
const COMFYUI_URL = "http://127.0.0.1:8188";

const server = new McpServer({
  name: "comfyui-local",
  version: "0.1.0",
});

server.tool(
  "generate_image",
  "Generate an image via local ComfyUI by POSTing to /prompt",
  {
    prompt: z.string().describe("Text description of the image"),
    steps: z.number().int().min(1).default(4).describe("Number of denoising steps"),
  },
  async ({ prompt, steps }) => {
    // Workflow in ComfyUI's API format: node id -> { class_type, inputs }.
    // A link is [sourceNodeId, outputIndex]. CheckpointLoaderSimple outputs
    // MODEL (0), CLIP (1) and VAE (2), so both text encoders read output 1.
    const workflow = {
      "1": { inputs: { text: prompt, clip: ["2", 1] }, class_type: "CLIPTextEncode" },
      "2": { inputs: { ckpt_name: "flux1-schnell-fp8.safetensors" }, class_type: "CheckpointLoaderSimple" },
      "3": {
        inputs: {
          seed: Math.floor(Math.random() * 1e9),
          steps,
          cfg: 1.0,
          sampler_name: "euler",
          scheduler: "simple",
          denoise: 1.0,
          model: ["2", 0],
          positive: ["1", 0],
          negative: ["4", 0],
          latent_image: ["5", 0],
        },
        class_type: "KSampler",
      },
      "4": { inputs: { text: "", clip: ["2", 1] }, class_type: "CLIPTextEncode" },
      "5": { inputs: { width: 1024, height: 1024, batch_size: 1 }, class_type: "EmptyLatentImage" },
      "6": { inputs: { samples: ["3", 0], vae: ["2", 2] }, class_type: "VAEDecode" },
      "7": { inputs: { filename_prefix: "claude_gen", images: ["6", 0] }, class_type: "SaveImage" },
    };

    const res = await fetch(`${COMFYUI_URL}/prompt`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: workflow }),
    });

    // Surface ComfyUI validation errors instead of failing on a missing prompt_id.
    if (!res.ok) {
      return {
        content: [{ type: "text" as const, text: `ComfyUI returned HTTP ${res.status}: ${await res.text()}` }],
        isError: true,
      };
    }

    const data = (await res.json()) as { prompt_id: string };
    return {
      content: [{ type: "text" as const, text: `Image queued. ID: ${data.prompt_id}` }],
    };
  }
);

async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

main().catch((err) => {
  console.error("MCP error:", err);
  process.exit(1);
});

To use this wrapper via stdio, the .mcp.json config becomes:

{
  "mcpServers": {
    "comfyui": {
      "command": "npx",
      "args": ["ts-node", "comfyui-mcp-wrapper.ts"]
    }
  }
}
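The wrapper above returns as soon as the job is queued, so Claude only learns the prompt ID, not where the image ended up. One way to close that gap is to poll the `/history` endpoint until the job reports its outputs. The sketch below assumes the history response is keyed by prompt ID and lists saved files under `outputs.<nodeId>.images`. That shape matches commonly observed ComfyUI responses but should be treated as an assumption. `collectImages` and `waitForImages` are hypothetical names, and the HTTP call is injected as `getHistory` so the logic stays testable.

```typescript
// Sketch: wait for a queued job and collect the files it saved.
interface ImageRef {
  filename: string;
  subfolder: string;
  type: string;
}

// Assumed /history/{prompt_id} shape: { [promptId]: { outputs: { [nodeId]: { images: [...] } } } }
type HistoryResponse = Record<
  string,
  { outputs?: Record<string, { images?: ImageRef[] }> }
>;

// Pure helper: pull every saved image out of a history payload.
function collectImages(history: HistoryResponse, promptId: string): ImageRef[] {
  const entry = history[promptId];
  if (!entry || !entry.outputs) return [];
  const found: ImageRef[] = [];
  for (const nodeId of Object.keys(entry.outputs)) {
    found.push(...(entry.outputs[nodeId]?.images ?? []));
  }
  return found;
}

// Polls until images appear or the attempt budget runs out.
// getHistory is supplied by the caller, e.g. a fetch of `${COMFYUI_URL}/history/${promptId}`.
async function waitForImages(
  getHistory: (promptId: string) => Promise<HistoryResponse>,
  promptId: string,
  attempts = 60,
  delayMs = 1000,
): Promise<ImageRef[]> {
  for (let i = 0; i < attempts; i++) {
    const images = collectImages(await getHistory(promptId), promptId);
    if (images.length > 0) return images;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`No images for prompt ${promptId} after ${attempts} attempts`);
}
```

Inside the tool handler, you could call `waitForImages` after the POST and return the filenames instead of the bare ID. Polling is the simplest option for a sketch. A longer-lived server might prefer ComfyUI's WebSocket events.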

First prompt from Claude Code

Once both servers are running (ComfyUI on 8188, MCP on 9000) and Claude Code has been restarted to load the MCP config, you can test directly in your session:

Generate an image of a robot astronaut looking at Earth from the Moon,
vector illustration style, cyan and amber colors, black background.
Size 1024x1024, 4 steps.

Claude Code calls the generate_image tool with these parameters, the MCP sends the workflow to ComfyUI, and generation starts. On an 8 GB GPU with Flux Schnell fp8 at 4 steps, expect roughly 10 to 30 seconds depending on your card.

The result lands in the ComfyUI/output/ folder with a sequentially numbered filename (ComfyUI_00001_.png by default, or the prefix defined in the workflow, for example claude_gen_00001_.png). You can ask Claude Code to list generated images with list_assets, or simply open the folder.
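If ComfyUI runs on another machine, or you just want a link instead of a path, saved images can also be fetched over HTTP. The sketch below builds such a link, assuming ComfyUI's `/view` endpoint with `filename`, `subfolder` and `type` query parameters. The helper name `buildViewUrl` and the example filename are illustrative.

```typescript
// Sketch: build a URL that serves a saved image from ComfyUI's /view endpoint.
// Assumption: /view accepts filename, subfolder and type query parameters.
function buildViewUrl(
  baseUrl: string,
  filename: string,
  subfolder = "",
  type = "output",
): string {
  // URLSearchParams takes care of escaping spaces and special characters.
  const query = new URLSearchParams({ filename, subfolder, type });
  return `${baseUrl}/view?${query.toString()}`;
}
```

Calling `buildViewUrl("http://127.0.0.1:8188", "claude_gen_00001_.png")` gives a link you can open in a browser or download with any HTTP client.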


Troubleshooting

The four most common errors when getting started with ComfyUI on a GPU.


Next steps

You have ComfyUI running locally and Claude Code that can generate images. The logical next moves:

A detailed local vs cloud comparison (total cost, when a local GPU becomes worth it against Replicate or fal.ai APIs) will round out this series.