Build a Claude agent that publishes your visual assets

Tutorial to build a Claude SDK agent that generates images via Flux/Replicate, optimizes to WebP and uploads. Architecture, TypeScript and Python code, cost.

Want to automate image generation for your blog, social media, or content pipeline? This article shows how to build a complete Claude SDK agent: it receives a natural-language request, generates an image via the Replicate/Flux API, optimizes it to WebP, then publishes it to your storage. Claude orchestrates the whole flow with three simple tools.

Why prefer the cloud for image generation

Local image generation (via ComfyUI or Automatic1111) has its advantages: you control everything, costs at volume are low, and you can customize models. But it requires a dedicated GPU, an always-on server, and infrastructure to maintain. For the local setup, see Local ComfyUI MCP.

The cloud changes the equation in several situations:

CI/CD and automation. A GitHub Actions or Vercel pipeline has no access to a local GPU. Cloud APIs can be called from any runner, without hardware configuration.

Scale and traffic spikes. Need to generate 200 images for a launch? The Replicate API handles parallelism for you. Locally, you hit your VRAM ceiling quickly.

Flux Pro quality. Black Forest Labs' Flux 1.1 Pro model is not available for local use (commercial rights): it runs only through hosted APIs such as Replicate. Quality, especially prompt adherence on complex descriptions, exceeds Flux Dev or Schnell.

No GPU available. On an M1 Mac, a VPS, or a standard dev machine, the cloud API is usually the only practical option for high-resolution generation.

Criteria | Local (ComfyUI) | Cloud (Replicate)
Fixed cost | GPU required | None
Variable cost | Electricity | $0.04/image (Flux Pro)
Latency | 5-20s (depends on GPU) | 4-10s
Parallelism | Limited by VRAM | Unlimited
Premium models | No | Yes (Flux Pro)
CI/CD | Difficult | Native
Data control | Full | Per Replicate ToS

The full local vs cloud trade-off (volume thresholds, total cost over 1000 images) will be covered in a dedicated comparison coming soon.

Agent architecture

The flow works like this: the user sends a natural language message ("Generate a hero image for my coffee article"). Claude analyzes the request, builds an optimized prompt, then calls generate_image. Replicate returns a temporary URL. Claude calls optimize_webp to convert and compress the image, then upload_storage to persist it. Finally, Claude returns the public URL to the user.

This three-tool pattern is intentionally simple. You can extend it with a fourth generate_alt_text tool (Claude Vision) or resize_variants to generate multiple formats in parallel.
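
As a rough sketch of what such an extension could look like, the block below declares a hypothetical resize_variants tool and a small helper that plans the output sizes. The tool fields, the plan_variants name, and the default widths are all assumptions made for illustration; the actual resizing (with Pillow or sharp) and any parallelism are left out.

```python
# Hypothetical fourth tool: the name, fields, and defaults are illustrative only.
RESIZE_VARIANTS_TOOL = {
    "name": "resize_variants",
    "description": "Creates resized WebP copies of a local image at several widths.",
    "input_schema": {
        "type": "object",
        "properties": {
            "local_path": {
                "type": "string",
                "description": "Local path of the source image.",
            },
            "widths": {
                "type": "array",
                "items": {"type": "integer", "minimum": 1},
                "description": "Target widths in pixels.",
            },
        },
        "required": ["local_path"],
    },
}


def plan_variants(width, height, targets=(480, 960, 1920)):
    """Returns (width, height) pairs for every target narrower than the source."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    sizes = []
    for target in sorted(set(targets)):
        if target >= width:
            continue  # never upscale the source image
        # Keep the aspect ratio; clamp to at least one pixel of height.
        sizes.append((target, max(1, round(height * target / width))))
    return sizes


if __name__ == "__main__":
    print(plan_variants(1920, 1080))
```

Each planned size would then be passed to the image library's resize call before saving, one file per width.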

Why let Claude decide the tool order? Because the sequence is not always the same. If you pass an existing image URL, for example, the agent can skip generate_image and call optimize_webp directly. This flexibility is the core of the agentic pattern: Claude adapts the sequence to context rather than following a fixed script.

Step-by-step implementation

The agent is built in the same four steps regardless of language. We start with the Python version; the equivalent TypeScript version follows further down. Pick the one that matches your stack: both produce the same behavior.

Python version

1. Install dependencies

pip install anthropic replicate pillow boto3

Set your environment variables:

export ANTHROPIC_API_KEY="sk-ant-..."
export REPLICATE_API_TOKEN="r8_..."
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_S3_BUCKET="my-bucket"

2. Define the three tools

# tools.py
# source: docs.anthropic.com/en/agents-and-tools/tool-use/define-tools, consulted 2026-05-11
TOOLS = [
    {
        "name": "generate_image",
        "description": (
            "Generates an image from a text prompt via Flux 1.1 Pro on Replicate. "
            "Returns a temporary URL valid for 1 hour."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "Detailed description of the image to generate.",
                },
                "aspect_ratio": {
                    "type": "string",
                    "enum": ["1:1", "16:9", "3:2", "4:5", "9:16"],
                    "description": "Image ratio. Default: 16:9 for blog articles.",
                },
            },
            "required": ["prompt"],
        },
    },
    {
        "name": "optimize_webp",
        "description": (
            "Downloads an image from a URL and converts it to optimized WebP. "
            "Returns the local path of the WebP file."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "image_url": {
                    "type": "string",
                    "description": "URL of the image to download and convert.",
                },
                "quality": {
                    "type": "integer",
                    "description": "WebP quality from 1 to 100. Default: 82.",
                    "minimum": 1,
                    "maximum": 100,
                },
                "filename": {
                    "type": "string",
                    "description": "Output filename without extension.",
                },
            },
            "required": ["image_url", "filename"],
        },
    },
    {
        "name": "upload_storage",
        "description": (
            "Uploads a local file to S3 and returns the permanent public URL."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "local_path": {
                    "type": "string",
                    "description": "Local path of the file to upload.",
                },
                "s3_key": {
                    "type": "string",
                    "description": "Destination key in the S3 bucket (e.g. images/hero-coffee.webp).",
                },
            },
            "required": ["local_path", "s3_key"],
        },
    },
]
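
Hand-written tool definitions are easy to get subtly wrong, for instance a required key that was renamed in properties. A small sanity check like the one below can catch that before the first API call. It is only a sketch: check_tool_schemas is a hypothetical helper, not part of any SDK, and it checks a few structural rules rather than full JSON Schema validity.

```python
def check_tool_schemas(tools):
    """Returns a list of problems found; an empty list means the definitions look consistent."""
    problems = []
    seen = set()
    for tool in tools:
        name = tool.get("name", "<missing name>")
        if name in seen:
            problems.append(f"{name}: duplicate tool name")
        seen.add(name)

        schema = tool.get("input_schema", {})
        if schema.get("type") != "object":
            problems.append(f"{name}: input_schema.type should be 'object'")

        # Every required key must be declared under properties.
        properties = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in properties:
                problems.append(f"{name}: required key '{key}' is not declared in properties")
    return problems


if __name__ == "__main__":
    sample = [
        {
            "name": "upload_storage",
            "input_schema": {
                "type": "object",
                "properties": {"local_path": {"type": "string"}},
                "required": ["local_path", "s3_key"],  # s3_key is missing above
            },
        }
    ]
    for problem in check_tool_schemas(sample):
        print(problem)
```

Running it against the TOOLS list at import time (and failing fast on a non-empty result) is one way to use it.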

3. Implement tool functions

# tool_functions.py
import io
import os
import urllib.request

import boto3
import replicate
from PIL import Image


def generate_image(prompt: str, aspect_ratio: str = "16:9") -> str:
    """Calls Replicate Flux 1.1 Pro and returns the image URL."""
    # source: replicate.com/black-forest-labs/flux-1.1-pro/api, consulted 2026-05-11
    output = replicate.run(
        "black-forest-labs/flux-1.1-pro",
        input={
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            "output_format": "jpg",
            "output_quality": 90,
            "safety_tolerance": 2,
        },
    )
    return str(output)


def optimize_webp(image_url: str, filename: str, quality: int = 82) -> str:
    """Downloads and converts to optimized WebP."""
    local_path = f"/tmp/{filename}.webp"
    with urllib.request.urlopen(image_url) as response:
        img_data = response.read()
    img = Image.open(io.BytesIO(img_data))
    img.save(local_path, "WEBP", quality=quality, method=6)
    return local_path


def upload_storage(local_path: str, s3_key: str) -> str:
    """Uploads to S3 and returns the public URL."""
    bucket = os.environ["AWS_S3_BUCKET"]
    s3 = boto3.client("s3")
    s3.upload_file(
        local_path,
        bucket,
        s3_key,
        ExtraArgs={"ContentType": "image/webp"},
    )
    return f"https://{bucket}.s3.amazonaws.com/{s3_key}"
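
One detail worth hedging against: the filename argument of optimize_webp is chosen by the model, so it may contain slashes or other characters that do not belong in a temp path. The helper below is a minimal sketch of one way to sanitize it; safe_output_path is a hypothetical name and the character whitelist is an assumption, not something the original code does.

```python
import re
import tempfile
from pathlib import Path


def safe_output_path(filename: str, suffix: str = ".webp") -> Path:
    """Builds a path inside the temp directory from an untrusted filename."""
    # Keep only the last path component so "../" segments cannot escape the directory.
    stem = Path(filename).name
    # Replace anything outside a conservative whitelist, then trim leading/trailing dots and dashes.
    stem = re.sub(r"[^A-Za-z0-9._-]", "-", stem).strip(".-")
    if not stem:
        raise ValueError("filename is empty after sanitizing")
    return Path(tempfile.gettempdir()) / f"{stem}{suffix}"


if __name__ == "__main__":
    print(safe_output_path("hero coffee"))
    print(safe_output_path("../../etc/passwd"))
```

In optimize_webp, the result of this helper could replace the f-string that builds local_path.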

4. Agent execution loop

# agent.py
import anthropic

from tool_functions import generate_image, optimize_webp, upload_storage
from tools import TOOLS

SYSTEM_PROMPT = """You are an agent specialized in image generation and publishing.
When the user requests an image:
1. Call generate_image with a precise, detailed prompt.
2. Call optimize_webp to convert the result to WebP.
3. Call upload_storage to publish the image.
4. Return the final URL along with a short description of the generated image.
Use aspect_ratio 16:9 by default for blog images."""

TOOL_FUNCTIONS = {
    "generate_image": generate_image,
    "optimize_webp": optimize_webp,
    "upload_storage": upload_storage,
}


def run_agent(user_message: str) -> str:
    # source: docs.anthropic.com/en/agents-and-tools/tool-use/handle-tool-calls, consulted 2026-05-11
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-5",
            max_tokens=1024,
            system=SYSTEM_PROMPT,
            tools=TOOLS,
            messages=messages,
        )
        # Append assistant response to history
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""
        if response.stop_reason != "tool_use":
            break
        # Execute requested tools
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            tool_fn = TOOL_FUNCTIONS.get(block.name)
            if tool_fn is None:
                result_content = f"Unknown tool: {block.name}"
                is_error = True
            else:
                try:
                    result = tool_fn(**block.input)
                    result_content = result
                    is_error = False
                except Exception as exc:
                    result_content = str(exc)
                    is_error = True
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result_content,
                "is_error": is_error,
            })
        # Send results back to Claude
        # Note: tool_result must come first in the user message content
        messages.append({"role": "user", "content": tool_results})
    return "The agent could not produce a result."


if __name__ == "__main__":
    result = run_agent(
        "Generate a hero image for a blog article about the benefits of morning coffee. "
        "Photorealistic style, 16:9 format. Publish it as images/hero-coffee.webp"
    )
    print(result)
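
The loop above runs until the model stops asking for tools, and it can only be exercised against the live API. The sketch below shows one possible variation that caps the number of turns and takes the model call as a parameter, so it can be tested offline with a scripted stand-in. The names run_tool_loop, create_message, and fake_model are hypothetical, and unlike run_agent this version returns whatever text is present on any non-tool stop reason.

```python
from types import SimpleNamespace


def run_tool_loop(create_message, tool_functions, user_message, max_turns=8):
    """Runs the tool-use loop for at most max_turns model calls."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        response = create_message(messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            texts = [b.text for b in response.content if b.type == "text"]
            return "\n".join(texts)

        results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            try:
                content = str(tool_functions[block.name](**block.input))
                is_error = False
            except Exception as exc:  # unknown tool or tool failure: report it to the model
                content = f"{type(exc).__name__}: {exc}"
                is_error = True
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": content,
                "is_error": is_error,
            })
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"no final answer after {max_turns} turns")


def fake_model():
    """Scripted stand-in for the API: one tool call, then a final answer."""
    script = iter([
        SimpleNamespace(stop_reason="tool_use", content=[
            SimpleNamespace(type="tool_use", id="t1", name="echo", input={"text": "hi"}),
        ]),
        SimpleNamespace(stop_reason="end_turn", content=[
            SimpleNamespace(type="text", text="done"),
        ]),
    ])
    return lambda messages: next(script)


if __name__ == "__main__":
    print(run_tool_loop(fake_model(), {"echo": lambda text: text.upper()}, "go"))
```

With the real client, create_message would be a small wrapper around client.messages.create that fills in the model, system prompt, and tools.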

TypeScript version

1. Install dependencies

npm install @anthropic-ai/sdk replicate sharp @aws-sdk/client-s3
npm install -D tsx @types/node

Set your environment variables:

export ANTHROPIC_API_KEY="sk-ant-..."
export REPLICATE_API_TOKEN="r8_..."
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_S3_BUCKET="my-bucket"

2. Define the three tools

// tools.ts
// source: docs.anthropic.com/en/agents-and-tools/tool-use/define-tools, consulted 2026-05-11
import Anthropic from "@anthropic-ai/sdk";

export const TOOLS: Anthropic.Tool[] = [
  {
    name: "generate_image",
    description:
      "Generates an image from a text prompt via Flux 1.1 Pro on Replicate. " +
      "Returns a temporary URL valid for 1 hour.",
    input_schema: {
      type: "object",
      properties: {
        prompt: {
          type: "string",
          description: "Detailed description of the image to generate.",
        },
        aspect_ratio: {
          type: "string",
          enum: ["1:1", "16:9", "3:2", "4:5", "9:16"],
          description: "Image ratio. Default: 16:9 for blog articles.",
        },
      },
      required: ["prompt"],
    },
  },
  {
    name: "optimize_webp",
    description:
      "Downloads an image from a URL and converts it to optimized WebP. " +
      "Returns the local path of the WebP file.",
    input_schema: {
      type: "object",
      properties: {
        image_url: {
          type: "string",
          description: "URL of the image to download and convert.",
        },
        quality: {
          // "integer" to match the Python definition
          type: "integer",
          description: "WebP quality from 1 to 100. Default: 82.",
          minimum: 1,
          maximum: 100,
        },
        filename: {
          type: "string",
          description: "Output filename without extension.",
        },
      },
      required: ["image_url", "filename"],
    },
  },
  {
    name: "upload_storage",
    description:
      "Uploads a local file to S3 and returns the permanent public URL.",
    input_schema: {
      type: "object",
      properties: {
        local_path: {
          type: "string",
          description: "Local path of the file to upload.",
        },
        s3_key: {
          type: "string",
          description:
            "Destination key in the S3 bucket (e.g. images/hero-coffee.webp).",
        },
      },
      required: ["local_path", "s3_key"],
    },
  },
];

3. Implement tool functions

// tool-functions.ts
import { readFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { PutObjectCommand, S3Client } from "@aws-sdk/client-s3";
import sharp from "sharp";
import Replicate from "replicate";

const replicate = new Replicate();
const s3 = new S3Client({});

export async function generateImage(
  prompt: string,
  aspectRatio = "16:9"
): Promise<string> {
  // source: replicate.com/black-forest-labs/flux-1.1-pro/api, consulted 2026-05-11
  const output = await replicate.run("black-forest-labs/flux-1.1-pro", {
    input: {
      prompt,
      aspect_ratio: aspectRatio,
      output_format: "jpg",
      output_quality: 90,
      safety_tolerance: 2,
    },
  });
  return String(output);
}

export async function optimizeWebp(
  imageUrl: string,
  filename: string,
  quality = 82
): Promise<string> {
  const response = await fetch(imageUrl);
  // fetch does not reject on HTTP errors, so check the status explicitly
  if (!response.ok) {
    throw new Error(`Download failed: ${response.status} ${response.statusText}`);
  }
  const buffer = Buffer.from(await response.arrayBuffer());
  const localPath = join(tmpdir(), `${filename}.webp`);
  await sharp(buffer).webp({ quality }).toFile(localPath);
  return localPath;
}

export async function uploadStorage(
  localPath: string,
  s3Key: string
): Promise<string> {
  // Fail early instead of uploading to an empty bucket name
  const bucket = process.env["AWS_S3_BUCKET"];
  if (!bucket) {
    throw new Error("AWS_S3_BUCKET is not set");
  }
  const body = await readFile(localPath);
  await s3.send(
    new PutObjectCommand({
      Bucket: bucket,
      Key: s3Key,
      Body: body,
      ContentType: "image/webp",
    })
  );
  return `https://${bucket}.s3.amazonaws.com/${s3Key}`;
}

4. Agent execution loop

// agent.ts
// source: docs.anthropic.com/en/agents-and-tools/tool-use/handle-tool-calls, consulted 2026-05-11
import Anthropic from "@anthropic-ai/sdk";
import { TOOLS } from "./tools";
import { generateImage, optimizeWebp, uploadStorage } from "./tool-functions";

const SYSTEM_PROMPT = `You are an agent specialized in image generation and publishing.
When the user requests an image:
1. Call generate_image with a precise, detailed prompt.
2. Call optimize_webp to convert the result to WebP.
3. Call upload_storage to publish the image.
4. Return the final URL along with a short description of the generated image.
Use aspect_ratio 16:9 by default for blog images.`;

type ToolInput = Record<string, unknown>;

async function executeTool(name: string, input: ToolInput): Promise<string> {
  switch (name) {
    case "generate_image":
      return generateImage(
        input["prompt"] as string,
        (input["aspect_ratio"] as string) ?? "16:9"
      );
    case "optimize_webp":
      return optimizeWebp(
        input["image_url"] as string,
        input["filename"] as string,
        (input["quality"] as number) ?? 82
      );
    case "upload_storage":
      return uploadStorage(
        input["local_path"] as string,
        input["s3_key"] as string
      );
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}

export async function runAgent(userMessage: string): Promise<string> {
  const client = new Anthropic();
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userMessage },
  ];
  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-5",
      max_tokens: 1024,
      system: SYSTEM_PROMPT,
      tools: TOOLS,
      messages,
    });
    messages.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      const textBlock = response.content.find((b) => b.type === "text");
      return textBlock && "text" in textBlock ? textBlock.text : "";
    }
    if (response.stop_reason !== "tool_use") break;
    // Execute all requested tools and collect results
    const toolResults: Anthropic.ToolResultBlockParam[] = [];
    for (const block of response.content) {
      if (block.type !== "tool_use") continue;
      let content: string;
      let isError = false;
      try {
        content = await executeTool(block.name, block.input as ToolInput);
      } catch (err) {
        content = String(err);
        isError = true;
      }
      toolResults.push({
        type: "tool_result",
        tool_use_id: block.id,
        content,
        is_error: isError,
      });
    }
    // tool_result must come first in the user message
    messages.push({ role: "user", content: toolResults });
  }
  return "The agent could not produce a result.";
}

// Entry point
const result = await runAgent(
  "Generate a hero image for a blog article about the benefits of morning coffee. " +
    "Photorealistic style, 16:9 format. Publish it as images/hero-coffee.webp"
);
console.log(result);

Run with:

npx tsx agent.ts

Error handling

Two types of errors dominate in practice: rate limits and content filters.

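Anthropic rate limits (HTTP 429). The Claude API enforces request and token rate limits that depend on your usage tier. The official SDKs already retry a 429 a couple of times by default (configurable with max_retries); an explicit exponential backoff covers longer spikes: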

import time

import anthropic


def call_with_retry(client, max_retries=5, **kwargs):
    for attempt in range(max_retries):
        try:
            return client.messages.create(**kwargs)
        except anthropic.RateLimitError:
            if attempt == max_retries - 1:
                raise  # last attempt: give up
            wait = 2 ** attempt  # 1s, 2s, 4s, 8s between the five attempts
            time.sleep(wait)

The same pattern in TypeScript:

async function callWithRetry(
  client: Anthropic,
  params: Anthropic.MessageCreateParamsNonStreaming,
  maxRetries = 5
): Promise<Anthropic.Message> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.messages.create(params);
    } catch (err) {
      if (!(err instanceof Anthropic.RateLimitError)) throw err;
      if (attempt === maxRetries - 1) throw err;
      await new Promise((r) => setTimeout(r, 1000 * 2 ** attempt));
    }
  }
  throw new Error("unreachable");
}
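
When many workers hit the limit at once, a fixed 1s, 2s, 4s schedule makes them all retry at the same moments. A common refinement is "full jitter": sleep a random amount within the exponentially growing window. The generator below is a sketch of that idea; backoff_delays, its cap of 30 seconds, and the injectable rng are assumptions for illustration, not part of either SDK.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=30.0, rng=random.random):
    """Yields one delay per retry: a random share of an exponentially growing window."""
    # max_retries attempts means at most max_retries - 1 sleeps in between.
    for attempt in range(max_retries - 1):
        window = min(cap, base * 2 ** attempt)
        yield rng() * window


if __name__ == "__main__":
    # Upper bounds of each window (rng fixed to 1.0), then one random draw.
    print(list(backoff_delays(rng=lambda: 1.0)))
    print([round(d, 2) for d in backoff_delays()])
```

In call_with_retry, the fixed wait could be replaced by the next value from this generator.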

Replicate NSFW filter. If your prompt triggers Flux's safety filter, Replicate returns an error with an explicit message. The recommended strategy: ask Claude to rephrase the prompt, then call generate_image again. You can also lower safety_tolerance from 2 to 1 for a stricter mode.
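
In the agent loop, a failed generate_image call already goes back to Claude as an error result, which gives it the chance to rephrase. If you want the retry to be explicit instead, the rephrase-and-retry strategy can be sketched as below. Everything here is hypothetical scaffolding: generate, rephrase, and is_filter_error are callables you would supply, and how to recognize a filter error from Replicate's message is left to the caller because the exact wording is not specified here.

```python
def generate_with_rephrase(generate, rephrase, prompt, is_filter_error, max_attempts=3):
    """Calls generate(prompt); on a filter error, asks rephrase() for a new prompt and retries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return generate(prompt)
        except Exception as exc:
            # Only filter errors are retried; anything else propagates unchanged.
            if not is_filter_error(exc) or attempt == max_attempts - 1:
                raise
            prompt = rephrase(prompt, str(exc))


if __name__ == "__main__":
    # Stand-ins for the real calls, for demonstration only.
    def fake_generate(prompt):
        if "bad" in prompt:
            raise RuntimeError("prompt flagged by safety filter")
        return f"https://example.com/{prompt.replace(' ', '-')}.jpg"

    url = generate_with_rephrase(
        fake_generate,
        rephrase=lambda prompt, error: prompt.replace("bad", "good"),
        prompt="a bad coffee",
        is_filter_error=lambda exc: "flagged" in str(exc),
    )
    print(url)
```

Here rephrase would typically be a short Claude call that receives the rejected prompt and the error message.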

Fallback from Flux Pro to Flux Dev. If Flux 1.1 Pro is overloaded (rare but possible), automatically switch to Flux Dev ($0.025/image):

import replicate


def generate_image_with_fallback(prompt: str, aspect_ratio: str = "16:9") -> str:
    models = [
        "black-forest-labs/flux-1.1-pro",
        "black-forest-labs/flux-dev",
    ]
    for model in models:
        try:
            output = replicate.run(model, input={"prompt": prompt, "aspect_ratio": aspect_ratio})
            return str(output)
        except replicate.exceptions.ReplicateError as e:
            # Loose heuristic: any error message containing "rate" triggers the fallback.
            # Tighten the match to the messages you actually observe.
            if "rate" in str(e).lower() and model != models[-1]:
                continue
            raise

Cloud timeout. Flux Pro generations typically take 4 to 10 seconds. Set a client-side timeout of 60 seconds to absorb load spikes. The timeout is configured at client construction, not per call:

import os

import replicate

# Client-side timeout: 60 seconds for all requests
client = replicate.Client(
    api_token=os.environ["REPLICATE_API_TOKEN"],
    timeout=60.0,
)

# Use client.run(...) instead of replicate.run(...)
output = client.run("black-forest-labs/flux-1.1-pro", input={"prompt": "..."})

Real execution cost

Pricing breakdown (as of 2026-06-30):

Component | Price
Claude Sonnet 5 (input) | $3 / million tokens
Claude Sonnet 5 (output) | $15 / million tokens
Flux 1.1 Pro | $0.04 / image
Flux Dev | $0.025 / image
Flux Schnell | $0.003 / image

Concrete example: 1 blog article with 1 hero image.

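A typical agent run for a blog image consumes roughly the following. Treat the token counts as rough estimates: the loop makes several API calls and each one resends the conversation so far, so measured input usage can be higher.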

  • Claude input tokens: ~800 tokens (system prompt + user message + tool definitions)
  • Claude output tokens: ~300 tokens (text + tool calls)
  • 1 Flux 1.1 Pro image

Calculation:

  • Claude input: 800 tokens × $3 / 1,000,000 = $0.0024
  • Claude output: 300 tokens × $15 / 1,000,000 = $0.0045
  • Flux Pro image: $0.04
  • Total: ~$0.047 per published image

At 100 images per month: roughly $4.70, about the price of a coffee.
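
To rerun this estimate with your own token counts, the arithmetic above fits in a few lines. The prices are the ones from the table; the PRICES dictionary and the cost_per_image name are just illustrative.

```python
# Prices taken from the table above (USD).
PRICES = {
    "input_per_mtok": 3.00,    # Claude input, per million tokens
    "output_per_mtok": 15.00,  # Claude output, per million tokens
    "flux_pro_image": 0.04,    # Flux 1.1 Pro, per image
}


def cost_per_image(input_tokens, output_tokens, images=1, prices=PRICES):
    """Returns the estimated cost in USD for one agent run."""
    claude = (
        input_tokens * prices["input_per_mtok"]
        + output_tokens * prices["output_per_mtok"]
    ) / 1_000_000
    return claude + images * prices["flux_pro_image"]


if __name__ == "__main__":
    per_image = cost_per_image(800, 300)
    print(f"per image: ${per_image:.4f}")
    print(f"100 images: ${100 * per_image:.2f}")
```

Plugging in the usage figures reported by the API for a real run gives a more reliable number than the estimate.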

Reducing costs:

Next steps

The agent you just built is a solid foundation. You can integrate it into broader workflows:

  • Claude Code + generative AI overview: the four integration patterns compared, to place this agent among the full range of approaches.
  • The agent SDK in depth: go further on the execution loop, streaming and state management of a Claude agent.
  • A local vs cloud comparison (volume thresholds, total cost) and an automated Next.js blog asset pipeline will round out this series soon.