Image Generation APIs: DALL-E 3 vs Stable Diffusion vs Midjourney in 2026
TL;DR
For simple programmatic image generation: DALL-E 3 (OpenAI) — best prompt adherence, easiest API, no fine-tuning needed. For fine-tuned custom models and lower cost at volume: fal.ai or Replicate running Stable Diffusion XL or FLUX. For highest quality creative work: Midjourney (no real API, only Discord bot). In 2026, FLUX from Black Forest Labs has largely displaced Stable Diffusion as the best open model — and fal.ai runs FLUX faster and cheaper than Replicate. Choose based on whether you need quality, control, or cost.
Key Takeaways
- DALL-E 3: $0.04-$0.12/image, best prompt adherence, no fine-tuning, OpenAI ecosystem
- DALL-E 2: $0.016-$0.02/image, cheaper, supports inpainting/editing
- fal.ai: $0.003-$0.06/image running FLUX/SDXL, GPU-native, fast (~3-5s)
- Replicate: similar to fal.ai but simpler pricing, 1,000s of community models
- Midjourney: no API (Discord only), best quality for artistic use
- FLUX.1: best open image model in 2026, replaces SD for most use cases
DALL-E 3: Best Prompt Adherence
Best for: apps that generate images from user text, product mockups, social media content
import fs from 'fs';
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// Generate a single image:
const response = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A cozy coffee shop interior with warm lighting, wooden furniture, and plants',
size: '1024x1024', // '1024x1024', '1024x1792', '1792x1024'
quality: 'standard', // 'standard' or 'hd' (2x cost, more detail)
style: 'vivid', // 'vivid' (dramatic) or 'natural' (realistic)
n: 1, // DALL-E 3 only supports n=1
});
const imageUrl = response.data[0].url;
// URL is temporary (~1 hour) — download immediately
console.log(imageUrl);
// Download and save the image:
const imageResponse = await fetch(imageUrl);
const imageBuffer = Buffer.from(await imageResponse.arrayBuffer());
fs.writeFileSync('output.png', imageBuffer);
// Or return as base64:
const base64Response = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A modern tech startup office',
size: '1024x1024',
response_format: 'b64_json', // Return base64 instead of URL
});
const base64 = base64Response.data[0].b64_json!;
// Save: Buffer.from(base64, 'base64')
// Or use in HTML: `data:image/png;base64,${base64}`
// DALL-E 3's killer feature: it automatically improves prompts
// The revised_prompt field shows what DALL-E actually used:
const revisedResponse = await openai.images.generate({
model: 'dall-e-3',
prompt: 'coffee shop',
size: '1024x1024',
});
console.log('Original prompt:', 'coffee shop');
console.log('Revised prompt:', revisedResponse.data[0].revised_prompt);
// "A warm and inviting coffee shop with exposed brick walls, soft ambient lighting,
// wooden tables, cozy armchairs, and steam rising from coffee cups..."
DALL-E 3 Pricing
| Size | Standard | HD |
|---|---|---|
| 1024×1024 | $0.040 | $0.080 |
| 1024×1792 | $0.080 | $0.120 |
| 1792×1024 | $0.080 | $0.120 |
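Those prices translate directly into a budget helper. A minimal sketch (the `dalle3Cost` name and structure are illustrative, not part of the OpenAI SDK; the rates are the per-image prices from the table above):

```typescript
// Estimate DALL-E 3 spend for a batch, using the published per-image prices.
type Dalle3Size = '1024x1024' | '1024x1792' | '1792x1024';
type Dalle3Quality = 'standard' | 'hd';

const DALLE3_PRICES: Record<Dalle3Size, Record<Dalle3Quality, number>> = {
  '1024x1024': { standard: 0.04, hd: 0.08 },
  '1024x1792': { standard: 0.08, hd: 0.12 },
  '1792x1024': { standard: 0.08, hd: 0.12 },
};

function dalle3Cost(count: number, size: Dalle3Size, quality: Dalle3Quality): number {
  return count * DALLE3_PRICES[size][quality];
}

// 1,000 standard 1024×1024 images ≈ $40/month
console.log(dalle3Cost(1000, '1024x1024', 'standard'));
```

At that volume the same images cost a few dollars on fal.ai, which is why the cost comparison below matters once you generate at scale.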
DALL-E 2: Cheaper + Editing
DALL-E 2 is older and lower quality than DALL-E 3, but supports editing (inpainting) — replacing parts of an image with generated content.
// DALL-E 2 editing — change part of an image:
import fs from 'fs';
const editedImage = await openai.images.edit({
model: 'dall-e-2',
image: fs.createReadStream('original.png'), // Source image (square PNG, < 4 MB)
mask: fs.createReadStream('mask.png'), // Fully transparent pixels = areas to replace
prompt: 'A golden retriever sitting in the chair',
size: '1024x1024',
n: 1,
});
// mask.png: fully transparent (alpha = 0) areas will be regenerated, opaque areas kept
// DALL-E 2 variations — generate similar images:
const variations = await openai.images.createVariation({
model: 'dall-e-2',
image: fs.createReadStream('source.png'),
n: 4, // Generate 4 variations at once
size: '512x512', // 256x256, 512x512, or 1024x1024
});
// Returns 4 different versions of the source image
fal.ai: FLUX at Scale
fal.ai is a GPU inference platform — they run popular image models (FLUX, SDXL) faster and cheaper than Replicate.
// npm install @fal-ai/client
import { fal } from '@fal-ai/client';
fal.config({ credentials: process.env.FAL_KEY });
// FLUX.1 Schnell (fastest, good quality):
const result = await fal.subscribe('fal-ai/flux/schnell', {
input: {
prompt: 'A photorealistic image of a futuristic city at night, neon lights, rain',
image_size: 'landscape_16_9', // 'square', 'portrait_4_3', 'landscape_16_9', etc.
num_images: 1,
num_inference_steps: 4, // Schnell uses only 4 steps (very fast)
enable_safety_checker: true,
},
logs: true,
onQueueUpdate: (update) => {
if (update.status === 'IN_PROGRESS') {
console.log('Progress:', update.logs?.map((l) => l.message).join(', '));
}
},
});
console.log('Image URL:', result.data.images[0].url);
// FLUX.1 Dev (higher quality, slower):
const devResult = await fal.subscribe('fal-ai/flux/dev', {
input: {
prompt: 'Professional portrait photo, studio lighting, bokeh background',
image_size: { width: 1024, height: 1024 },
num_inference_steps: 28, // More steps = better quality
guidance_scale: 3.5,
num_images: 1,
seed: 42, // Reproducible output
},
});
// Fine-tuned models on fal.ai (LoRA):
const finetunedResult = await fal.subscribe('fal-ai/flux-lora', {
input: {
prompt: 'photo of a PRODUCT in a minimalist setting, white background',
loras: [
{
path: 'your-trained-lora-url', // Upload your LoRA weights
scale: 1.0,
},
],
num_images: 1,
},
});
fal.ai Pricing
FLUX.1 Schnell: ~$0.003/image (4 steps, fast)
FLUX.1 Dev: ~$0.025/image (28 steps, high quality)
SDXL: ~$0.005/image
Stable Diffusion 3: ~$0.035/image
Compare to DALL-E 3: $0.040/image
fal.ai FLUX Dev is ~40% cheaper than DALL-E 3 with comparable quality
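The ~40% figure is easy to verify at your own volume. A sketch assuming the list prices above ($0.040/image for DALL-E 3 standard, ~$0.025/image for FLUX Dev); `monthlySavings` is a hypothetical helper, not part of any SDK:

```typescript
// Monthly savings of FLUX Dev on fal.ai vs DALL-E 3 standard at a given volume.
function monthlySavings(
  imagesPerMonth: number,
  dalle3Price = 0.04, // DALL-E 3 standard 1024×1024
  fluxDevPrice = 0.025 // approximate fal.ai FLUX Dev rate
): number {
  return imagesPerMonth * (dalle3Price - fluxDevPrice);
}

// e.g. 10,000 images/month → roughly $150 saved
console.log(monthlySavings(10_000));
```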
Replicate: 1,000s of Models
Replicate runs almost any open-source image model, including fine-tuned community models.
// npm install replicate
import Replicate from 'replicate';
const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });
// Run a model:
const output = await replicate.run(
'black-forest-labs/flux-schnell',
{
input: {
prompt: 'a cat wearing a top hat, digital art, highly detailed',
aspect_ratio: '1:1',
output_format: 'webp',
output_quality: 90,
num_outputs: 1,
},
}
);
// output is an array of URLs for the generated images
const imageUrl = (output as string[])[0];
// Stream progress updates:
for await (const event of replicate.stream('black-forest-labs/flux-dev', {
input: { prompt: 'photorealistic landscape', num_inference_steps: 28 },
})) {
console.log(event);
}
Replicate Pricing
Billing: per-second of GPU compute
FLUX Schnell: ~$0.003-0.005/image
FLUX Dev: ~$0.020-0.030/image
SDXL: ~$0.005-0.010/image
Replicate also offers deployments (always-on) for consistent latency:
Serverless: cold starts (~5-10s first call), then fast
Deployment: always warm, consistent ~3-5s
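Because billing is per GPU-second, per-image cost is simply runtime times the GPU rate. A sketch; the example runtime and rate are assumptions for illustration, not quoted Replicate prices:

```typescript
// Per-image cost under per-second GPU billing: runtime × rate.
function replicateCostPerImage(gpuSecondsPerImage: number, pricePerGpuSecond: number): number {
  return gpuSecondsPerImage * pricePerGpuSecond;
}

// A fast model finishing in ~4 s on a GPU billed around $0.001/s
// lands near the low end of the FLUX Schnell range above:
console.log(replicateCostPerImage(4, 0.001));
```

This is also why cold starts matter: a serverless cold start adds 5-10 s of billed time to the first request, which can double the effective per-image cost at low volume.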
Stability AI: Direct API
Stability AI (Stable Diffusion creators) has their own API:
// Stability AI REST API:
import fs from 'fs';
const form = new FormData();
form.append('prompt', 'A stunning mountain landscape at golden hour');
form.append('negative_prompt', 'blurry, low quality, watermark');
form.append('aspect_ratio', '16:9');
form.append('output_format', 'png');
const response = await fetch('https://api.stability.ai/v2beta/stable-image/generate/core', {
method: 'POST',
headers: {
Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
Accept: 'image/*',
},
body: form, // fetch sets the multipart Content-Type automatically
});
if (!response.ok) throw new Error(`Stability API error: ${response.status}`);
const imageBuffer = Buffer.from(await response.arrayBuffer());
fs.writeFileSync('output.png', imageBuffer);
Model Comparison for Common Use Cases
Use case: Product photography mockup
→ Best: DALL-E 3 (prompt adherence) or fine-tuned FLUX on fal.ai
→ Code: openai.images.generate with detailed product description
Use case: Avatar or portrait generation
→ Best: FLUX Dev (fal.ai) or SDXL with LoRA fine-tune
→ Code: fal.subscribe('fal-ai/flux/dev', {...})
Use case: Social media content at scale
→ Best: FLUX Schnell on fal.ai ($0.003/image, 3-5s generation)
→ Code: batch requests to fal.ai for high throughput
Use case: Logo or icon generation
→ Best: DALL-E 3 (better at following specific brand guidelines)
→ Code: openai.images.generate with style: 'natural'
Use case: Edit existing images (inpainting)
→ Best: DALL-E 2 (simplest inpainting API), or FLUX/SDXL inpainting models on fal.ai/Replicate
→ Code: openai.images.edit with mask
Use case: Consistent character across many images
→ Best: Custom LoRA on fal.ai or Replicate
→ Train: train a LoRA on 20-30 images of your character
→ Code: fal.subscribe('fal-ai/flux-lora', { loras: [...] })
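Since every provider here boils down to a prompt-in/URL-out call, a cost-aware fallback chain is easy to sketch: try the cheap model first, fall back to a pricier one on failure. Everything below is illustrative glue code, not any provider's SDK:

```typescript
// A provider is any async function mapping a prompt to an image URL.
type Generate = (prompt: string) => Promise<string>;

// Try providers in order (cheapest first); rethrow the last error if all fail.
async function generateWithFallback(prompt: string, providers: Generate[]): Promise<string> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      return await provider(prompt);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Usage sketch — wrap the fal.ai and OpenAI calls shown earlier as providers:
// const url = await generateWithFallback(prompt, [fluxSchnell, dalle3]);
```

The same wrapper doubles as a simple retry strategy: list the same provider twice to retry once before escalating to a more expensive model.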
Side-by-Side Comparison
| | DALL-E 3 | FLUX via fal.ai | Replicate | Midjourney |
|---|---|---|---|---|
| Cost/image | $0.04-0.12 | $0.003-0.025 | $0.005-0.03 | ~$0.01 effective (monthly plan) |
| Latency | 8-15s | 3-8s | 5-15s | 30-60s |
| Fine-tuning | ❌ | ✅ LoRA | ✅ LoRA | ❌ |
| Inpainting | DALL-E 2 only | ✅ | ✅ | ❌ |
| Prompt adherence | ✅ Best | Good | Good | Great |
| API | ✅ REST | ✅ REST | ✅ REST | ❌ Discord only |
| NSFW option | ❌ | Optional | Optional | ❌ |
| SLA / enterprise | ✅ | Limited | Limited | ❌ |
Compare all image generation APIs at APIScout.