Building Multi-Region APIs 2026
Single-Region APIs Are Slow for Half Your Users
A single-region API hosted in us-east-1 responds in 20ms for a user in New York. The same API responds in 220ms for a user in Singapore — an 11x latency gap caused almost entirely by propagation delay across roughly 16,000 km of fiber.
For applications where latency directly affects user experience (real-time collaboration, gaming, financial trading, interactive web apps), single-region deployment leaves performance on the table for users outside the hosting region. Multi-region deployment solves this by running your API close to where users are.
In 2026, three approaches define global API deployment: edge computing (Cloudflare Workers, Deno Deploy), multi-region cloud deployments (fly.io, Railway global regions, AWS multi-region), and globally distributed databases (Turso, PlanetScale, CockroachDB). Each addresses different parts of the latency problem.
TL;DR
Cloudflare Workers is the most accessible edge deployment — your code runs at 330+ global edge locations, user requests are served from the nearest PoP, and global TTFB is typically under 50ms. For database reads, Cloudflare D1 global read replication and Turso's edge-replicated SQLite largely eliminate database-induced read latency. Writes remain the hard part: they must go to a primary region or rely on conflict resolution strategies. Multi-region databases (CockroachDB, PlanetScale) handle replication automatically, at higher cost.
Key Takeaways
- Cloudflare Workers runs at 330+ PoPs — user requests are served from the nearest edge location with Time to First Byte under 50ms for 95% of the world's population.
- Cloudflare D1 global read replication copies your SQLite database to all regions — read queries execute locally at the edge, writes still go to the primary region.
- Turso edge-replicates SQLite to 35+ regions — designed for edge computing, with a free Starter plan (500 databases) and a $29/month Scaler plan.
- CockroachDB and PlanetScale provide globally distributed SQL — CockroachDB accepts writes in any region with configurable consistency guarantees; PlanetScale serves reads from nearby replicas while writes go to the primary.
- The write problem: read replication is relatively simple; globally consistent writes force careful trade-offs between consistency and latency (the CAP theorem in practice).
- Fly.io runs Docker containers in 30+ regions — closer to traditional cloud deployment than serverless edge, with automatic regional routing and private networking between instances.
- Smart Placement: Cloudflare Workers Smart Placement automatically places your Worker near your database to minimize compute-to-data latency, even if that's not the nearest PoP to the user.
The Latency Problem, Quantified
| User Location | Single Region (US East) | Edge (Cloudflare Workers) |
|---|---|---|
| New York | 20ms | 8ms |
| London | 80ms | 18ms |
| Singapore | 220ms | 25ms |
| Sydney | 280ms | 30ms |
| São Paulo | 150ms | 20ms |
Source: illustrative figures based on published Cloudflare benchmarks and typical cloud provider latency profiles.
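These numbers track the physics. Light in fiber covers roughly 200 km per millisecond (about two-thirds of its vacuum speed), so distance alone sets a hard floor on round-trip time — a quick sketch:

```typescript
// Lower bound on round-trip time from geographic distance alone.
// 200 km/ms is the standard approximation for light in fiber.
const FIBER_KM_PER_MS = 200;

function minRttMs(distanceKm: number): number {
  // Round trip = there and back; real routes add switching,
  // queueing, and non-great-circle cable paths on top of this.
  return (2 * distanceKm) / FIBER_KM_PER_MS;
}

console.log(minRttMs(16000)); // ~160ms floor for the NY–Singapore path
console.log(minRttMs(80));    // sub-millisecond for a nearby PoP
```

The observed 220ms to Singapore is that ~160ms physical floor plus routing overhead — no amount of server tuning in a single region can get under it.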
Approach 1: Edge Computing with Cloudflare Workers
Best for: Read-heavy APIs, cacheable responses, simple CRUD against edge-replicated databases
Cloudflare Workers runs JavaScript/TypeScript (and any language compiling to Wasm) at 330+ global edge locations. Requests are automatically routed to the nearest PoP — no DNS geo-routing configuration required.
Basic Global API (Workers)
// src/index.ts — Cloudflare Worker
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    // Cloudflare populates geo metadata on every request
    const userCountry = request.headers.get("CF-IPCountry") || "unknown";
    const userCity = request.cf?.city || "unknown";
    if (url.pathname === "/api/products") {
      // Query the local edge database (D1 or KV)
      const products = await getProducts(env);
      return Response.json(products, {
        headers: {
          "X-Served-From": request.cf?.colo || "unknown", // e.g., "SIN", "LHR", "EWR"
          "X-User-Location": `${userCity}, ${userCountry}`,
          "Cache-Control": "s-maxage=60", // cache at the edge for 60 seconds
        },
      });
    }
    return new Response("Not Found", { status: 404 });
  },
};

async function getProducts(env: Env) {
  // Cloudflare D1 — SQLite at the edge
  const { results } = await env.DB.prepare(
    "SELECT id, name, price, description FROM products WHERE active = 1 LIMIT 100"
  ).all();
  return results;
}
D1 Global Read Replication
// Cloudflare D1 read replication
// Enable in wrangler.toml:
// [[d1_databases]]
// binding = "DB"
// database_name = "products"
// database_id = "your-database-id"
// read_replication = { mode = "auto" }
// With read replication enabled (used via the D1 Sessions API):
// - SELECT queries can execute against the nearest replica
// - INSERT/UPDATE/DELETE still go to the primary region
// - Cloudflare reports large latency reductions for read-heavy APIs
//   served from non-primary regions
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const productId = new URL(request.url).searchParams.get("id");
    // "first-unconstrained" lets the first read in the session hit the
    // nearest replica instead of the primary
    const session = env.DB.withSession("first-unconstrained");
    const { results } = await session
      .prepare("SELECT * FROM products WHERE id = ?1") // runs locally at the edge
      .bind(productId)
      .all();
    return Response.json(results[0]);
  },
};
Workers KV for Global Caching
// Workers KV — globally distributed key-value store; hot reads are
// served from the local PoP cache in a few milliseconds
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const productId = new URL(request.url).searchParams.get("id");
    const cacheKey = `product:${productId}`;
    // Check the global KV cache first (writes propagate globally with
    // eventual consistency, typically within ~60 seconds)
    const cached = await env.KV.get(cacheKey, "json");
    if (cached) {
      return Response.json(cached, {
        headers: { "X-Cache": "HIT", "X-Cache-Key": cacheKey },
      });
    }
    // Cache miss — fetch from the origin database
    // (fetchFromDatabase is a placeholder for your origin query)
    const product = await fetchFromDatabase(productId);
    // Store in KV — available globally within ~60 seconds
    await env.KV.put(cacheKey, JSON.stringify(product), {
      expirationTtl: 3600, // 1-hour TTL
    });
    return Response.json(product, { headers: { "X-Cache": "MISS" } });
  },
};
Smart Placement for Database-Bound Workers
# wrangler.toml — Smart Placement runs the Worker near your backend
[placement]
mode = "smart"
# Cloudflare observes the Worker's sub-requests (e.g. to your database)
# and moves execution to the location that minimizes total latency.
# If the database is in US East, the Worker may run from a US East PoP
# even for a user in Europe — one long hop beats several long round trips.
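Why this wins for chatty handlers: with the Worker at the nearest PoP, every sequential database query pays the full user-region-to-database round trip; with the Worker next to the database, only the initial hop is long. A toy model (all round-trip numbers are illustrative):

```typescript
// Toy latency model for a handler that makes N sequential sub-requests
// to a single origin (e.g. three dependent SQL queries).
function totalLatencyMs(
  userToComputeMs: number,
  computeToOriginMs: number,
  subRequests: number
): number {
  // One hop from the user to wherever the Worker runs, plus one
  // compute-to-origin round trip per sequential sub-request
  return userToComputeMs + subRequests * computeToOriginMs;
}

// Singapore user, database in US East (~220ms away), 3 sequential queries:
const workerNearUser = totalLatencyMs(5, 220, 3); // 665ms — Worker at nearest PoP
const workerNearDb = totalLatencyMs(220, 2, 3);   // 226ms — Smart Placement
```

The more sequential sub-requests a handler makes, the bigger Smart Placement's advantage; for a single cached read, the nearest PoP still wins.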
Approach 2: Turso Edge SQLite
Best for: Read-heavy applications with edge-replicated SQLite, hobby projects to production scale
Turso is SQLite for the edge — built on libSQL (an open-source SQLite fork) with global database replication. Each database can be replicated to 35+ regions; reads execute against the nearest replica, while writes go to the primary.
Pricing
| Plan | Cost | Databases |
|---|---|---|
| Starter | Free | 500 databases, 9GB storage |
| Scaler | $29/month | Unlimited databases |
| Enterprise | Custom | Volume pricing |
Integration with Cloudflare Workers
// Turso + Cloudflare Workers
import { createClient } from "@libsql/client/web";

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const productId = new URL(request.url).searchParams.get("id");
    // Turso client — reads are routed to the nearest replica
    const db = createClient({
      url: env.TURSO_DATABASE_URL, // primary URL
      authToken: env.TURSO_AUTH_TOKEN,
    });
    // This read is served by the nearest Turso replica —
    // could be Singapore, Frankfurt, etc.
    const { rows } = await db.execute({
      sql: "SELECT * FROM products WHERE id = ?",
      args: [productId],
    });
    return Response.json(rows[0]);
  },
};
Database Replication Setup
# Create a database with global replication
turso db create products --location sjc   # primary in San Jose
# Add read replicas in additional regions
turso db replicate products nrt   # Tokyo replica
turso db replicate products lhr   # London replica
turso db replicate products sao   # São Paulo replica
turso db replicate products syd   # Sydney replica
# (Newer Turso CLI versions manage locations per group instead:
#  turso group locations add default nrt)
# List replicas
turso db show products
Approach 3: Multi-Region Cloud Deployments (fly.io)
Best for: Traditional Docker containers with global regional distribution
fly.io runs Docker containers in 30+ regions worldwide — closer to traditional cloud deployment than serverless edge, but with automatic global routing and private networking between instances in different regions.
Deployment Configuration
# fly.toml — deploy to multiple regions
app = "my-api"
primary_region = "iad"  # US East as primary

[[services]]
protocol = "tcp"
internal_port = 8080

[[services.ports]]
port = 443
handlers = ["tls", "http"]

[services.concurrency]
type = "requests"
hard_limit = 200

# Regions are set via the CLI:
# fly scale count 2 --region lhr   # 2 instances in London
# fly scale count 2 --region nrt   # 2 instances in Tokyo
# fly scale count 2 --region gru   # 2 instances in São Paulo
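Fly's proxy can also replay a request in another region, which pairs naturally with a primary-region write setup: a replica-region instance declines the write and lets the proxy re-run it where the primary database lives. The `fly-replay` response header is Fly's real mechanism; the handler below is an illustrative sketch:

```typescript
// fly.io write forwarding via request replay: an instance in a replica
// region returns a fly-replay header, and Fly's proxy re-runs the
// original request against an instance in the named region.
function replayToPrimary(primaryRegion: string): Response {
  return new Response(null, {
    status: 409, // status code is arbitrary — the proxy acts on the header
    headers: { "fly-replay": `region=${primaryRegion}` },
  });
}
```

An instance would return `replayToPrimary("iad")` whenever it receives a write while running outside the primary region, so application code never needs to know the primary's hostname.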
Fly.io Read Replicas with PlanetScale
# PlanetScale is MySQL-compatible, so any MySQL driver works — pymysql
# shown here (there is no official "planetscale" Python client)
import os
import pymysql

db = pymysql.connect(
    host=os.environ["DATABASE_HOST"],  # PlanetScale global endpoint
    user=os.environ["DATABASE_USER"],
    password=os.environ["DATABASE_PASSWORD"],
    database="production",
    ssl={"ca": "/etc/ssl/certs/ca-certificates.crt"},  # PlanetScale requires TLS
)

# PlanetScale's edge network routes the connection to the nearest
# endpoint; reads are served from nearby replicas, writes go to the
# primary, and replicas typically lag it by under 100ms
with db.cursor() as cursor:
    cursor.execute("SELECT * FROM products WHERE id = %s", [product_id])
    result = cursor.fetchone()
Handling Writes Globally
The hardest problem in multi-region APIs is globally-consistent writes. Three patterns:
Pattern 1: Write to Primary, Read from Replicas
// Route writes to the primary region, reads to the nearest replica
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method === "GET") {
      // Read from the nearest replica — fast
      // (handleRead is a placeholder for your replica query logic)
      return handleRead(request, env);
    }
    if (request.method === "POST" || request.method === "PUT") {
      // Forward the write to the primary region's hostname,
      // preserving the original path, query, method, and body
      const url = new URL(request.url);
      url.hostname = "api-primary.us-east.example.com";
      return fetch(new Request(url, request));
    }
    return new Response("Method Not Allowed", { status: 405 });
  },
};
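One wrinkle in this pattern is read-your-writes consistency: immediately after a POST, the nearest replica may not have the new row yet. A common mitigation is to pin a client's reads to the primary for a short window after it writes. A minimal sketch (the helper and the lag window are illustrative, not a library API):

```typescript
// Read-your-writes routing sketch: serve a client's reads from the
// primary until the replicas have had time to catch up.
const REPLICATION_LAG_MS = 200; // assumed worst-case replica lag

function chooseReadTarget(
  lastWriteAtMs: number | null, // when this client last wrote, or null
  nowMs: number
): "primary" | "replica" {
  if (lastWriteAtMs !== null && nowMs - lastWriteAtMs < REPLICATION_LAG_MS) {
    return "primary"; // the replica may not have this client's write yet
  }
  return "replica"; // lag window has passed — read locally
}
```

The last-write timestamp can live in a cookie or session token; D1's Sessions API implements the same idea natively with bookmarks instead of wall-clock windows.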
Pattern 2: CockroachDB — Globally Distributed SQL
import psycopg2

# CockroachDB — accepts writes in any region and replicates globally;
# reads are served from the nearest replica automatically
conn = psycopg2.connect(
    "postgresql://user:password@free-tier.cockroachlabs.cloud:26257/defaultdb?sslmode=verify-full"
)
cursor = conn.cursor()

# This INSERT can be accepted by any CockroachDB node; commit latency
# depends on reaching quorum across the regions holding the row's replicas
cursor.execute(
    "INSERT INTO orders (id, user_id, amount, status) VALUES (%s, %s, %s, %s)",
    ["ord_123", "user_456", 99.99, "pending"]
)
conn.commit()
Pattern 3: Optimistic Local Writes with Conflict Resolution
// Cloudflare Durable Objects — coordinate global state
// Each Durable Object lives in a single location but receives requests
// from Workers anywhere; Cloudflare routes all traffic for a given
// object ID to the same instance, giving single-writer consistency
export class OrderDurableObject {
  constructor(private state: DurableObjectState, private env: Env) {}

  async fetch(request: Request): Promise<Response> {
    const { method } = request;
    if (method === "POST") {
      const order = await request.json();
      // The write lands in this Durable Object's storage, in its region
      await this.state.storage.put(`order:${order.id}`, order);
      return Response.json({ success: true });
    }
    if (method === "GET") {
      const orderId = new URL(request.url).searchParams.get("id");
      const order = await this.state.storage.get(`order:${orderId}`);
      return Response.json(order);
    }
    return new Response("Method Not Allowed", { status: 405 });
  }
}
Caching Strategy for Global APIs
// Cache-Control headers for global CDN caching
export function getCacheHeaders(resource: string): Headers {
  const headers = new Headers();
  switch (resource) {
    case "product-catalog":
      // Catalog changes rarely — cache for 1 hour at the edge
      headers.set("Cache-Control", "public, s-maxage=3600");
      headers.set("CDN-Cache-Control", "max-age=3600");
      break;
    case "user-profile":
      // User data — never cache at the edge; cache briefly in the browser
      headers.set("Cache-Control", "private, max-age=60");
      break;
    case "pricing":
      // Pricing changes occasionally — 5 minutes with stale-while-revalidate
      headers.set("Cache-Control", "public, s-maxage=300, stale-while-revalidate=600");
      break;
    default:
      // Unknown resources — safest not to cache at all
      headers.set("Cache-Control", "no-store");
  }
  return headers;
}
Decision Framework
| Scenario | Recommended |
|---|---|
| Read-heavy, globally cacheable | Cloudflare Workers + KV |
| Edge SQLite, moderate writes | Turso + Cloudflare Workers |
| Traditional containers, global | fly.io |
| Globally-distributed SQL writes | CockroachDB |
| MySQL-compatible, global reads | PlanetScale |
| Real-time state coordination | Cloudflare Durable Objects |
| Hybrid edge + origin | Cloudflare Workers + Hyperdrive |
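The hybrid row assumes Hyperdrive, Cloudflare's connection pooler that lets edge Workers talk to a single-region Postgres or MySQL origin without paying a fresh TCP and TLS handshake on every request. A minimal binding sketch (the id value is a placeholder for your own config):

```toml
# wrangler.toml — Hyperdrive binding: pooled connections (and optional
# query caching) between edge Workers and a region-bound SQL database
[[hyperdrive]]
binding = "HYPERDRIVE"
id = "your-hyperdrive-config-id"
```

Inside the Worker, `env.HYPERDRIVE.connectionString` is passed to a standard Postgres driver in place of the origin's own connection string — the application code is unchanged, only the connection setup cost moves to the edge.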
Verdict
Cloudflare Workers is the most accessible entry point for global API performance — 330+ PoPs, free tier (100K requests/day), and the fastest path from single-region to global edge.
Turso solves the database problem for edge applications — SQLite replicated to 35+ regions with automatic routing to the nearest replica. The combination of Cloudflare Workers + Turso is the highest-performance edge stack available in 2026.
Fly.io is the right choice for teams that want global deployment but need Docker containers (existing application code, specific runtimes, stateful workloads not suited for serverless).
CockroachDB and PlanetScale handle the write distribution problem — when you need globally-distributed SQL writes with automatic replication, these managed databases remove the infrastructure complexity.
Compare global API deployment options, edge database pricing, and latency benchmarks at APIScout — find the right multi-region architecture for your API.