REST vs GraphQL vs gRPC: API Protocol Guide 2026
TL;DR
Three API protocols, three distinct trade-offs. REST is the universal default — works everywhere, caches cleanly over standard HTTP, and every developer already knows it. GraphQL solves over/under-fetching for complex client-driven queries but adds caching complexity. gRPC delivers maximum performance for internal microservice communication but can't run in a browser without a proxy. Start with REST, layer in GraphQL for specific frontend needs, reserve gRPC for internal service-to-service calls where latency matters.
Key Takeaways
- REST handles 80% of API use cases with the least overhead — it's the right default
- GraphQL cuts payload sizes by 40–60% for clients with complex nested data requirements
- gRPC is 5–10x faster than REST for internal service communication due to binary Protocol Buffer serialization
- Mixing protocols is normal: public REST API + internal gRPC microservices is a common production pattern
- Browser support rules out raw gRPC for public APIs unless you add gRPC-Web or Connect RPC
- OpenAPI (for REST), SDL (for GraphQL), and proto files (for gRPC) serve as contracts — prioritize schema-first design in all three
Why Protocol Choice Matters More Than Ever
In 2026, most teams are running a mix: a public REST API for third-party developers, a GraphQL layer for their own frontend, and gRPC between internal microservices. Understanding each protocol's trade-offs lets you make deliberate choices rather than defaulting to the first option you learned.
The choice isn't just performance — it's developer experience, client compatibility, caching behavior, and operational tooling. A protocol decision at the start of a project is hard to undo at scale.
REST — The Universal Standard
REST (Representational State Transfer) maps HTTP methods to resource operations. It's been the dominant API style since Roy Fielding's 2000 dissertation, and that dominance isn't accidental — REST aligns with how HTTP works, enabling caching, CDN distribution, and standard tooling out of the box.
How REST Works
# Resource-oriented endpoints, standard HTTP verbs
GET /apis/stripe # Fetch one API
GET /apis?category=payments # List with filter
POST /apis # Create
PUT /apis/stripe # Replace
PATCH /apis/stripe # Partial update
DELETE /apis/stripe # Delete
// Express REST API example
import express from 'express';
const app = express();
// GET /apis/:slug — fetch single API with caching headers
app.get('/apis/:slug', async (req, res) => {
const api = await db.apis.findOne({ slug: req.params.slug });
if (!api) return res.status(404).json({ error: 'API not found' });
// HTTP caching — REST's killer feature
res.set('Cache-Control', 'public, max-age=300'); // 5-min CDN cache
res.set('ETag', `"${api.updatedAt.getTime()}"`);
res.json(api);
});
// POST /apis — create with validation
app.post('/apis', authenticate, async (req, res) => {
const { name, url, category } = req.body;
if (!name || !url) {
return res.status(400).json({
error: 'Missing required fields',
required: ['name', 'url']
});
}
const api = await db.apis.create({ name, url, category });
res.status(201).json(api);
});
REST Caching — The Underrated Advantage
REST's biggest practical advantage is HTTP caching. Every GET request can be cached by browsers, CDNs, and reverse proxies automatically. A Cloudflare CDN can serve a popular API's data from the edge in milliseconds with zero load on your origin server.
# HTTP cache headers in REST — free performance
Cache-Control: public, max-age=3600 # CDN + browser cache 1 hour
ETag: "abc123" # Conditional requests
Last-Modified: Tue, 18 Mar 2026 10:00:00 GMT
# Client conditional request (only fetch if changed)
GET /apis/stripe
If-None-Match: "abc123"
# Returns 304 Not Modified if unchanged — zero data transfer
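The 304 flow above can be sketched as a pure function. This is a minimal illustration of how a server compares If-None-Match against the current ETag; frameworks like Express normally do this comparison for you:

```typescript
// Sketch: deciding between 304 Not Modified and a full 200 response.
interface ConditionalResult {
  status: 200 | 304;
  body?: string;
}

function handleConditionalGet(
  ifNoneMatch: string | undefined,
  currentEtag: string,
  body: string
): ConditionalResult {
  // ETags are compared as opaque quoted strings
  if (ifNoneMatch === currentEtag) {
    return { status: 304 }; // unchanged: no body, zero data transfer
  }
  return { status: 200, body };
}

// First request carries no validator, so the full body is sent
console.log(handleConditionalGet(undefined, '"abc123"', '{...}').status); // 200
// Revalidation with a matching ETag returns an empty 304
console.log(handleConditionalGet('"abc123"', '"abc123"', '{...}').status); // 304
```

The client pays one cheap round trip to learn nothing changed, and the CDN can perform the same check without touching the origin.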
REST Weaknesses in Practice
Over-fetching is the most common complaint. A mobile app showing a list of API names and prices gets the full API object — including documentation, changelog history, and tag arrays — even though it only needs 3 fields:
// Client needs: name + price + uptime
// REST returns the full object: 40+ fields
{
"id": "api_123",
"name": "Stripe",
"slug": "stripe",
"category": "payments",
"price": "$0.029/transaction",
"uptime": 99.99,
"documentation_url": "...",
"changelog": [...],
"tags": [...],
"ratings": {...},
// 30+ more fields the client doesn't need
}
Under-fetching requires multiple round trips. Fetching an API with its latest reviews requires separate requests:
GET /apis/stripe # 1st request: get API
GET /apis/stripe/reviews # 2nd request: get reviews
GET /users/456 # 3rd request: get review author
GraphQL — The Client-Driven Query Language
GraphQL inverts the data-fetching model. Instead of the server defining what each endpoint returns, clients specify exactly what they need in a query. Facebook built GraphQL to solve over/under-fetching across their many mobile clients with different data requirements.
How GraphQL Works
# Client requests exactly what it needs
query GetApiWithReviews {
api(slug: "stripe") {
name
pricing {
tier
price
}
uptime
reviews(first: 5) {
rating
comment
author {
name
}
}
}
}
// GraphQL server with Apollo Server
import { ApolloServer } from '@apollo/server';
import gql from 'graphql-tag'; // gql is no longer exported from @apollo/server in v4
const typeDefs = gql`
type Query {
api(slug: String!): API
apis(category: String, limit: Int): [API!]!
}
type API {
id: ID!
name: String!
slug: String!
uptime: Float
pricing: Pricing
reviews(first: Int): [Review!]!
}
type Pricing {
tier: String!
price: String!
freeTrialDays: Int
}
type Review {
rating: Int!
comment: String
author: User!
}
`;
const resolvers = {
Query: {
api: (_, { slug }) => db.apis.findOne({ slug }),
apis: (_, { category, limit = 20 }) =>
db.apis.findAll({ where: { category }, limit }),
},
API: {
reviews: (api, { first = 10 }) =>
db.reviews.findAll({ where: { apiId: api.id }, limit: first }),
},
};
GraphQL Subscriptions for Real-Time
GraphQL's subscription type adds real-time capabilities over WebSockets:
subscription OnApiStatusChange {
apiStatusChanged(slug: "stripe") {
status
uptime
latency
updatedAt
}
}
// Client subscription with Apollo Client
import { useSubscription } from '@apollo/client';
function ApiStatus({ slug }) {
const { data } = useSubscription(ON_API_STATUS_CHANGE, {
variables: { slug }
});
return <StatusBadge uptime={data?.apiStatusChanged?.uptime} />;
}
GraphQL Caching — The Real Challenge
HTTP caching is GraphQL's main operational headache. All queries go to a single POST /graphql endpoint — CDNs don't cache POST requests by default.
// Persisted queries work around GraphQL caching limitations
// Client sends a hash of the query instead of the full query string
// Server maps hash → query → cached result
// Apollo Client: automatic persisted queries
import { createPersistedQueryLink } from '@apollo/client/link/persisted-queries';
import { sha256 } from 'crypto-hash';
const client = new ApolloClient({
link: createPersistedQueryLink({ sha256 }).concat(httpLink),
});
// Server-side response caching driven by cache-control hints
import { ApolloServerPluginCacheControl } from '@apollo/server/plugin/cacheControl';
import { InMemoryLRUCache } from '@apollo/utils.keyvaluecache';
const server = new ApolloServer({
cache: new InMemoryLRUCache(),
plugins: [
ApolloServerPluginCacheControl({ defaultMaxAge: 300 }),
],
});
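Under the hood, the persisted-query hash is simply a SHA-256 digest of the query string. A minimal sketch of the request shape Apollo's APQ extension sends; the query string here is illustrative:

```typescript
import { createHash } from 'node:crypto';

// A persisted query is identified by the SHA-256 hex digest of its text.
// The extensions shape below follows Apollo's APQ protocol.
function persistedQueryRequest(query: string) {
  const sha256Hash = createHash('sha256').update(query).digest('hex');
  return {
    extensions: {
      persistedQuery: { version: 1, sha256Hash },
    },
  };
}

const req = persistedQueryRequest('query { api(slug: "stripe") { name } }');
console.log(req.extensions.persistedQuery.sha256Hash.length); // 64 hex characters
```

Because the hash is deterministic, the server can treat it as a stable cache key, and the first request (which still includes the full query text) registers the mapping.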
gRPC — Maximum Performance for Internal APIs
gRPC uses Protocol Buffers (protobufs) for serialization — a binary format that's 3–10x smaller than JSON and 5–10x faster to serialize/deserialize. Google designed gRPC for internal service communication where performance at scale is non-negotiable.
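Part of that size difference comes from varint encoding: Protobuf packs unsigned integers into only as many bytes as carry significant bits, seven bits per byte. A standalone sketch of the encoding (illustrative, not the real protobuf library):

```typescript
// Varint encoding: low 7 bits per byte, high bit set while more bytes follow.
function encodeVarint(n: number): number[] {
  const bytes: number[] = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n > 0) byte |= 0x80; // continuation bit: more bytes follow
    bytes.push(byte);
  } while (n > 0);
  return bytes;
}

console.log(encodeVarint(1));   // [1] — a single byte
console.log(encodeVarint(300)); // [172, 2] — two bytes, vs. three ASCII digits in JSON
```

Combined with numeric field tags instead of string keys, this is why a Protobuf message carries a fraction of the bytes of its JSON equivalent.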
How gRPC Works
Define the API contract in a .proto file:
// api.proto
syntax = "proto3";
package apiscout;
service APIService {
rpc GetAPI (GetAPIRequest) returns (API);
rpc ListAPIs (ListAPIsRequest) returns (ListAPIsResponse);
rpc StreamAPIUpdates (StreamRequest) returns (stream APIUpdate);
rpc BatchGetAPIs (stream BatchRequest) returns (stream API);
}
message GetAPIRequest {
string slug = 1;
}
message API {
string id = 1;
string name = 2;
string slug = 3;
double uptime = 4;
repeated string tags = 5;
Pricing pricing = 6;
}
message Pricing {
string tier = 1;
string price = 2;
}
Generate code in any language:
# Generate TypeScript server + client stubs with the ts-proto plugin
protoc \
--plugin=./node_modules/.bin/protoc-gen-ts_proto \
--ts_proto_out=./generated \
--ts_proto_opt=outputServices=grpc-js \
api.proto
// gRPC server implementation (Node.js)
import * as grpc from '@grpc/grpc-js';
import { APIServiceService } from './generated/api'; // ts-proto output
const server = new grpc.Server();
server.addService(APIServiceService, {
async getAPI(call, callback) {
const api = await db.apis.findOne({ slug: call.request.slug });
if (!api) {
return callback({
code: grpc.status.NOT_FOUND,
message: `API "${call.request.slug}" not found`,
});
}
callback(null, api);
},
// Server-side streaming — push updates as they happen
streamAPIUpdates(call) {
const slug = call.request.slug;
const unsubscribe = eventBus.on(`api.${slug}.updated`, (update) => {
call.write(update);
});
call.on('cancelled', unsubscribe);
},
});
server.bindAsync('0.0.0.0:50051', grpc.ServerCredentials.createInsecure(), (err, port) => {
if (err) throw err;
console.log(`gRPC server listening on ${port}`); // server.start() is no longer needed in recent @grpc/grpc-js
});
// gRPC client (another microservice)
import * as grpc from '@grpc/grpc-js';
import { APIServiceClient } from './generated/api'; // ts-proto output
const client = new APIServiceClient(
'api-service:50051',
grpc.credentials.createInsecure()
);
// Unary call — like a REST GET
client.getAPI({ slug: 'stripe' }, (err, response) => {
console.log(response.name, response.uptime);
});
// Bidirectional streaming — real-time, both sides send
const stream = client.batchGetAPIs();
stream.on('data', (api) => console.log('Received:', api.name));
['stripe', 'twilio', 'sendgrid'].forEach(slug => {
stream.write({ slug });
});
stream.end();
gRPC in the Browser: gRPC-Web and Connect RPC
Raw gRPC requires HTTP/2 trailers, which browsers don't support. For browser clients, use gRPC-Web or Connect RPC:
// Connect RPC — gRPC-compatible, works in browsers without a proxy
import { createClient } from "@connectrpc/connect";
import { createConnectTransport } from "@connectrpc/connect-web";
import { APIService } from "./generated/api_connect";
const transport = createConnectTransport({
baseUrl: "https://api.apiscout.com",
});
const client = createClient(APIService, transport);
const api = await client.getAPI({ slug: "stripe" });
Performance Benchmarks
Real-world performance comparison for a typical API response (API details + 5 reviews + pricing):
| Metric | REST (JSON) | GraphQL | gRPC (Protobuf) |
|---|---|---|---|
| Payload size | 4,200 bytes | 1,800 bytes | 380 bytes |
| Serialization | 1.0x baseline | 1.1x | 0.12x |
| Deserialization | 1.0x baseline | 1.1x | 0.09x |
| Latency (LAN) | ~2ms | ~3ms | ~0.4ms |
| Latency (WAN) | ~45ms | ~42ms | ~38ms |
| Requests/sec (single core) | 18,000 | 14,000 | 95,000 |
Key takeaways from the benchmarks:
- Payload size: gRPC is 11x smaller than REST due to binary Protobuf encoding
- LAN latency: gRPC is 5x faster than REST — meaningful for microservices making hundreds of inter-service calls per user request
- WAN latency: Differences shrink significantly over the network — REST vs GraphQL vs gRPC are within 15% on WAN
- Serialization speed: Protobuf serialization is 8–10x faster than JSON, reducing CPU cost at scale
Full Comparison Table
| Factor | REST | GraphQL | gRPC |
|---|---|---|---|
| Performance | Good | Good | Excellent |
| Payload efficiency | Medium | High | Excellent |
| Browser support | Native | Native | Via proxy/gRPC-Web |
| Learning curve | Low | Medium | High |
| HTTP caching | Excellent | Difficult | Manual |
| Real-time | Polling / SSE | Subscriptions | Bi-directional streaming |
| Type safety | OpenAPI (optional) | Built-in schema | Strict .proto contracts |
| Code generation | Optional | Optional | Required |
| Debugging | Easy (curl, browser) | GraphiQL | Harder (binary) |
| Public API support | ✅ Ideal | ✅ Good | ❌ Not recommended |
| Microservices | ✅ Good | ⚠️ Overkill | ✅ Ideal |
| Mobile bandwidth | Medium | High efficiency | Excellent efficiency |
| Ecosystem maturity | Excellent | Good | Growing |
When to Use Each — Real Use Cases
Use REST when:
- Public API: Third-party developers will call your API from any language, tool, or platform — REST has zero friction
- CRUD operations: Standard create-read-update-delete mapping to HTTP verbs is REST's home turf
- CDN-cached content: Static or infrequently updated resources that benefit from edge caching
- Simple integrations: Webhooks, simple data retrieval, developer-first products where curl should work
- Default choice: When no specific requirement pushes toward GraphQL or gRPC, REST is the right default
Use GraphQL when:
- Multiple client types: Mobile app needs 5 fields, dashboard needs 40 — GraphQL lets each client define its own query
- Highly relational data: Deep nested objects that would require 3+ REST round trips
- Rapid frontend iteration: Frontend teams can add new data requirements without backend changes
- API aggregation: One GraphQL endpoint aggregating multiple backend services
- Bandwidth-sensitive clients: Mobile clients on 3G benefit from 40–60% smaller payloads
Use gRPC when:
- Internal microservices: Service-to-service communication where both sides are controlled code
- High-throughput systems: 50,000+ RPS where 5–10x performance improvement over REST is meaningful
- Bi-directional streaming: Real-time data pipelines, live telemetry, IoT event streams
- Multi-language microservices: Proto files generate type-safe clients in Go, Python, Java, Rust simultaneously
- Low-latency requirements: Trading systems, game servers, real-time analytics pipelines
Migration Considerations
REST → GraphQL
The most common migration pattern: keep REST for public/external APIs, add a GraphQL layer for your internal frontend.
// GraphQL resolvers calling existing REST API internally
const resolvers = {
Query: {
api: async (_, { slug }) => {
// Your existing REST logic — no need to rewrite
return apiService.getBySlug(slug);
}
},
API: {
// N+1 problem: use DataLoader to batch DB calls
reviews: async (api, _, { loaders }) => {
return loaders.reviewsByApiId.load(api.id);
}
}
};
Watch out for: The N+1 query problem. GraphQL's resolver-per-field model can trigger hundreds of DB queries for a single request. Use DataLoader to batch and cache.
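For intuition, here is a minimal sketch of the batching trick DataLoader performs: load() calls made in the same tick are collected and resolved with a single batch call. TinyLoader is an illustrative stand-in, not the real library:

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

class TinyLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  // batchFn must return one value per key, in the same order
  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      if (this.queue.length === 1) {
        // Flush after the current tick, once all sibling loads have queued
        queueMicrotask(() => this.flush());
      }
    });
  }

  private async flush() {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map((b) => b.key));
    batch.forEach((b, i) => b.resolve(values[i]));
  }
}
```

With a loader like this behind loaders.reviewsByApiId, a hundred sibling review resolvers collapse into one database query per request tick instead of one hundred.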
REST → gRPC
Migrating a REST service to gRPC typically starts with defining the proto contract, then running both protocols in parallel:
// Run REST and gRPC side by side during migration
// REST server stays live for existing clients
const expressApp = express();
expressApp.listen(3000);
// gRPC server for new internal clients
const grpcServer = new grpc.Server();
grpcServer.addService(APIServiceService, apiHandlers);
grpcServer.bindAsync('0.0.0.0:50051', credentials, () => grpcServer.start());
// Shared business logic — both protocols call the same service layer
async function getAPI(slug: string): Promise<API> {
return db.apis.findOne({ slug });
}
Watch out for: Schema evolution. Proto fields are numbered — add new fields with new numbers, never reuse field numbers. REST APIs can add JSON fields freely; gRPC requires careful versioning of .proto files.
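The rule above can be made concrete with proto3's reserved keyword. A sketch of a later revision of the API message from earlier; the removed field number 7 and the new status_page_url field are hypothetical:

```proto
// api.proto, v2: evolving the API message safely
message API {
  reserved 7;                  // number of a removed field, never reused
  reserved "deprecated_url";   // the removed field's name can be reserved too
  string id = 1;
  string name = 2;
  string slug = 3;
  double uptime = 4;
  repeated string tags = 5;
  Pricing pricing = 6;
  string status_page_url = 8;  // new field gets a fresh number; old clients ignore it
}
```

Reserving retired numbers and names makes protoc reject any future change that would silently break old binaries still decoding the wire format.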
GraphQL → gRPC (BFF Pattern)
Teams often move from GraphQL to a Backend-For-Frontend (BFF) pattern using gRPC internally:
Browser → GraphQL BFF → gRPC microservices
↓
(aggregates multiple
gRPC calls into
one GraphQL response)
The Pragmatic Hybrid Architecture
Most production systems use all three:
┌─────────────────────────────────────────────┐
│ Public / External │
│ REST API (OpenAPI spec, CDN cached) │
│ → Third-party developers, webhooks, │
│ simple integrations │
└──────────────────┬──────────────────────────┘
│
┌──────────────────▼──────────────────────────┐
│ Frontend / BFF Layer │
│ GraphQL API (Apollo, Yoga) │
│ → Web app, mobile app, flexible queries │
└──────────────────┬──────────────────────────┘
│
┌──────────────────▼──────────────────────────┐
│ Internal Microservices │
│ gRPC (Protocol Buffers) │
│ → api-service, user-service, billing-service│
│ review-service, notification-service │
└─────────────────────────────────────────────┘
This pattern separates concerns cleanly:
- External developers get stable, cacheable REST endpoints
- Frontend teams get flexible GraphQL for complex UIs
- Infrastructure gets maximum-performance gRPC between services
Schema-First Design Across All Three Protocols
The most important architectural decision that transcends protocol choice is schema-first design: define the contract before writing implementation code. All three protocols have distinct mechanisms for this, but the underlying principle is universal.
OpenAPI (for REST) is a YAML or JSON document describing every endpoint, parameter, request body, and response type. Code generation tools like openapi-generator and zod-openapi produce server stubs, client SDKs, and validation logic from the spec — all consistent with the single source of truth. When the spec is authored first and code is generated from it, API changes flow from spec to implementation. Linting tools (Spectral, vacuum) can fail CI when implementation diverges from the spec, catching contract drift before it reaches production.
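As a minimal sketch, an OpenAPI 3.1 description of the GET /apis/{slug} endpoint shown earlier might look like this; the field list is trimmed and illustrative:

```yaml
# openapi.yaml: the contract authored before any implementation code
openapi: 3.1.0
info:
  title: APIScout API
  version: 1.0.0
paths:
  /apis/{slug}:
    get:
      operationId: getApi
      parameters:
        - name: slug
          in: path
          required: true
          schema: { type: string }
      responses:
        '200':
          description: A single API
          content:
            application/json:
              schema: { $ref: '#/components/schemas/API' }
        '404':
          description: API not found
components:
  schemas:
    API:
      type: object
      required: [id, name, slug]
      properties:
        id: { type: string }
        name: { type: string }
        slug: { type: string }
        uptime: { type: number }
```

Server stubs, client SDKs, and request validators are then generated from this file, so the spec stays the single source of truth.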
GraphQL's type system is intrinsically schema-first: you define the SDL (Schema Definition Language) and the type system becomes the contract. Clients introspect the schema at runtime with an introspection query ({ __schema { types { name } } } sent to the /graphql endpoint), enabling automatic documentation, code generation via graphql-codegen, and contract testing without a separate spec file.
gRPC's .proto files are the most strictly schema-first approach. Code generation isn't optional — you cannot use gRPC without running protoc to generate client and server stubs. This enforced contract-first workflow eliminates the class of drift that occurs in REST systems where implementation diverges from the OpenAPI spec over time. Proto field numbers create an additional constraint: once a field is assigned a number, it must retain that number for backward compatibility across all future schema versions.
The recommendation is identical for all three: define the contract first, generate code from it, and enforce contract consistency in CI. The tooling differs, but the discipline doesn't.
Protocol Selection for AI and LLM-Backed Endpoints
AI applications present specific protocol requirements that conventional web APIs don't encounter. Streaming token output requires a different model than request-response. REST with Server-Sent Events (SSE) is the emerging standard for LLM streaming — OpenAI, Anthropic, and Google all use REST + SSE for chat completion APIs. The pattern: Accept: text/event-stream header, server pushes data: events per token, client processes the stream incrementally. GraphQL subscriptions can handle this too, but the single-endpoint POST model conflicts with CDN caching of non-streaming calls.
For internal inference infrastructure, gRPC's bidirectional streaming is architecturally appropriate — large prompt payloads and streamed completions via streaming RPCs at lower latency than HTTP/1.1. Most LLM inference frameworks (vLLM, TGI, NVIDIA Triton) expose gRPC endpoints for production deployments precisely because of this throughput advantage.
For public AI APIs, REST + SSE wins: browser-compatible without a proxy, CDN-friendly for non-streaming endpoints, and consistent with what third-party developers already know how to consume. For edge-deployed AI features — prompt caching, RAG retrieval, embedding lookups — Cloudflare Workers' fetch event model integrates cleanly with REST-based LLM APIs while maintaining near-zero latency for cache hits. The internal inference backend can use gRPC while the public-facing API surface remains REST.
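On the client side, consuming an SSE token stream reduces to splitting the response into data: events. A minimal parser sketch; the JSON payload shape and the [DONE] sentinel follow common LLM API conventions but vary by provider:

```typescript
// Parse Server-Sent Events text into data payloads. Events are separated
// by a blank line; lines of interest start with "data:".
function parseSSE(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data:')) continue; // skip comments, event:, id: lines
    const data = line.slice(5).trim();
    if (data === '[DONE]') break; // end-of-stream sentinel used by several LLM APIs
    tokens.push(data);
  }
  return tokens;
}

const stream = 'data: {"token":"Hel"}\n\ndata: {"token":"lo"}\n\ndata: [DONE]\n\n';
console.log(parseSSE(stream)); // ['{"token":"Hel"}', '{"token":"lo"}']
```

A production client must also buffer events split across network chunks and handle multi-line data fields; dedicated SSE libraries take care of both.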
Related: GraphQL vs REST — When Each Makes Sense · gRPC vs Connect RPC vs tRPC · REST vs GraphQL vs gRPC vs tRPC: The Full Four-Way Comparison · Microservices API Communication Patterns