REST vs GraphQL vs gRPC: API Protocol Guide 2026
TL;DR
Three API protocols, three distinct trade-offs. REST is the universal default — works everywhere, caches cleanly over standard HTTP, and every developer already knows it. GraphQL solves over/under-fetching for complex client-driven queries but adds caching complexity. gRPC delivers maximum performance for internal microservice communication but can't run in a browser without a proxy. Start with REST, layer in GraphQL for specific frontend needs, reserve gRPC for internal service-to-service calls where latency matters.
Key Takeaways
- REST handles 80% of API use cases with the least overhead — it's the right default
- GraphQL cuts payload sizes by 40–60% for clients with complex nested data requirements
- gRPC is 5–10x faster than REST for internal service communication due to binary Protocol Buffer serialization
- Mixing protocols is normal: public REST API + internal gRPC microservices is a common production pattern
- Browser support rules out raw gRPC for public APIs unless you add gRPC-Web or Connect RPC
- OpenAPI (for REST), SDL (for GraphQL), and proto files (for gRPC) serve as contracts — prioritize schema-first design in all three
Why Protocol Choice Matters More Than Ever
In 2026, most teams are running a mix: a public REST API for third-party developers, a GraphQL layer for their own frontend, and gRPC between internal microservices. Understanding each protocol's trade-offs lets you make deliberate choices rather than defaulting to the first option you learned.
The choice isn't just performance — it's developer experience, client compatibility, caching behavior, and operational tooling. A protocol decision at the start of a project is hard to undo at scale.
REST — The Universal Standard
REST (Representational State Transfer) maps HTTP methods to resource operations. It's been the dominant API style since Roy Fielding's 2000 dissertation, and that dominance isn't accidental — REST aligns with how HTTP works, enabling caching, CDN distribution, and standard tooling out of the box.
How REST Works
# Resource-oriented endpoints, standard HTTP verbs
GET /apis/stripe # Fetch one API
GET /apis?category=payments # List with filter
POST /apis # Create
PUT /apis/stripe # Replace
PATCH /apis/stripe # Partial update
DELETE /apis/stripe # Delete
// Express REST API example
import express from 'express';
const app = express();
// GET /apis/:slug — fetch single API with caching headers
app.get('/apis/:slug', async (req, res) => {
const api = await db.apis.findOne({ slug: req.params.slug });
if (!api) return res.status(404).json({ error: 'API not found' });
// HTTP caching — REST's killer feature
res.set('Cache-Control', 'public, max-age=300'); // 5-min CDN cache
res.set('ETag', `"${api.updatedAt.getTime()}"`);
res.json(api);
});
// POST /apis — create with validation
app.post('/apis', authenticate, async (req, res) => {
const { name, url, category } = req.body;
if (!name || !url) {
return res.status(400).json({
error: 'Missing required fields',
required: ['name', 'url']
});
}
const api = await db.apis.create({ name, url, category });
res.status(201).json(api);
});
REST Caching — The Underrated Advantage
REST's biggest practical advantage is HTTP caching. Every GET request can be cached by browsers, CDNs, and reverse proxies automatically. A Cloudflare CDN can serve a popular API's data from the edge in milliseconds with zero load on your origin server.
# HTTP cache headers in REST — free performance
Cache-Control: public, max-age=3600 # CDN + browser cache 1 hour
ETag: "abc123" # Conditional requests
Last-Modified: Tue, 18 Mar 2026 10:00:00 GMT
# Client conditional request (only fetch if changed)
GET /apis/stripe
If-None-Match: "abc123"
# Returns 304 Not Modified if unchanged — zero data transfer
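The 304 flow above can be sketched as a pure function. This is a minimal illustration of how a server compares If-None-Match against the current ETag; frameworks like Express normally do this comparison for you:

```typescript
// Sketch: deciding between 304 Not Modified and a full 200 response.
interface ConditionalResult {
  status: 200 | 304;
  body?: string;
}

function handleConditionalGet(
  ifNoneMatch: string | undefined,
  currentEtag: string,
  body: string
): ConditionalResult {
  // ETags are compared as opaque quoted strings
  if (ifNoneMatch === currentEtag) {
    return { status: 304 }; // unchanged: no body, zero data transfer
  }
  return { status: 200, body };
}

// First request carries no validator, so the full body is sent
console.log(handleConditionalGet(undefined, '"abc123"', '{...}').status); // 200
// Revalidation with a matching ETag returns an empty 304
console.log(handleConditionalGet('"abc123"', '"abc123"', '{...}').status); // 304
```

The client pays one cheap round trip to learn nothing changed, and the CDN can perform the same check without touching the origin.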
REST Weaknesses in Practice
Over-fetching is the most common complaint. A mobile app showing a list of API names and prices gets the full API object — including documentation, changelog history, and tag arrays — even though it only needs 3 fields:
// Client needs: name + price + uptime
// REST returns the full object: 40+ fields
{
"id": "api_123",
"name": "Stripe",
"slug": "stripe",
"category": "payments",
"price": "$0.029/transaction",
"uptime": 99.99,
"documentation_url": "...",
"changelog": [...],
"tags": [...],
"ratings": {...},
// 30+ more fields the client doesn't need
}
Under-fetching requires multiple round trips. Fetching an API with its latest reviews requires separate requests:
GET /apis/stripe # 1st request: get API
GET /apis/stripe/reviews # 2nd request: get reviews
GET /users/456 # 3rd request: get review author
GraphQL — The Client-Driven Query Language
GraphQL inverts the data-fetching model. Instead of the server defining what each endpoint returns, clients specify exactly what they need in a query. Facebook built GraphQL to solve over/under-fetching across their many mobile clients with different data requirements.
How GraphQL Works
# Client requests exactly what it needs
query GetApiWithReviews {
api(slug: "stripe") {
name
pricing {
tier
price
}
uptime
reviews(first: 5) {
rating
comment
author {
name
}
}
}
}
// GraphQL server with Apollo Server
import { ApolloServer } from '@apollo/server';
import gql from 'graphql-tag'; // gql is no longer exported from @apollo/server in v4
const typeDefs = gql`
type Query {
api(slug: String!): API
apis(category: String, limit: Int): [API!]!
}
type API {
id: ID!
name: String!
slug: String!
uptime: Float
pricing: Pricing
reviews(first: Int): [Review!]!
}
type Pricing {
tier: String!
price: String!
freeTrialDays: Int
}
type Review {
rating: Int!
comment: String
author: User!
}
`;
const resolvers = {
Query: {
api: (_, { slug }) => db.apis.findOne({ slug }),
apis: (_, { category, limit = 20 }) =>
db.apis.findAll({ where: { category }, limit }),
},
API: {
reviews: (api, { first = 10 }) =>
db.reviews.findAll({ where: { apiId: api.id }, limit: first }),
},
};
GraphQL Subscriptions for Real-Time
GraphQL's subscription type adds real-time capabilities over WebSockets:
subscription OnApiStatusChange {
apiStatusChanged(slug: "stripe") {
status
uptime
latency
updatedAt
}
}
// Client subscription with Apollo Client
import { useSubscription } from '@apollo/client';
function ApiStatus({ slug }) {
const { data } = useSubscription(ON_API_STATUS_CHANGE, {
variables: { slug }
});
return <StatusBadge uptime={data?.apiStatusChanged?.uptime} />;
}
GraphQL Caching — The Real Challenge
HTTP caching is GraphQL's main operational headache. All queries go to a single POST /graphql endpoint — CDNs don't cache POST requests by default.
// Persisted queries work around GraphQL caching limitations
// Client sends a hash of the query instead of the full query string
// Server maps hash → query → cached result
// Apollo Client: automatic persisted queries
import { createPersistedQueryLink } from '@apollo/client/link/persisted-queries';
import { sha256 } from 'crypto-hash';
const client = new ApolloClient({
link: createPersistedQueryLink({ sha256 }).concat(httpLink),
});
// Server-side response caching driven by cache-control hints
import { ApolloServerPluginCacheControl } from '@apollo/server/plugin/cacheControl';
import { InMemoryLRUCache } from '@apollo/utils.keyvaluecache';
const server = new ApolloServer({
cache: new InMemoryLRUCache(),
plugins: [
ApolloServerPluginCacheControl({ defaultMaxAge: 300 }),
],
});
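Under the hood, the persisted-query hash is simply a SHA-256 digest of the query string. A minimal sketch of the request shape Apollo's APQ extension sends; the query string here is illustrative:

```typescript
import { createHash } from 'node:crypto';

// A persisted query is identified by the SHA-256 hex digest of its text.
// The extensions shape below follows Apollo's APQ protocol.
function persistedQueryRequest(query: string) {
  const sha256Hash = createHash('sha256').update(query).digest('hex');
  return {
    extensions: {
      persistedQuery: { version: 1, sha256Hash },
    },
  };
}

const req = persistedQueryRequest('query { api(slug: "stripe") { name } }');
console.log(req.extensions.persistedQuery.sha256Hash.length); // 64 hex characters
```

Because the hash is deterministic, the server can treat it as a stable cache key, and the first request (which still includes the full query text) registers the mapping.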
gRPC — Maximum Performance for Internal APIs
gRPC uses Protocol Buffers (protobufs) for serialization — a binary format that's 3–10x smaller than JSON and 5–10x faster to serialize/deserialize. Google designed gRPC for internal service communication where performance at scale is non-negotiable.
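Part of that size difference comes from varint encoding: Protobuf packs unsigned integers into only as many bytes as carry significant bits, seven bits per byte. A standalone sketch of the encoding (illustrative, not the real protobuf library):

```typescript
// Varint encoding: low 7 bits per byte, high bit set while more bytes follow.
function encodeVarint(n: number): number[] {
  const bytes: number[] = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n > 0) byte |= 0x80; // continuation bit: more bytes follow
    bytes.push(byte);
  } while (n > 0);
  return bytes;
}

console.log(encodeVarint(1));   // [1] — a single byte
console.log(encodeVarint(300)); // [172, 2] — two bytes, vs. three ASCII digits in JSON
```

Combined with numeric field tags instead of string keys, this is why a Protobuf message carries a fraction of the bytes of its JSON equivalent.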
How gRPC Works
Define the API contract in a .proto file:
// api.proto
syntax = "proto3";
package apiscout;
service APIService {
rpc GetAPI (GetAPIRequest) returns (API);
rpc ListAPIs (ListAPIsRequest) returns (ListAPIsResponse);
rpc StreamAPIUpdates (StreamRequest) returns (stream APIUpdate);
rpc BatchGetAPIs (stream BatchRequest) returns (stream API);
}
message GetAPIRequest {
string slug = 1;
}
message API {
string id = 1;
string name = 2;
string slug = 3;
double uptime = 4;
repeated string tags = 5;
Pricing pricing = 6;
}
message Pricing {
string tier = 1;
string price = 2;
}
Generate code in any language:
# Generate TypeScript server + client stubs with the ts-proto plugin
protoc \
--plugin=./node_modules/.bin/protoc-gen-ts_proto \
--ts_proto_out=./generated \
--ts_proto_opt=outputServices=grpc-js \
api.proto
// gRPC server implementation (Node.js)
import * as grpc from '@grpc/grpc-js';
import { APIServiceService } from './generated/api'; // ts-proto output
const server = new grpc.Server();
server.addService(APIServiceService, {
async getAPI(call, callback) {
const api = await db.apis.findOne({ slug: call.request.slug });
if (!api) {
return callback({
code: grpc.status.NOT_FOUND,
message: `API "${call.request.slug}" not found`,
});
}
callback(null, api);
},
// Server-side streaming — push updates as they happen
streamAPIUpdates(call) {
const slug = call.request.slug;
const unsubscribe = eventBus.on(`api.${slug}.updated`, (update) => {
call.write(update);
});
call.on('cancelled', unsubscribe);
},
});
server.bindAsync('0.0.0.0:50051', grpc.ServerCredentials.createInsecure(), (err, port) => {
if (err) throw err;
console.log(`gRPC server listening on ${port}`); // server.start() is no longer needed in recent @grpc/grpc-js
});
// gRPC client (another microservice)
import * as grpc from '@grpc/grpc-js';
import { APIServiceClient } from './generated/api'; // ts-proto output
const client = new APIServiceClient(
'api-service:50051',
grpc.credentials.createInsecure()
);
// Unary call — like a REST GET
client.getAPI({ slug: 'stripe' }, (err, response) => {
console.log(response.name, response.uptime);
});
// Bidirectional streaming — real-time, both sides send
const stream = client.batchGetAPIs();
stream.on('data', (api) => console.log('Received:', api.name));
['stripe', 'twilio', 'sendgrid'].forEach(slug => {
stream.write({ slug });
});
stream.end();
gRPC in the Browser: gRPC-Web and Connect RPC
Raw gRPC requires HTTP/2 trailers, which browsers don't support. For browser clients, use gRPC-Web or Connect RPC:
// Connect RPC — gRPC-compatible, works in browsers without a proxy
import { createClient } from "@connectrpc/connect";
import { createConnectTransport } from "@connectrpc/connect-web";
import { APIService } from "./generated/api_connect";
const transport = createConnectTransport({
baseUrl: "https://api.apiscout.com",
});
const client = createClient(APIService, transport);
const api = await client.getAPI({ slug: "stripe" });
Performance Benchmarks
Real-world performance comparison for a typical API response (API details + 5 reviews + pricing):
| Metric | REST (JSON) | GraphQL | gRPC (Protobuf) |
|---|---|---|---|
| Payload size | 4,200 bytes | 1,800 bytes | 380 bytes |
| Serialization | 1.0x baseline | 1.1x | 0.12x |
| Deserialization | 1.0x baseline | 1.1x | 0.09x |
| Latency (LAN) | ~2ms | ~3ms | ~0.4ms |
| Latency (WAN) | ~45ms | ~42ms | ~38ms |
| Requests/sec (single core) | 18,000 | 14,000 | 95,000 |
Key takeaways from the benchmarks:
- Payload size: gRPC is 11x smaller than REST due to binary Protobuf encoding
- LAN latency: gRPC is 5x faster than REST — meaningful for microservices making hundreds of inter-service calls per user request
- WAN latency: Differences shrink significantly over the network — REST vs GraphQL vs gRPC are within 15% on WAN
- Serialization speed: Protobuf serialization is 8–10x faster than JSON, reducing CPU cost at scale
Full Comparison Table
| Factor | REST | GraphQL | gRPC |
|---|---|---|---|
| Performance | Good | Good | Excellent |
| Payload efficiency | Medium | High | Excellent |
| Browser support | Native | Native | Via proxy/gRPC-Web |
| Learning curve | Low | Medium | High |
| HTTP caching | Excellent | Difficult | Manual |
| Real-time | Polling / SSE | Subscriptions | Bi-directional streaming |
| Type safety | OpenAPI (optional) | Built-in schema | Strict .proto contracts |
| Code generation | Optional | Optional | Required |
| Debugging | Easy (curl, browser) | GraphiQL | Harder (binary) |
| Public API support | ✅ Ideal | ✅ Good | ❌ Not recommended |
| Microservices | ✅ Good | ⚠️ Overkill | ✅ Ideal |
| Mobile bandwidth | Medium | High efficiency | Excellent efficiency |
| Ecosystem maturity | Excellent | Good | Growing |
When to Use Each — Real Use Cases
Use REST when:
- Public API: Third-party developers will call your API from any language, tool, or platform — REST has zero friction
- CRUD operations: Standard create-read-update-delete mapping to HTTP verbs is REST's home turf
- CDN-cached content: Static or infrequently updated resources that benefit from edge caching
- Simple integrations: Webhooks, simple data retrieval, developer-first products where curl should work
- Default choice: When no specific requirement pushes toward GraphQL or gRPC, REST is the right default
Use GraphQL when:
- Multiple client types: Mobile app needs 5 fields, dashboard needs 40 — GraphQL lets each client define its own query
- Highly relational data: Deep nested objects that would require 3+ REST round trips
- Rapid frontend iteration: Frontend teams can add new data requirements without backend changes
- API aggregation: One GraphQL endpoint aggregating multiple backend services
- Bandwidth-sensitive clients: Mobile clients on 3G benefit from 40–60% smaller payloads
Use gRPC when:
- Internal microservices: Service-to-service communication where both sides are controlled code
- High-throughput systems: 50,000+ RPS where 5–10x performance improvement over REST is meaningful
- Bi-directional streaming: Real-time data pipelines, live telemetry, IoT event streams
- Multi-language microservices: Proto files generate type-safe clients in Go, Python, Java, Rust simultaneously
- Low-latency requirements: Trading systems, game servers, real-time analytics pipelines
Migration Considerations
REST → GraphQL
The most common migration pattern: keep REST for public/external APIs, add a GraphQL layer for your internal frontend.
// GraphQL resolvers calling existing REST API internally
const resolvers = {
Query: {
api: async (_, { slug }) => {
// Your existing REST logic — no need to rewrite
return apiService.getBySlug(slug);
}
},
API: {
// N+1 problem: use DataLoader to batch DB calls
reviews: async (api, _, { loaders }) => {
return loaders.reviewsByApiId.load(api.id);
}
}
};
Watch out for: The N+1 query problem. GraphQL's resolver-per-field model can trigger hundreds of DB queries for a single request. Use DataLoader to batch and cache.
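For intuition, here is a minimal sketch of the batching trick DataLoader performs: load() calls made in the same tick are collected and resolved with a single batch call. TinyLoader is an illustrative stand-in, not the real library:

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

class TinyLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  // batchFn must return one value per key, in the same order
  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      if (this.queue.length === 1) {
        // Flush after the current tick, once all sibling loads have queued
        queueMicrotask(() => this.flush());
      }
    });
  }

  private async flush() {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map((b) => b.key));
    batch.forEach((b, i) => b.resolve(values[i]));
  }
}
```

With a loader like this behind loaders.reviewsByApiId, a hundred sibling review resolvers collapse into one database query per request tick instead of one hundred.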
REST → gRPC
Migrating a REST service to gRPC typically starts with defining the proto contract, then running both protocols in parallel:
// Run REST and gRPC side by side during migration
// REST server stays live for existing clients
const expressApp = express();
expressApp.listen(3000);
// gRPC server for new internal clients
const grpcServer = new grpc.Server();
grpcServer.addService(APIServiceService, apiHandlers);
grpcServer.bindAsync('0.0.0.0:50051', credentials, () => grpcServer.start());
// Shared business logic — both protocols call the same service layer
async function getAPI(slug: string): Promise<API> {
return db.apis.findOne({ slug });
}
Watch out for: Schema evolution. Proto fields are numbered — add new fields with new numbers, never reuse field numbers. REST APIs can add JSON fields freely; gRPC requires careful versioning of .proto files.
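The rule above can be made concrete with proto3's reserved keyword. A sketch of a later revision of the API message from earlier; the removed field number 7 and the new status_page_url field are hypothetical:

```proto
// api.proto, v2: evolving the API message safely
message API {
  reserved 7;                  // number of a removed field, never reused
  reserved "deprecated_url";   // the removed field's name can be reserved too
  string id = 1;
  string name = 2;
  string slug = 3;
  double uptime = 4;
  repeated string tags = 5;
  Pricing pricing = 6;
  string status_page_url = 8;  // new field gets a fresh number; old clients ignore it
}
```

Reserving retired numbers and names makes protoc reject any future change that would silently break old binaries still decoding the wire format.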
GraphQL → gRPC (BFF Pattern)
Teams often move from GraphQL to a Backend-For-Frontend (BFF) pattern using gRPC internally:
Browser → GraphQL BFF → gRPC microservices
↓
(aggregates multiple
gRPC calls into
one GraphQL response)
The Pragmatic Hybrid Architecture
Most production systems use all three:
┌─────────────────────────────────────────────┐
│ Public / External │
│ REST API (OpenAPI spec, CDN cached) │
│ → Third-party developers, webhooks, │
│ simple integrations │
└──────────────────┬──────────────────────────┘
│
┌──────────────────▼──────────────────────────┐
│ Frontend / BFF Layer │
│ GraphQL API (Apollo, Yoga) │
│ → Web app, mobile app, flexible queries │
└──────────────────┬──────────────────────────┘
│
┌──────────────────▼──────────────────────────┐
│ Internal Microservices │
│ gRPC (Protocol Buffers) │
│ → api-service, user-service, billing-service│
│ review-service, notification-service │
└─────────────────────────────────────────────┘
This pattern separates concerns cleanly:
- External developers get stable, cacheable REST endpoints
- Frontend teams get flexible GraphQL for complex UIs
- Infrastructure gets maximum-performance gRPC between services
Schema-First Design Across All Three Protocols
The most important architectural decision that transcends protocol choice is schema-first design: define the contract before writing implementation code. All three protocols have distinct mechanisms for this, but the underlying principle is universal.
OpenAPI (for REST) is a YAML or JSON document describing every endpoint, parameter, request body, and response type. Code generation tools like openapi-generator and zod-openapi produce server stubs, client SDKs, and validation logic from the spec — all consistent with the single source of truth. When the spec is authored first and code is generated from it, API changes flow from spec to implementation. Linting tools (Spectral, vacuum) can fail CI when implementation diverges from the spec, catching contract drift before it reaches production.
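As a minimal sketch, an OpenAPI 3.1 description of the GET /apis/{slug} endpoint shown earlier might look like this; the field list is trimmed and illustrative:

```yaml
# openapi.yaml: the contract authored before any implementation code
openapi: 3.1.0
info:
  title: APIScout API
  version: 1.0.0
paths:
  /apis/{slug}:
    get:
      operationId: getApi
      parameters:
        - name: slug
          in: path
          required: true
          schema: { type: string }
      responses:
        '200':
          description: A single API
          content:
            application/json:
              schema: { $ref: '#/components/schemas/API' }
        '404':
          description: API not found
components:
  schemas:
    API:
      type: object
      required: [id, name, slug]
      properties:
        id: { type: string }
        name: { type: string }
        slug: { type: string }
        uptime: { type: number }
```

Server stubs, client SDKs, and request validators are then generated from this file, so the spec stays the single source of truth.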
GraphQL's type system is intrinsically schema-first: you define the SDL (Schema Definition Language) and the type system becomes the contract. Clients introspect the schema at runtime with an introspection query ({ __schema { types { name } } } sent to the /graphql endpoint), enabling automatic documentation, code generation via graphql-codegen, and contract testing without a separate spec file.
gRPC's .proto files are the most strictly schema-first approach. Code generation isn't optional — you cannot use gRPC without running protoc to generate client and server stubs. This enforced contract-first workflow eliminates the class of drift that occurs in REST systems where implementation diverges from the OpenAPI spec over time. Proto field numbers create an additional constraint: once a field is assigned a number, it must retain that number for backward compatibility across all future schema versions.
The recommendation is identical for all three: define the contract first, generate code from it, and enforce contract consistency in CI. The tooling differs, but the discipline doesn't.
Protocol Selection for AI and LLM-Backed Endpoints
AI applications present specific protocol requirements that conventional web APIs don't encounter. Streaming token output requires a different model than request-response. REST with Server-Sent Events (SSE) is the emerging standard for LLM streaming — OpenAI, Anthropic, and Google all use REST + SSE for chat completion APIs. The pattern: Accept: text/event-stream header, server pushes data: events per token, client processes the stream incrementally. GraphQL subscriptions can handle this too, but the single-endpoint POST model conflicts with CDN caching of non-streaming calls.
For internal inference infrastructure, gRPC's bidirectional streaming is architecturally appropriate — large prompt payloads and streamed completions via streaming RPCs at lower latency than HTTP/1.1. Most LLM inference frameworks (vLLM, TGI, NVIDIA Triton) expose gRPC endpoints for production deployments precisely because of this throughput advantage.
For public AI APIs, REST + SSE wins: browser-compatible without a proxy, CDN-friendly for non-streaming endpoints, and consistent with what third-party developers already know how to consume. For edge-deployed AI features — prompt caching, RAG retrieval, embedding lookups — Cloudflare Workers' fetch event model integrates cleanly with REST-based LLM APIs while maintaining near-zero latency for cache hits. The internal inference backend can use gRPC while the public-facing API surface remains REST.
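On the client side, consuming an SSE token stream reduces to splitting the response into data: events. A minimal parser sketch; the JSON payload shape and the [DONE] sentinel follow common LLM API conventions but vary by provider:

```typescript
// Parse Server-Sent Events text into data payloads. Events are separated
// by a blank line; lines of interest start with "data:".
function parseSSE(chunk: string): string[] {
  const tokens: string[] = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data:')) continue; // skip comments, event:, id: lines
    const data = line.slice(5).trim();
    if (data === '[DONE]') break; // end-of-stream sentinel used by several LLM APIs
    tokens.push(data);
  }
  return tokens;
}

const stream = 'data: {"token":"Hel"}\n\ndata: {"token":"lo"}\n\ndata: [DONE]\n\n';
console.log(parseSSE(stream)); // ['{"token":"Hel"}', '{"token":"lo"}']
```

A production client must also buffer events split across network chunks and handle multi-line data fields; dedicated SSE libraries take care of both.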
Related: GraphQL vs REST — When Each Makes Sense · gRPC vs Connect RPC vs tRPC · REST vs GraphQL vs gRPC vs tRPC: The Full Four-Way Comparison · Microservices API Communication Patterns