Microservices API Communication: Sync, Async, and Hybrid Patterns
Microservices need to talk to each other. The choice between synchronous (request-response) and asynchronous (event-driven) communication determines your system's reliability, latency, coupling, and complexity. Most production systems use both — the art is knowing when to use which.
Synchronous vs. Asynchronous
| Dimension | Synchronous | Asynchronous |
|---|---|---|
| Pattern | Request → wait → response | Fire and forget / pub-sub |
| Coupling | Tight (caller needs receiver online) | Loose (queue buffers messages) |
| Latency | Depends on slowest service in chain | Caller returns immediately |
| Complexity | Simpler to implement | More infrastructure, harder to debug |
| Failure mode | Cascading failures | Message backlog, eventual consistency |
| Best for | Queries, real-time responses | Commands, background processing, events |
Synchronous Patterns
1. REST (HTTP/JSON)
The default choice for service-to-service communication:
Order Service → GET /api/users/123 → User Service
← { "id": "123", "name": "..." }
When to use:
- CRUD operations between services
- Simple request-response flows
- External-facing APIs
- Teams that need maximum interoperability
Trade-offs:
- Text-based (JSON) — larger payload than binary
- HTTP overhead per request
- Streaming is not built in (it needs an add-on such as Server-Sent Events, chunked responses, or WebSockets)
- Schema validation not enforced by protocol
2. gRPC (HTTP/2 + Protocol Buffers)
High-performance binary protocol with code generation:
service UserService {
rpc GetUser (GetUserRequest) returns (User);
rpc ListUsers (ListUsersRequest) returns (stream User);
}
When to use:
- High-throughput internal communication (>10K RPS between services)
- Polyglot environments (clients and server stubs are generated from the same .proto file for most mainstream languages)
- Streaming data (server-side, client-side, or bidirectional)
- Latency-sensitive paths
Performance comparison:
| Metric | REST (JSON) | gRPC (Protobuf) |
|---|---|---|
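| Payload size | 1x (baseline) | Roughly 0.3-0.5x (varies with payload shape) |
| Serialization speed | 1x (baseline) | Often 5-10x faster (benchmark dependent) |
| Connection overhead | Keep-alive on HTTP/1.1, but one in-flight request per connection | Many requests multiplexed on a single HTTP/2 connection |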
| Streaming | Not native | Native bidirectional |
| Browser support | Universal | Via grpc-web proxy |
3. GraphQL (Federation)
For API composition across multiple services:
# Gateway federates across services
type Query {
order(id: ID!): Order # → Order Service
}
type Order {
id: ID!
user: User # → User Service (federated)
items: [OrderItem!]! # → Inventory Service (federated)
}
When to use:
- Mobile/frontend backends needing data from multiple services
- Complex data requirements with nested relationships
- When over-fetching is a measurable problem
Asynchronous Patterns
4. Message Queue (Point-to-Point)
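Each message is handled by a single consumer, even when several workers compete for the same queue. Most brokers guarantee at-least-once delivery rather than exactly-once, so handlers should tolerate an occasional redelivery: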
Order Service → [Order Queue] → Payment Service
↓
Process payment
↓
[Payment Queue] → Notification Service
Tools: RabbitMQ, Amazon SQS, Redis Streams, BullMQ
When to use:
- Background job processing
- Work distribution across workers
- Task queues with retry semantics
- When each task should be handled by a single worker (pair with idempotent handlers, since delivery is typically at-least-once)
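To make the work-distribution idea concrete, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. It stands in for a real broker such as RabbitMQ or SQS; `run_workers` and the job names are hypothetical, chosen only for illustration.

```python
import queue
import threading


def run_workers(jobs, handler, worker_count=3):
    """Distribute jobs across workers; each job is taken by exactly one worker.

    Hypothetical helper: an in-memory stand-in for a broker-backed task queue.
    """
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = work.get_nowait()  # competing consumers pull from one queue
            except queue.Empty:
                return  # queue drained, worker exits
            outcome = handler(job)
            with lock:
                results.append(outcome)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


processed = run_workers(
    ["order-1", "order-2", "order-3", "order-4"],
    lambda job: f"charged {job}",
)
```

Completion order depends on thread scheduling, so `processed` should be treated as an unordered collection.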
5. Event Streaming (Pub-Sub)
One producer, many consumers. Events are broadcast to all subscribers:
Order Service publishes "OrderCreated" event
→ Payment Service (subscribes: process payment)
→ Inventory Service (subscribes: reserve stock)
→ Analytics Service (subscribes: track metrics)
→ Email Service (subscribes: send confirmation)
Tools: Apache Kafka, Amazon SNS+SQS, Redis Pub/Sub, NATS
When to use:
- Event-driven architectures
- Multiple services need to react to the same event
- Event sourcing and CQRS
- Real-time data pipelines
- Audit logs and change data capture
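The fan-out behaviour can be sketched with a tiny in-memory event bus. This is an illustration only, not how Kafka or SNS is implemented; `EventBus` and the handler strings are hypothetical names.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-memory pub-sub bus: one publish reaches every subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Each subscriber gets its own copy so one handler cannot mutate
        # the event seen by another.
        return [handler(dict(payload)) for handler in self._subscribers[event_type]]


bus = EventBus()
bus.subscribe("OrderCreated", lambda e: f"payment: charge order {e['order_id']}")
bus.subscribe("OrderCreated", lambda e: f"inventory: reserve stock for order {e['order_id']}")
bus.subscribe("OrderCreated", lambda e: f"email: confirm order {e['order_id']}")

reactions = bus.publish("OrderCreated", {"order_id": "123"})
```

The publisher never references the subscribers directly, which is what keeps the services decoupled: adding an Analytics subscriber requires no change to the Order Service.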
6. Event Sourcing
Store every state change as an immutable event:
Events for Order #123:
1. OrderCreated { items: [...], total: 99.00 }
2. PaymentReceived { amount: 99.00 }
3. OrderShipped { tracking: "1Z999..." }
4. OrderDelivered { timestamp: "2026-03-08T14:00:00Z" }
Current state = replay all events
When to use:
- Audit trail is required (finance, healthcare, legal)
- Need to reconstruct state at any point in time
- Complex business processes with many state transitions
- CQRS (separate read and write models)
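"Current state = replay all events" amounts to a fold over the event log. The sketch below assumes a simplified `OrderState` and event tuples of `(type, data)`; both are hypothetical shapes, and the `items` field is left out because the example above elides it.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class OrderState:
    """Hypothetical projection of an order, rebuilt purely from events."""
    status: str = "none"
    total: float = 0.0
    paid: float = 0.0
    tracking: Optional[str] = None


def apply(state, event):
    """Return the next state; events are never modified."""
    kind, data = event
    if kind == "OrderCreated":
        return OrderState(status="created", total=data["total"])
    if kind == "PaymentReceived":
        return replace(state, status="paid", paid=state.paid + data["amount"])
    if kind == "OrderShipped":
        return replace(state, status="shipped", tracking=data["tracking"])
    if kind == "OrderDelivered":
        return replace(state, status="delivered")
    return state  # unknown event types are ignored in this sketch


def replay(events, upto=None):
    """Fold events into a state; `upto` limits the replay to the first N events."""
    state = OrderState()
    for event in events[:upto]:
        state = apply(state, event)
    return state


EVENTS = [
    ("OrderCreated", {"total": 99.00}),
    ("PaymentReceived", {"amount": 99.00}),
    ("OrderShipped", {"tracking": "1Z999..."}),
    ("OrderDelivered", {"timestamp": "2026-03-08T14:00:00Z"}),
]

current = replay(EVENTS)
after_payment = replay(EVENTS, upto=2)
```

Replaying a prefix of the log (`upto=2`) is what makes point-in-time reconstruction possible. Long-lived aggregates usually add periodic snapshots so that a replay does not start from the first event every time.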
Hybrid Patterns
7. Saga Pattern (Distributed Transactions)
Coordinate multi-service transactions without distributed locks:
Choreography (event-driven):
Order Service → "OrderCreated"
→ Payment Service processes → "PaymentCompleted"
→ Inventory Service reserves → "StockReserved"
→ Shipping Service schedules → "OrderFulfilled"
If any step fails → Compensating events undo previous steps
Orchestration (centralized):
Saga Orchestrator:
1. Tell Payment Service: charge customer
2. If success → Tell Inventory: reserve stock
3. If success → Tell Shipping: schedule delivery
4. If any fail → Tell previous services: compensate
| Approach | Pros | Cons |
|---|---|---|
| Choreography | Decoupled, simple services | Hard to track flow, implicit logic |
| Orchestration | Clear flow, easy to monitor | Orchestrator is a central dependency (a single point of failure unless replicated) |
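The orchestration variant can be sketched as a loop that runs each step and, on the first failure, runs the compensations of the completed steps in reverse order. `run_saga` and the step names are hypothetical; a production orchestrator would also persist its progress so it can resume after a crash.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    Hypothetical orchestrator sketch: on failure, completed steps are
    compensated in reverse order and the saga reports failure.
    """
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
        log.append(f"{name}: done")
        completed.append((name, compensate))
    return True, log


def fail_shipping():
    raise RuntimeError("no carrier available")


ok, log = run_saga([
    ("payment", lambda: None, lambda: None),      # charge / refund
    ("inventory", lambda: None, lambda: None),    # reserve / release
    ("shipping", fail_shipping, lambda: None),    # schedule / cancel
])
```

Here the shipping step fails, so the inventory reservation is released first and the payment is refunded last, mirroring step 4 of the orchestrator outline.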
8. CQRS (Command Query Responsibility Segregation)
Separate read and write paths:
Commands (writes):
POST /orders → Order Service → Event Store → "OrderCreated" event
Queries (reads):
GET /orders → Read Service → Optimized read database (materialized views)
Events sync the read model:
"OrderCreated" → Read Service updates its denormalized view
When to use:
- Read and write patterns are very different
- Need to optimize reads independently (different database, different schema)
- Event sourcing is already in use
- High read-to-write ratio (100:1 or more)
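A minimal sketch of the split, assuming an in-memory event list on the write side and a per-user dictionary as the denormalized read view. `OrderWriteModel` and `OrderReadModel` are hypothetical names; in practice the two sides usually live in separate services and databases, with events delivered through a broker.

```python
class OrderWriteModel:
    """Command side (hypothetical): validates commands and appends events."""

    def __init__(self):
        self.events = []  # append-only event store

    def create_order(self, order_id, user, total):
        event = {"type": "OrderCreated", "order_id": order_id, "user": user, "total": total}
        self.events.append(event)
        return event


class OrderReadModel:
    """Query side (hypothetical): a view shaped for the 'orders by user' query."""

    def __init__(self):
        self.orders_by_user = {}

    def project(self, event):
        # Events keep the read model in sync with the write model.
        if event["type"] == "OrderCreated":
            self.orders_by_user.setdefault(event["user"], []).append(
                {"order_id": event["order_id"], "total": event["total"]}
            )

    def list_orders(self, user):
        return self.orders_by_user.get(user, [])


write_model = OrderWriteModel()
read_model = OrderReadModel()

for event in (
    write_model.create_order("o-1", "alice", 99.0),
    write_model.create_order("o-2", "alice", 15.5),
):
    read_model.project(event)

alice_orders = read_model.list_orders("alice")
```

Because the projection runs after the write, the read model is eventually consistent: a query issued between the command and the projection will not see the new order yet.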
Choosing the Right Pattern
Decision Framework
Is the caller waiting for a response?
├── Yes → Synchronous
│ ├── Internal, high throughput? → gRPC
│ ├── External or simple? → REST
│ └── Complex data needs? → GraphQL
└── No → Asynchronous
├── One consumer? → Message Queue
├── Multiple consumers? → Event Streaming
└── Distributed transaction? → Saga Pattern
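The same tree can be written as a small function, which is handy for design reviews or checklists. The parameter names and the order in which overlapping answers are checked are assumptions of this sketch, not part of the framework itself.

```python
def choose_pattern(
    caller_waits,
    *,
    internal_high_throughput=False,
    complex_data=False,
    multiple_consumers=False,
    distributed_transaction=False,
):
    """Encode the decision tree above (hypothetical helper).

    When several answers are true at once, the first matching branch wins;
    that precedence is an assumption made for this sketch.
    """
    if caller_waits:
        if internal_high_throughput:
            return "gRPC"
        if complex_data:
            return "GraphQL"
        return "REST"
    if distributed_transaction:
        return "Saga Pattern"
    if multiple_consumers:
        return "Event Streaming"
    return "Message Queue"


checkout_read = choose_pattern(True, internal_high_throughput=True)
order_events = choose_pattern(False, multiple_consumers=True)
```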
Common Combinations
| System Type | Pattern Mix |
|---|---|
| E-commerce | REST (client-facing) + gRPC (internal) + Kafka (events) + Saga (orders) |
| SaaS platform | REST (API) + SQS (background jobs) + SNS (notifications) |
| Real-time app | WebSocket (clients) + gRPC (services) + Redis Pub/Sub (events) |
| Data pipeline | Kafka (streaming) + gRPC (processing) + REST (management API) |
Reliability Patterns
Retries with Backoff
Attempt 1: immediate
Attempt 2: wait 1 second
Attempt 3: wait 2 seconds
Attempt 4: wait 4 seconds
Attempt 5: wait 8 seconds → give up, dead letter queue
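The schedule above doubles the wait before each new attempt. A minimal sketch, assuming a one-second base delay and five attempts as in the schedule; `backoff_delays` and `call_with_retries` are hypothetical helpers, and `sleep` is injectable so the logic can be tested without waiting.

```python
import time


def backoff_delays(max_attempts=5, base=1.0):
    """Delay before each attempt: 0, 1, 2, 4, 8 seconds with the defaults."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return [0.0] + [base * 2 ** i for i in range(max_attempts - 1)]


def call_with_retries(operation, max_attempts=5, base=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or the attempts run out.

    Hypothetical helper: re-raises the last error once attempts are exhausted,
    at which point the caller can route the message to a dead letter queue.
    """
    last_error = None
    for delay in backoff_delays(max_attempts, base):
        if delay:
            sleep(delay)
        try:
            return operation()
        except Exception as exc:
            last_error = exc
    raise last_error


schedule = backoff_delays()  # [0.0, 1.0, 2.0, 4.0, 8.0]
```

Production clients commonly add random jitter to each delay so that many callers recovering from the same outage do not retry in lockstep.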
Circuit Breaker
Closed (normal) → 5 failures → Open (reject all)
↓ (30 second timeout)
Half-Open (allow 1 request)
↓ success → Closed
↓ failure → Open
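The state machine above fits in a small class. This sketch uses the thresholds from the diagram (5 failures, 30-second timeout) as defaults; `CircuitBreaker` and `CircuitOpenError` are hypothetical names, the clock is injectable for testing, and the class is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical single-threaded circuit breaker sketch."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = None

    def call(self, operation):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open, request rejected")
            self.state = "half-open"  # timeout elapsed: allow one trial request
        try:
            result = operation()
        except Exception:
            self._on_failure()
            raise
        # Any success closes the circuit and resets the failure count.
        self.state = "closed"
        self.failures = 0
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed trial request reopens immediately; otherwise wait for the threshold.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller wraps each outbound request, for example `breaker.call(lambda: fetch_user("123"))` with a hypothetical `fetch_user`, and treats `CircuitOpenError` as a signal to fall back or fail fast instead of waiting on a service that is known to be down.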
Dead Letter Queues
Messages that fail processing after N retries go to a dead letter queue for manual investigation. Never lose messages — always have a DLQ.
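As a rough illustration, the consumer loop below retries each message up to N times and then parks it in a dead letter list together with the last error. `consume` and `handle` are hypothetical; managed brokers such as SQS or RabbitMQ provide dead-lettering as configuration rather than application code.

```python
def consume(messages, handler, max_retries=3):
    """Process messages; after max_retries failures, move a message to the DLQ.

    Hypothetical in-memory sketch of dead-lettering.
    """
    processed, dead_letters = [], []
    for message in messages:
        last_error = None
        for _attempt in range(max_retries):
            try:
                processed.append(handler(message))
                break
            except Exception as exc:
                last_error = exc
        else:
            # The loop finished without a successful attempt: keep the message
            # and the failure reason for manual investigation.
            dead_letters.append(
                {"message": message, "error": str(last_error), "attempts": max_retries}
            )
    return processed, dead_letters


def handle(message):
    if "order_id" not in message:
        raise ValueError("missing order_id")
    return message["order_id"]


processed, dead = consume(
    [{"order_id": "1"}, {"malformed": True}, {"order_id": "2"}],
    handle,
)
```

The malformed message does not block the two valid ones, and it is preserved with its error instead of being dropped.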
Idempotency
Every message handler must be idempotent, because messages can be delivered more than once (at-least-once delivery is the standard guarantee for most brokers). Use idempotency keys to detect and skip duplicates.
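A minimal sketch of an idempotency-key check, assuming each message carries an `idempotency_key` field (a hypothetical name). The in-memory set stands in for a durable store; in a real system the key should be recorded atomically with the side effect it guards.

```python
class PaymentHandler:
    """Hypothetical handler that charges at most once per idempotency key."""

    def __init__(self):
        self.seen_keys = set()   # stand-in for a durable deduplication store
        self.charged_total = 0.0

    def handle(self, message):
        key = message["idempotency_key"]
        if key in self.seen_keys:
            return "duplicate-ignored"  # redelivery: skip the side effect
        self.charged_total += message["amount"]
        self.seen_keys.add(key)
        return "charged"


handler = PaymentHandler()
message = {"idempotency_key": "order-123-payment", "amount": 99.0}

first = handler.handle(message)
second = handler.handle(message)  # the broker redelivers the same message
```

After both deliveries the customer has been charged once, which is the property at-least-once delivery requires from the handler.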
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| All synchronous | Cascading failures, tight coupling | Use async for work the caller does not need to wait on |
| All asynchronous | Hard to debug, eventual consistency everywhere | Use sync for queries and real-time needs |
| No dead letter queue | Lost messages, silent failures | Always configure DLQ |
| No circuit breaker | One failing service takes down everything | Add circuit breakers on all sync calls |
| Ignoring message ordering | Race conditions, inconsistent state | Use partitioned queues or sequence numbers |
| No observability | Can't trace requests across services | Distributed tracing (OpenTelemetry) |