
How to Monitor API Performance: Latency, Errors, and SLAs

By APIScout Team

Tags: api monitoring, performance, observability, sla, latency


You can't improve what you don't measure. API performance monitoring tracks latency, error rates, throughput, and availability — the metrics that determine whether your API is meeting its commitments. Here's what to measure, how to measure it, and when to alert.

The Four Golden Signals

Google SRE's four golden signals apply directly to APIs:

1. Latency

What: Time from request received to response sent.

Measure percentiles, not averages:

| Percentile | Meaning | Use |
| --- | --- | --- |
| p50 (median) | Half of requests are faster | Typical experience |
| p95 | 95% of requests are faster | Most users' experience |
| p99 | 99% of requests are faster | Worst-case normal experience |
| p99.9 | 99.9% of requests are faster | Tail latency |

Why not averages? An average of 100ms hides that 1% of requests take 5 seconds. p99 catches that.
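As a sketch of that point, percentiles can be computed nearest-rank style from raw latency samples. The traffic below (98% fast, 2% slow) is made up for illustration:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value below which p% of samples fall."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical sample: 98% of requests at 100ms, 2% at 5000ms
latencies = [100] * 98 + [5000] * 2
print(sum(latencies) / len(latencies))  # average: 198.0 -- looks healthy
print(percentile(latencies, 50))        # p50: 100
print(percentile(latencies, 99))        # p99: 5000 -- exposes the tail
```

The average stays under 200ms while 2% of users wait five seconds; only the high percentiles surface that.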

Targets:

| Endpoint Type | p50 | p95 | p99 |
| --- | --- | --- | --- |
| Simple read | <50ms | <200ms | <500ms |
| Database query | <100ms | <500ms | <1s |
| Search | <200ms | <1s | <2s |
| Write operation | <100ms | <500ms | <1s |
| External API call | <500ms | <2s | <5s |

2. Error Rate

What: Percentage of requests returning errors (4xx/5xx).

| Metric | Healthy | Warning | Critical |
| --- | --- | --- | --- |
| 5xx rate | <0.1% | 0.1-1% | >1% |
| 4xx rate | <5% | 5-10% | >10% |
| Total error rate | <1% | 1-5% | >5% |

Track by status code: Distinguish between client errors (4xx — usually the client's fault) and server errors (5xx — your fault).
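A minimal sketch of splitting a sample of response codes by class (the traffic mix is hypothetical):

```python
from collections import Counter

def error_rates(status_codes):
    """Split a sample of HTTP status codes into 4xx and 5xx error rates."""
    classes = Counter(code // 100 for code in status_codes)
    total = len(status_codes)
    return {"4xx": classes[4] / total, "5xx": classes[5] / total}

# Hypothetical sample: a 1% server-error rate lands in the critical band above
codes = [200] * 950 + [404] * 40 + [500] * 10
print(error_rates(codes))  # {'4xx': 0.04, '5xx': 0.01}
```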

3. Throughput

What: Requests per second (RPS) or requests per minute (RPM).

Track throughput to:

  • Capacity plan (are you approaching limits?)
  • Detect anomalies (sudden spike = attack? sudden drop = outage?)
  • Correlate with latency (does latency increase with load?)
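The anomaly checks above can be sketched as simple comparisons against a baseline; function names and thresholds here are illustrative, not from any particular tool:

```python
def rps(timestamps, window_s=60):
    """Requests per second over the trailing window (unix-second timestamps)."""
    end = max(timestamps)
    return sum(1 for t in timestamps if t > end - window_s) / window_s

def classify_throughput(current, baseline, drop=0.5, spike=2.0):
    """Compare current RPS to a baseline such as the same hour last week."""
    if current < baseline * drop:
        return "drop"   # sudden drop: possible outage
    if current > baseline * spike:
        return "spike"  # sudden spike: possible attack
    return "normal"

print(classify_throughput(current=40, baseline=100))   # 'drop'
print(classify_throughput(current=250, baseline=100))  # 'spike'
```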

4. Saturation

What: How close your system is to capacity.

| Resource | Metric | Alert Threshold |
| --- | --- | --- |
| CPU | Utilization % | >80% sustained |
| Memory | Usage / available | >85% |
| Database connections | Active / max pool | >80% |
| Disk I/O | IOPS / max IOPS | >70% |
| Network | Bandwidth usage | >70% |
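A toy saturation check against thresholds like those above (resource names and values are hypothetical; real utilization figures come from your infrastructure metrics):

```python
# Illustrative thresholds, expressed as fractions of capacity
THRESHOLDS = {"cpu": 0.80, "memory": 0.85, "db_connections": 0.80,
              "disk_io": 0.70, "network": 0.70}

def saturated(utilization):
    """Return the resources running above their alert threshold."""
    return [r for r, u in utilization.items() if u > THRESHOLDS.get(r, 1.0)]

print(saturated({"cpu": 0.91, "memory": 0.60, "db_connections": 0.82}))
# ['cpu', 'db_connections']
```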

SLA / SLO / SLI

SLI (Service Level Indicator)

A measurable metric: "99.5% of requests complete in under 500ms."

SLO (Service Level Objective)

Your internal target: "p99 latency < 500ms, error rate < 0.1%."

SLA (Service Level Agreement)

Your external commitment with consequences: "99.9% uptime or service credits."

Set SLOs tighter than SLAs. If your SLA promises 99.9% uptime, set your SLO at 99.95% so you have a buffer before breaching the SLA.
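The SLI example above ("99.5% of requests complete in under 500ms") reduces to a simple ratio. A sketch, with a made-up sample window:

```python
def latency_sli(latencies_ms, threshold_ms=500):
    """SLI: fraction of requests completing under the latency threshold."""
    return sum(1 for l in latencies_ms if l < threshold_ms) / len(latencies_ms)

# Hypothetical window: 1% of requests exceed 500ms
samples = [120] * 990 + [900] * 10
sli = latency_sli(samples)
print(sli)           # 0.99
print(sli >= 0.995)  # False -- the SLO is breached before the SLA is at risk
```

This is exactly the buffer at work: the SLO check fails at 99.0% even though a looser SLA target might still be intact.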

Uptime Targets

| Uptime | Downtime/Year | Downtime/Month |
| --- | --- | --- |
| 99% | 3.65 days | 7.3 hours |
| 99.9% | 8.77 hours | 43.8 minutes |
| 99.95% | 4.38 hours | 21.9 minutes |
| 99.99% | 52.6 minutes | 4.38 minutes |
| 99.999% | 5.26 minutes | 26.3 seconds |
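These figures are straightforward to derive: downtime budget = period × (1 − uptime). A sketch, assuming a year of ~8766 hours (365.25 days) and a month of one-twelfth of that:

```python
def downtime_budget(uptime_percent, hours):
    """Allowed downtime in minutes for a given uptime target and period."""
    return hours * 60 * (1 - uptime_percent / 100)

print(round(downtime_budget(99.9, 8766) / 60, 2))  # 8.77 hours/year
print(round(downtime_budget(99.9, 730.5), 1))      # 43.8 minutes/month
print(round(downtime_budget(99.95, 730.5), 1))     # 21.9 minutes/month
```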

Alerting Strategy

Alert on Symptoms, Not Causes

Good alerts (symptoms):

  • p99 latency > 2s for 5 minutes
  • Error rate > 1% for 3 minutes
  • Throughput dropped 50% vs same hour last week

Bad alerts (causes):

  • CPU > 80% (may not affect users)
  • Memory > 90% (may be normal)
  • Single health check failed (transient)
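The "for N minutes" qualifiers in the good alerts above are what separate them from the transient failures below. A sketch of a sustained-condition check (the class and thresholds are illustrative, not any specific tool's API):

```python
class SustainedAlert:
    """Fire only after a metric breaches its threshold for `required`
    consecutive checks, so transient spikes don't page anyone."""
    def __init__(self, threshold, required):
        self.threshold = threshold
        self.required = required
        self.streak = 0

    def observe(self, value):
        self.streak = self.streak + 1 if value > self.threshold else 0
        return self.streak >= self.required

# p99 latency (ms) sampled once a minute; page after 5 minutes above 2s
alert = SustainedAlert(threshold=2000, required=5)
readings = [2500, 2600, 1800, 2500, 2500, 2500, 2500, 2500]
print([alert.observe(r) for r in readings])  # fires only on the last reading
```

Note how the single dip to 1800ms resets the streak: one recovered health check never pages, but five consecutive bad minutes do.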

Alert Severity

| Severity | Criteria | Response |
| --- | --- | --- |
| P1 - Critical | Service down, data loss | Page on-call, all hands |
| P2 - High | Degraded performance, partial outage | Page on-call, investigate |
| P3 - Medium | Non-critical service degraded | Next business day |
| P4 - Low | Cosmetic, minor issue | Backlog |

Monitoring Tools

| Tool | Best For | Price |
| --- | --- | --- |
| Datadog | Full observability | From $5/host/mo |
| Grafana + Prometheus | Self-hosted, open source | Free |
| Better Stack | Uptime + incidents | Free (10 monitors) |
| Checkly | Synthetic monitoring | Free (5 checks) |
| Sentry | Error tracking | Free (5K events) |
| PostHog | Product analytics | Free (1M events) |

Dashboard Essentials

Every API monitoring dashboard should show:

  1. Request volume — RPS over time (detect anomalies)
  2. Latency percentiles — p50, p95, p99 over time
  3. Error rate — 4xx and 5xx separately
  4. Top errors — most frequent error codes/messages
  5. Slowest endpoints — which endpoints need optimization
  6. Uptime — current and 30-day availability

Monitoring your API? Explore monitoring tools and best practices on APIScout — comparisons, guides, and developer resources.
