Working with Paginated APIs: Best Practices
Nearly every API that returns lists paginates the results. Get it wrong and you miss data, create duplicate entries, or overwhelm the API. Get it right and you efficiently process millions of records without breaking a sweat.
Pagination Types
1. Offset-Based
The simplest but most problematic approach.
// Request
GET /api/users?offset=0&limit=20
GET /api/users?offset=20&limit=20
GET /api/users?offset=40&limit=20
// Response
{
"data": [...],
"total": 1000,
"offset": 0,
"limit": 20
}
Problems:
- If a record is inserted during pagination, you get duplicates
- If a record is deleted, you skip one
- `OFFSET 10000 LIMIT 20` is slow in databases (the engine scans and discards 10,000 rows first)
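The skip-on-delete problem is easy to see with a small simulation (the in-memory dataset and function names here are illustrative, not a real API):

```typescript
// Simulate offset pagination over a dataset that changes mid-fetch.
let records = [1, 2, 3, 4, 5, 6];

function fetchOffsetPage(offset: number, limit: number): number[] {
  return records.slice(offset, offset + limit);
}

const page1 = fetchOffsetPage(0, 2); // [1, 2]

// A record from page 1's range is deleted before we fetch page 2...
records = records.filter((id) => id !== 1); // now [2, 3, 4, 5, 6]

// ...so the remaining rows shift down and record 3 is silently skipped.
const page2 = fetchOffsetPage(2, 2); // [4, 5]
```

Inserts cause the mirror-image failure: rows shift up and a record you already saw appears again on the next page.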
2. Cursor-Based
The de facto standard for modern APIs. Returns an opaque cursor pointing to the next page.
// Request
GET /api/users?limit=20
GET /api/users?limit=20&cursor=eyJpZCI6MjB9
GET /api/users?limit=20&cursor=eyJpZCI6NDB9
// Response
{
"data": [...],
"next_cursor": "eyJpZCI6NDB9",
"has_more": true
}
Advantages:
- Consistent results even with concurrent inserts/deletes
- Fast regardless of page depth (no OFFSET scan)
- No duplicates or missed records
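The cursors in the example above are base64-encoded JSON. A server might mint them like this (a sketch assuming Node's `Buffer`; the `{ id }` payload shape is an assumption, and clients should always treat cursors as opaque strings rather than parse them):

```typescript
// Sketch of how a server might mint and read opaque cursors.
// Clients never decode these — only the server does.
function encodeCursor(payload: { id: number }): string {
  return Buffer.from(JSON.stringify(payload)).toString("base64url");
}

function decodeCursor(cursor: string): { id: number } {
  return JSON.parse(Buffer.from(cursor, "base64url").toString("utf8"));
}

console.log(encodeCursor({ id: 20 })); // "eyJpZCI6MjB9" — the cursor from the request above
console.log(decodeCursor("eyJpZCI6NDB9")); // { id: 40 }
```

Keeping the cursor opaque lets the server change its internal pagination strategy (ID, timestamp, composite key) without breaking clients.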
3. Keyset / After-Based
Similar to cursor but uses a visible field (usually ID or timestamp).
// Request
GET /api/events?after=2026-01-01T00:00:00Z&limit=100
GET /api/events?after=2026-01-01T05:30:00Z&limit=100
// Response
{
"data": [...],
"last_timestamp": "2026-01-01T05:30:00Z",
"has_more": true
}
4. Page-Based
Simple page numbers.
// Request
GET /api/products?page=1&per_page=50
GET /api/products?page=2&per_page=50
// Response
{
"data": [...],
"page": 1,
"per_page": 50,
"total_pages": 20,
"total": 1000
}
Comparison
| Type | Speed at Depth | Consistency | Can Jump to Page | Complexity |
|---|---|---|---|---|
| Offset | Slow at high offsets | Inconsistent with concurrent writes | Yes | Simple |
| Cursor | Fast always | Consistent | No | Medium |
| Keyset | Fast always | Consistent | Sort of (by value) | Medium |
| Page | Slow at depth | Inconsistent | Yes | Simple |
Pattern 1: Async Iterator (All Pages)
The cleanest way to iterate through all pages:
async function* paginateAll<T>(
fetchPage: (cursor?: string) => Promise<{
data: T[];
nextCursor?: string;
hasMore: boolean;
}>
): AsyncGenerator<T> {
let cursor: string | undefined;
let hasMore = true;
while (hasMore) {
const page = await fetchPage(cursor);
for (const item of page.data) {
yield item;
}
cursor = page.nextCursor;
hasMore = page.hasMore;
}
}
// Usage — clean, memory-efficient
const allUsers = paginateAll(async (cursor) => {
const response = await fetch(
`https://api.example.com/users?limit=100${cursor ? `&cursor=${encodeURIComponent(cursor)}` : ''}`
);
return response.json();
});
for await (const user of allUsers) {
await processUser(user);
}
Collecting All Results
async function fetchAllPages<T>(
fetchPage: (cursor?: string) => Promise<{
data: T[];
nextCursor?: string;
hasMore: boolean;
}>
): Promise<T[]> {
const allResults: T[] = [];
for await (const item of paginateAll(fetchPage)) {
allResults.push(item);
}
return allResults;
}
// Usage
const allUsers = await fetchAllPages(async (cursor) => {
const res = await fetch(`/api/users?limit=100${cursor ? `&cursor=${encodeURIComponent(cursor)}` : ''}`);
return res.json();
});
Pattern 2: Parallel Page Fetching
When the API supports it and you know the total pages:
async function fetchPagesParallel<T>(
totalPages: number,
fetchPage: (page: number) => Promise<T[]>,
concurrency: number = 5
): Promise<T[]> {
const allResults: T[][] = new Array(totalPages);
let currentPage = 0;
async function worker() {
while (currentPage < totalPages) {
const page = currentPage++;
allResults[page] = await fetchPage(page + 1);
}
}
// Run N concurrent workers
await Promise.all(
Array.from({ length: Math.min(concurrency, totalPages) }, () => worker())
);
return allResults.flat();
}
// Usage
// First, get total pages
const firstPage = await fetch('/api/products?page=1&per_page=50').then(r => r.json());
const totalPages = firstPage.total_pages;
const allProducts = await fetchPagesParallel(
totalPages,
async (page) => {
const res = await fetch(`/api/products?page=${page}&per_page=50`);
const data = await res.json();
return data.data;
},
5 // 5 concurrent requests
);
Warning: Only works with page-based or offset-based pagination (not cursor-based). Respect rate limits.
Pattern 3: Streaming Large Datasets
For millions of records, don't load everything into memory:
async function streamPaginatedData<T>(
fetchPage: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string; hasMore: boolean }>,
processBatch: (batch: T[]) => Promise<void>,
options: { delayMs?: number } = {}
): Promise<{ processed: number }> {
const { delayMs = 0 } = options;
let cursor: string | undefined;
let hasMore = true;
let processed = 0;
while (hasMore) {
const page = await fetchPage(cursor);
await processBatch(page.data);
processed += page.data.length;
cursor = page.nextCursor;
hasMore = page.hasMore;
// Optional delay between pages (respect rate limits)
if (delayMs > 0 && hasMore) {
await new Promise(r => setTimeout(r, delayMs));
}
// Log progress
if (processed % 10000 === 0) {
console.log(`Processed ${processed} records...`);
}
}
return { processed };
}
// Usage: process 1M records without loading all into memory
await streamPaginatedData(
async (cursor) => {
const res = await fetch(`/api/events?limit=500${cursor ? `&cursor=${encodeURIComponent(cursor)}` : ''}`);
return res.json();
},
async (batch) => {
// Insert into database in batches
await db.events.insertMany(batch);
},
{ delayMs: 100 } // 100ms between pages to respect rate limits
);
Pattern 4: Provider-Specific Pagination
Stripe
// Stripe uses cursor-based pagination with `starting_after`
async function getAllStripeCustomers() {
const customers: Stripe.Customer[] = [];
let hasMore = true;
let startingAfter: string | undefined;
while (hasMore) {
const page = await stripe.customers.list({
limit: 100,
starting_after: startingAfter,
});
customers.push(...page.data);
hasMore = page.has_more;
startingAfter = page.data[page.data.length - 1]?.id;
}
return customers;
}
// Or use Stripe's auto-pagination
for await (const customer of stripe.customers.list({ limit: 100 })) {
await processCustomer(customer);
}
GitHub
// GitHub uses Link headers for pagination
async function getAllRepos(org: string) {
const repos = [];
let url: string | null = `https://api.github.com/orgs/${org}/repos?per_page=100`;
while (url) {
const response = await fetch(url, {
headers: { 'Authorization': `Bearer ${GITHUB_TOKEN}` },
});
const data = await response.json();
repos.push(...data);
// Parse Link header for next page
const linkHeader = response.headers.get('Link');
const nextLink = linkHeader?.match(/<([^>]+)>;\s*rel="next"/);
url = nextLink ? nextLink[1] : null;
}
return repos;
}
GraphQL (Relay-style)
// Relay-style cursor pagination
async function getAllUsers() {
const users = [];
let hasNextPage = true;
let endCursor: string | null = null;
while (hasNextPage) {
const query = `
query ($after: String) {
users(first: 50, after: $after) {
edges {
node { id name email }
cursor
}
pageInfo {
hasNextPage
endCursor
}
}
}
`;
const result = await graphqlClient.request(query, { after: endCursor });
users.push(...result.users.edges.map((e: any) => e.node));
hasNextPage = result.users.pageInfo.hasNextPage;
endCursor = result.users.pageInfo.endCursor;
}
return users;
}
Edge Cases
Handling Empty Pages
// Some APIs return empty pages before actually being done
async function* robustPaginate<T>(fetchPage: (cursor?: string) => Promise<{
data: T[];
nextCursor?: string;
hasMore: boolean;
}>) {
let cursor: string | undefined;
let hasMore = true;
let emptyPageCount = 0;
while (hasMore) {
const page = await fetchPage(cursor);
if (page.data.length === 0) {
emptyPageCount++;
if (emptyPageCount > 3) {
// Too many empty pages — something is wrong, stop
console.warn('Too many consecutive empty pages, stopping pagination');
break;
}
} else {
emptyPageCount = 0;
for (const item of page.data) {
yield item;
}
}
cursor = page.nextCursor;
hasMore = page.hasMore;
}
}
Handling Rate Limits During Pagination
async function paginateWithRateLimit<T>(
fetchPage: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string; hasMore: boolean }>,
): Promise<T[]> {
const results: T[] = [];
let cursor: string | undefined;
let hasMore = true;
while (hasMore) {
try {
const page = await fetchPage(cursor);
results.push(...page.data);
cursor = page.nextCursor;
hasMore = page.hasMore;
} catch (error: any) {
if (error.status === 429) {
const retryAfter = Number(error.headers?.['retry-after']) || 5;
console.log(`Rate limited, waiting ${retryAfter}s...`);
await new Promise(r => setTimeout(r, retryAfter * 1000));
continue; // Retry same page
}
throw error;
}
}
return results;
}
Common Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Loading all pages into memory | OOM for large datasets | Stream with async generators |
| Ignoring has_more / relying on empty page | Missing last page or infinite loop | Always check has_more flag |
| Not handling rate limits | Pagination fails mid-way | Retry with backoff on 429 |
| Using offset pagination at depth | Slow queries, inconsistent results | Use cursor-based if available |
| Parallel fetch with cursor pagination | Cursors are sequential | Only parallelize page/offset-based |
| Not persisting progress | Restart from beginning on failure | Save last cursor, resume on retry |
| Hardcoding page size | Too small = many requests, too large = timeouts | Match API's recommended or max page size |
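The last row of the table deserves a sketch: persisting the cursor after each processed page makes a long crawl resumable after a crash. The `CheckpointStore` interface and page shape below are assumptions for illustration; back the store with a file, Redis, or a database row in practice:

```typescript
// Sketch of resumable pagination: save the cursor after each page so a
// crashed job resumes where it left off instead of restarting.
interface Page<T> { data: T[]; nextCursor?: string; hasMore: boolean }

interface CheckpointStore {
  load(): Promise<string | undefined>;
  save(cursor: string | undefined): Promise<void>;
}

async function resumablePaginate<T>(
  fetchPage: (cursor?: string) => Promise<Page<T>>,
  processBatch: (batch: T[]) => Promise<void>,
  checkpoints: CheckpointStore
): Promise<number> {
  let cursor = await checkpoints.load(); // resume from the last saved position
  let hasMore = true;
  let processed = 0;
  while (hasMore) {
    const page = await fetchPage(cursor);
    // Process BEFORE saving the checkpoint: a crash re-processes at most one
    // page (at-least-once semantics), so processBatch should be idempotent.
    await processBatch(page.data);
    processed += page.data.length;
    cursor = page.nextCursor;
    hasMore = page.hasMore;
    await checkpoints.save(cursor);
  }
  return processed;
}
```

Saving the checkpoint before processing would flip the trade-off to at-most-once: no duplicates on restart, but a crash can lose one page of work.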
Compare API pagination patterns across providers on APIScout — find which APIs offer cursor-based pagination, auto-pagination in SDKs, and streaming endpoints.