Best Web Scraping APIs 2026: ScrapingBee vs Bright Data vs Apify vs Oxylabs
Scraping in 2026: Harder Than It Looks
A basic HTTP GET request returns usable HTML for most pages. For the rest (the ones built with React, Angular, or Vue), JavaScript must execute before any content appears. Anti-bot services (Cloudflare, DataDome, hCaptcha) block requests that look automated, and IP reputation checks and rate limits flag traffic from data center IP ranges. Rotating user agents is table stakes; modern bot detection also fingerprints Canvas, WebGL, timing, and behavioral signals.
Web scraping APIs solve the infrastructure problem: they maintain residential proxy pools, run headless browsers at scale, handle CAPTCHA solving, and rotate fingerprints — you send a URL, they return the rendered HTML or structured data.
In 2026, four platforms serve different points in the scraping stack: ScrapingBee (simple API for developers), Bright Data (enterprise proxy and scraping infrastructure), Apify (actor-based scraping platform with pre-built scrapers), and Oxylabs (data center and residential proxies with a Web Scraper API).
TL;DR
ScrapingBee is the simplest entry point: one API call returns rendered HTML, no proxy management required, starting at $49/month. Bright Data has the most capable and most expensive infrastructure: the largest residential proxy network, a scraping browser, and an unlocker service for heavily protected sites. Apify is the right choice when you need pre-built scrapers for specific sites (Amazon, Google Maps, LinkedIn) or want to build and host custom scrapers. Oxylabs offers competitive proxy infrastructure and a structured Web Scraper API for SERP, e-commerce, and real estate data.
Key Takeaways
- ScrapingBee charges $49/month for 150K API credits (1 basic request = 1 credit, 1 JS-rendered request = 5 credits, 1 premium proxy request = 25 credits).
- Bright Data's Web Unlocker starts at $1.50/1,000 requests — the highest-reliability unlocker for heavily protected sites (Amazon, LinkedIn, Zillow).
- Apify's free tier includes $5/month in compute credits, with paid plans starting at $49/month (25K compute units/month).
- Oxylabs Web Scraper API starts at $49 for 17,500 results ($2.80/1,000) for SERP and e-commerce structured data extraction.
- Residential proxies vs. data center proxies: Residential IPs appear to come from real users' ISPs — much harder to block but more expensive. Data center IPs are cheaper but blocked by sophisticated anti-bot systems.
- Headless browser vs. plain HTTP: Headless browsers (Chrome) render JavaScript before returning HTML — required for SPA pages but 5-10x more expensive per request than plain HTTP.
- Legal considerations: Web scraping legality varies by jurisdiction and target site's terms of service. Public data scraping is generally permitted; scraping behind login walls or accessing personal data has greater legal risk.
Pricing Comparison
| Platform | Free Tier | Paid Starting | Per 1,000 Requests |
|---|---|---|---|
| ScrapingBee | 1,000 credits trial | $49/month | ~$0.33-$24.50 on the Freelance plan (1 to 75 credits per request) |
| Bright Data Web Unlocker | Trial | $1.50/1,000 requests | $1.50 |
| Apify | $5 credits/month | $49/month | Varies by compute time |
| Oxylabs Web Scraper | No | $49/17.5K results | $2.80/1,000 |
ScrapingBee
Best for: Developers new to scraping, simple API integration, JavaScript rendering, no proxy management
ScrapingBee abstracts all scraping infrastructure behind a single REST API endpoint. You send a GET request with your target URL and parameters, ScrapingBee handles proxy rotation, browser rendering, and CAPTCHA solving, and returns HTML or JSON.
Pricing
| Plan | Cost | Credits/Month |
|---|---|---|
| Freelance | $49/month | 150,000 |
| Startup | $99/month | 500,000 |
| Business | $249/month | 1,500,000 |
| Enterprise | $599/month | 5,000,000 |
Credit costs per request type:
- Standard (no JS): 1 credit
- JavaScript rendering: 5 credits
- Premium proxies (residential): 25 credits
- Stealth proxy (high-protection sites): 75 credits
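The credit multipliers above translate into very different per-request prices. The sketch below works that out for any plan. The function and dictionary names are illustrative, and the multipliers are the ones quoted in this article rather than values fetched from ScrapingBee.

```python
# Credit multipliers per request type, as listed in this article.
CREDITS_PER_REQUEST = {
    "standard": 1,
    "js_rendering": 5,
    "premium_proxy": 25,
    "stealth_proxy": 75,
}


def cost_per_thousand(plan_price_usd: float, plan_credits: int, request_type: str) -> float:
    """Effective USD cost of 1,000 requests of one type on a given plan."""
    credits_needed = 1000 * CREDITS_PER_REQUEST[request_type]
    return round(plan_price_usd * credits_needed / plan_credits, 2)


def requests_per_month(plan_credits: int, request_type: str) -> int:
    """How many requests of one type a monthly credit allowance covers."""
    return plan_credits // CREDITS_PER_REQUEST[request_type]


if __name__ == "__main__":
    # Freelance plan: $49/month for 150,000 credits.
    for kind in CREDITS_PER_REQUEST:
        print(kind, cost_per_thousand(49, 150_000, kind), requests_per_month(150_000, kind))
```

On the $49 Freelance plan this works out to roughly $0.33, $1.63, $8.17, and $24.50 per 1,000 requests, so the same 150,000 credits cover anywhere from 150,000 standard requests down to 2,000 stealth-proxy requests.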
API Integration
import requests

# Basic HTML scraping
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://example.com/products",
        "render_js": "false",  # No JavaScript rendering needed
    },
)
html = response.text

# JavaScript-rendered page
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://react-app.example.com/products",
        "render_js": "true",          # Render JavaScript
        "wait": "2000",               # Wait 2 seconds for JS to execute
        "wait_for": "#product-list",  # Wait for element to appear
        "premium_proxy": "false",
    },
)
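Because each request type costs a different number of credits, one common pattern is to try the cheapest option first and escalate only when the response is unusable. The following is a minimal sketch of that idea, not a ScrapingBee feature: `fetch_with_escalation`, `TIERS`, and the `fetch` and `is_usable` callables are hypothetical names, and the credit total is an upper bound that assumes every attempt is billed at the rates quoted in this article.

```python
from typing import Callable, Optional, Tuple

# Tiers ordered cheapest first. Parameter names and credit costs are the ones
# quoted in this article; actual billing may differ.
TIERS = [
    ("standard", {"render_js": "false"}, 1),
    ("js_rendering", {"render_js": "true"}, 5),
    ("premium_proxy", {"render_js": "true", "premium_proxy": "true"}, 25),
]

# A fetch callable takes (target_url, extra_params) and returns (status_code, body).
Fetch = Callable[[str, dict], Tuple[int, str]]


def fetch_with_escalation(
    url: str, fetch: Fetch, is_usable: Callable[[str], bool]
) -> Optional[dict]:
    """Try each tier in order and stop at the first response that passes is_usable."""
    credits_spent = 0
    for name, params, credits in TIERS:
        status, body = fetch(url, params)
        credits_spent += credits  # upper bound: assumes every attempt is billed
        if status == 200 and is_usable(body):
            return {"tier": name, "credits_spent": credits_spent, "html": body}
    return None  # every tier failed


if __name__ == "__main__":
    # Stand-in for a real HTTP call: only the JS-rendered tiers return the list.
    def fake_fetch(url: str, params: dict) -> Tuple[int, str]:
        if params.get("render_js") == "true":
            return 200, '<ul id="product-list"><li>Widget</li></ul>'
        return 200, '<div id="root"></div>'

    result = fetch_with_escalation(
        "https://react-app.example.com/products",
        fake_fetch,
        lambda html: "product-list" in html,
    )
    print(result["tier"], result["credits_spent"])
```

In real use, `fetch` would wrap the `requests.get` call shown above, merging the tier's parameters with `api_key` and `url` and returning `(response.status_code, response.text)`.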
JavaScript Snippet Execution
# Execute custom JavaScript before capturing the page
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://example.com/modal-page",
        "render_js": "true",
        # Execute JS to close modals before capture
        "js_snippet": "document.querySelector('.cookie-modal')?.remove(); document.querySelector('.signup-overlay')?.remove();",
    },
)
When to Choose ScrapingBee
Choose ScrapingBee if you are building your first scraper and want to skip proxy and browser management, if your application only occasionally scrapes public pages (product pricing, news articles, public listings), or if your team wants a simple pay-per-credit model without managing a proxy pool.
Bright Data
Best for: Enterprise infrastructure, heavily protected sites, largest residential proxy network
Bright Data operates the largest commercial proxy network — 72M+ residential IPs across 195 countries. The platform serves enterprise data collection teams that need to bypass sophisticated anti-bot measures on the most protected sites on the internet (Amazon, LinkedIn, Zillow, Google).
Products and Pricing
| Product | Use Case | Starting Price |
|---|---|---|
| Web Unlocker | Bypass anti-bot, CAPTCHA solving | $1.50/1,000 requests |
| Scraping Browser | Full browser automation | $1.50-10/1,000 requests |
| Residential Proxies | IP rotation network | $8.50/GB |
| SERP API | Structured Google results | $1.50/1,000 results |
| Datasets | Pre-collected data | Custom |
Web Unlocker API
import requests

# Web Unlocker: handles anti-bot, CAPTCHA, fingerprinting automatically
proxies = {
    "http": "http://brd-customer-hl_ACCOUNT_ID-zone-unlocker:PASSWORD@brd.superproxy.io:22225",
    "https": "http://brd-customer-hl_ACCOUNT_ID-zone-unlocker:PASSWORD@brd.superproxy.io:22225",
}

# Point any HTTP client at Bright Data's proxy
response = requests.get(
    "https://www.amazon.com/dp/B09XHVJ6Z3",
    proxies=proxies,
    # Bright Data re-signs traffic with its own SSL cert, so verification is
    # disabled here; in production, pass the path to that cert via verify= instead
    verify=False,
)
html = response.text
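Hardcoding credentials in the proxy URL is fine for a demo but awkward in real code. As a sketch, the proxies dictionary can be assembled from environment variables instead. The helper names and the `BRD_*` variable names below are hypothetical; the username layout, host, and port simply mirror the example above.

```python
import os
from urllib.parse import quote


def build_brightdata_proxies(
    account_id: str,
    zone: str,
    password: str,
    host: str = "brd.superproxy.io",
    port: int = 22225,
) -> dict:
    """Build a requests-style proxies dict in the format used in this article."""
    username = f"brd-customer-{account_id}-zone-{zone}"
    # Percent-encode credentials so characters like "@" or ":" in a password
    # cannot break the URL.
    proxy_url = (
        f"http://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"
    )
    return {"http": proxy_url, "https": proxy_url}


def proxies_from_env() -> dict:
    """Read credentials from (hypothetical) BRD_* environment variables."""
    return build_brightdata_proxies(
        account_id=os.environ["BRD_ACCOUNT_ID"],
        zone=os.environ.get("BRD_ZONE", "unlocker"),
        password=os.environ["BRD_PASSWORD"],
    )


if __name__ == "__main__":
    print(build_brightdata_proxies("hl_ACCOUNT_ID", "unlocker", "PASSWORD")["https"])
```

The returned dictionary drops into the `proxies=` argument of the `requests.get` call shown earlier.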
Scraping Browser (Playwright)
import asyncio

from playwright.async_api import async_playwright

ACCOUNT_ID = "ACCOUNT_ID"  # placeholder: your Bright Data account ID
PASSWORD = "PASSWORD"      # placeholder: your zone password


async def main():
    async with async_playwright() as pw:
        # Connect Playwright to Bright Data's scraping browser
        # Bright Data handles fingerprinting, CAPTCHA, and anti-bot
        browser = await pw.chromium.connect_over_cdp(
            f"wss://brd-customer-hl_{ACCOUNT_ID}-zone-scraping_browser:{PASSWORD}@brd.superproxy.io:9222"
        )
        page = await browser.new_page()
        await page.goto("https://www.linkedin.com/jobs/", timeout=60000)
        # Interact with page as normal Playwright code
        jobs = await page.query_selector_all(".job-card-container")
        for job in jobs:
            title = await job.query_selector(".job-card-list__title")
            if title:  # skip cards without a title element
                print(await title.inner_text())
        await browser.close()


asyncio.run(main())
When to Choose Bright Data
Choose Bright Data if your enterprise data collection team scrapes heavily protected sites (LinkedIn, Amazon, Zillow), if you need the broadest residential proxy coverage (72M+ IPs), or if you are building production-scale scrapers where reliability on protected sites justifies the premium pricing.
Apify
Best for: Pre-built scrapers for specific sites, actor marketplace, full-stack scraping platform
Apify is not just a proxy service — it's a full-stack web scraping platform. The Apify Store contains hundreds of pre-built "actors" (scrapers) for specific sites: Amazon products, Google Maps reviews, Instagram profiles, YouTube channels, LinkedIn companies, and hundreds more. You can run these actors without writing code, or build and host your own.
Pricing
| Plan | Cost | Compute Units/Month |
|---|---|---|
| Free | $0 | $5 equivalent |
| Starter | $49/month | 25,000 CUs |
| Scale | $149/month | 100,000 CUs |
| Business | $499/month | 400,000 CUs |
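Compute units are consumed based on memory × time. Apify defines one compute unit as 1 GB of memory used for one hour, so a 1 GB scraper running for 1 minute consumes about 0.017 CUs (1/60 of a unit).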
Pre-Built Actors (No-Code)
# Run Amazon scraper via Apify CLI
# (`apify call` runs an actor on the Apify platform; `apify run` executes the actor in the current directory)
npx apify-cli call apify/amazon-scraper \
  --input='{"search": "bluetooth headphones", "maxResults": 100}' \
  --token=YOUR_APIFY_TOKEN

# Or via API
curl -X POST "https://api.apify.com/v2/acts/apify~amazon-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "search": "bluetooth headphones",
    "maxResults": 100
  }'
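The same API call can be prepared from Python with only the standard library. This is a sketch under the assumption that the endpoint, bearer-token header, and input fields match the curl example above; `build_run_request` is an illustrative helper, not part of any Apify SDK. It builds the request without sending it, so the URL and headers can be inspected first.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that starts an actor run."""
    # The API path uses "username~actor-name" where the store shows "username/actor-name".
    actor_path = actor.replace("/", "~")
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_path}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_run_request(
        "apify/amazon-scraper",
        "YOUR_APIFY_TOKEN",
        {"search": "bluetooth headphones", "maxResults": 100},
    )
    print(req.get_method(), req.full_url)
    # To actually start the run: urllib.request.urlopen(req)
```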
Custom Actor (Playwright)
// src/main.ts: Apify actor
import { Actor } from "apify";
import { PlaywrightCrawler } from "crawlee";

await Actor.init();

const crawler = new PlaywrightCrawler({
  async requestHandler({ page, request, pushData }) {
    const title = await page.title();
    // Extract data from page
    const products = await page.$$eval(".product-item", (items) =>
      items.map((item) => ({
        name: item.querySelector(".name")?.textContent?.trim(),
        price: item.querySelector(".price")?.textContent?.trim(),
        url: item.querySelector("a")?.href,
      }))
    );
    await pushData({ url: request.url, title, products });
  },
});

await crawler.run(["https://example.com/products"]);
await Actor.exit();
# Deploy your custom actor to Apify cloud
npx apify-cli push
# Run it on demand or schedule
apify actor:run --memory=1024
When to Choose Apify
Choose Apify if you need pre-built scrapers for popular sites (Amazon, Google Maps, LinkedIn, Instagram) without writing custom code, if you want a managed platform for hosting and scheduling scrapers, or if your organization wants a marketplace of ready-to-use data extraction actors.
Oxylabs
Best for: SERP and e-commerce structured data, residential proxy network, data center proxies
Oxylabs competes directly with Bright Data at the enterprise level: it pairs a large residential proxy network with purpose-built Web Scraper APIs that return structured data (JSON) rather than raw HTML.
Products and Pricing
| Product | Starting Price | Output |
|---|---|---|
| SERP API | $49/17,500 results | Structured JSON from Google |
| E-Commerce Scraper API | $49/17,500 results | Product data JSON |
| Web Scraper API | $49/17,500 results | General HTML or JSON |
| Residential Proxies | $9/GB | Raw proxy |
| Data Center Proxies | $1.30/GB | Raw proxy |
SERP API
import requests

# Oxylabs SERP API: structured Google results
payload = {
    "source": "google_search",
    "query": "python web scraping",
    "domain": "com",
    "geo_location": "United States",
    "locale": "en-us",
    "parse": True,  # Return structured JSON, not raw HTML
    "pages": 3,
}
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("YOUR_USERNAME", "YOUR_PASSWORD"),
    json=payload,
)
data = response.json()
results = data["results"][0]["content"]["results"]["organic"]
for result in results:
    print(f"{result['pos']}. {result['title']}: {result['url']}")
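The snippet above reads only the first entry of `results`, even though the payload requests three pages. The sketch below flattens every page into one list. It assumes each requested page comes back as its own entry in `results` with a `page` field, which should be checked against a real response; `collect_organic` is an illustrative helper name.

```python
def collect_organic(data: dict) -> list:
    """Flatten organic results from every page of a parsed SERP response."""
    rows = []
    for page in data.get("results", []):
        content = page.get("content")
        if not isinstance(content, dict):
            continue  # unparsed responses carry raw HTML here instead of a dict
        organic = (content.get("results") or {}).get("organic") or []
        for item in organic:
            rows.append(
                {
                    "page": page.get("page"),  # assumed field; None if absent
                    "pos": item.get("pos"),
                    "title": item.get("title"),
                    "url": item.get("url"),
                }
            )
    return rows


if __name__ == "__main__":
    # Hand-written sample in the shape the example above indexes into.
    sample = {
        "results": [
            {"page": 1, "content": {"results": {"organic": [
                {"pos": 1, "title": "First", "url": "https://example.com/a"},
            ]}}},
            {"page": 2, "content": {"results": {"organic": [
                {"pos": 1, "title": "Second", "url": "https://example.com/b"},
            ]}}},
        ]
    }
    for row in collect_organic(sample):
        print(row["page"], row["pos"], row["title"], row["url"])
```

Using `.get` throughout means a page with no organic block is skipped rather than raising a `KeyError`.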
E-Commerce Scraper
import requests

# Extract structured product data from Amazon
payload = {
    "source": "amazon_product",
    "query": "B09XHVJ6Z3",  # ASIN
    "parse": True,
}
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)
product = response.json()["results"][0]["content"]
print(f"Title: {product['title']}")
print(f"Price: {product['price']}")
print(f"Rating: {product['rating']}")
print(f"Reviews: {product['reviews_count']}")
When to Choose Oxylabs
Choose Oxylabs if you need structured JSON output from e-commerce and SERP sources without parsing HTML, if you want enterprise-grade proxy infrastructure as an alternative to Bright Data, or if residential proxy pricing close to Bright Data's ($9/GB vs. $8.50/GB) is acceptable at your scale.
Choosing the Right Tool
| Scenario | Recommended |
|---|---|
| First-time scraper, simple pages | ScrapingBee |
| Pre-built Amazon/Google scraper | Apify |
| Custom scraper, managed hosting | Apify |
| Heavily protected sites (LinkedIn) | Bright Data Web Unlocker |
| Largest residential proxy pool | Bright Data |
| SERP data extraction (structured) | Oxylabs SERP API |
| E-commerce price monitoring | Oxylabs or ScrapingBee |
| Enterprise volume, full control | Bright Data |
| No-code scraping | Apify (pre-built actors) |
Self-Hosted Alternative
For teams with engineering capacity:
// Crawlee + Playwright: self-hosted scraper
import { PlaywrightCrawler, Dataset } from "crawlee";

const crawler = new PlaywrightCrawler({
  // Crawlee handles browser pool, request queue, retry logic
  maxConcurrency: 10,
  async requestHandler({ page, request, enqueueLinks, pushData }) {
    const title = await page.title();
    const data = await page.evaluate(() => {
      return Array.from(document.querySelectorAll(".product")).map(p => ({
        name: p.querySelector(".name")?.textContent,
        price: p.querySelector(".price")?.textContent,
      }));
    });
    await pushData({ url: request.url, title, data });
    await enqueueLinks({ selector: "a.pagination-next" }); // Follow pagination
  },
});

await crawler.run(["https://example.com/products"]);
Self-hosted costs: VPS for browser compute + residential proxy subscription. Viable for teams with engineering capacity; Crawlee (by Apify) is open-source.
Verdict
ScrapingBee is the right starting point for most development teams — one API, clear credit pricing, JavaScript rendering included, no proxy management.
Bright Data is the enterprise choice when you need to reliably scrape the most protected sites at scale. The 72M+ residential IP pool and the Scraping Browser give you capabilities that smaller networks can't match.
Apify is the choice when you want the scraping work already done — the actor marketplace provides production-ready scrapers for the most common sites, deployable without writing any code.
Oxylabs is the competitive alternative to Bright Data for e-commerce and SERP data, with structured JSON output APIs that skip the HTML parsing step.
Compare web scraping API pricing, proxy network quality, and site coverage at APIScout — find the right data extraction platform for your use case.