
Best Web Scraping APIs 2026: ScrapingBee vs Bright Data vs Apify vs Oxylabs

APIScout Team
Tags: web scraping, scrapingbee, bright data, apify, oxylabs, proxy api, data extraction, headless browser api

Scraping in 2026: Harder Than It Looks

A basic HTTP GET request returns usable HTML for most pages. For the rest (the ones backed by React, Angular, or Vue), JavaScript must execute before any content appears. Anti-bot services (Cloudflare, DataDome, hCaptcha) block requests that look like automation. IP reputation checks block or throttle requests from data center IP ranges. Rotating user agents is table stakes; modern bot detection fingerprints Canvas, WebGL, timing, and behavioral signals.
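
One rough way to decide whether a page needs a headless browser is to fetch it with plain HTTP first and measure how much visible text the raw HTML carries. The sketch below is only a heuristic under stated assumptions: the `looks_client_rendered` name and the 200-character threshold are illustrative choices, not part of any vendor's API.

```python
from html.parser import HTMLParser


class _VisibleTextCounter(HTMLParser):
    """Counts text characters outside <script> and <style> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def looks_client_rendered(html: str, min_visible_chars: int = 200) -> bool:
    """Return True when the raw HTML has very little visible text.

    A near-empty body usually means the content is injected by JavaScript,
    so a JS-rendering request is probably needed. The threshold is a guess
    and should be tuned per target site.
    """
    counter = _VisibleTextCounter()
    counter.feed(html)
    return counter.visible_chars < min_visible_chars
```

If the check returns False, the cheaper plain-HTTP request is likely enough for that page.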

Web scraping APIs solve the infrastructure problem: they maintain residential proxy pools, run headless browsers at scale, handle CAPTCHA solving, and rotate fingerprints — you send a URL, they return the rendered HTML or structured data.

In 2026, four platforms serve different points in the scraping stack: ScrapingBee (simple API for developers), Bright Data (enterprise proxy and scraping infrastructure), Apify (actor-based scraping platform with pre-built scrapers), and Oxylabs (data center and residential proxies with a Web Scraper API).

TL;DR

ScrapingBee is the simplest entry point — one API call returns rendered HTML, no proxy management required, starting at $49/month. Bright Data has the most capable and expensive infrastructure — the largest residential proxy network, a scraping browser, and unlocker service for heavily protected sites. Apify is the right choice when you need pre-built scrapers for specific sites (Amazon, Google Maps, LinkedIn) or want to build and host custom scrapers. Oxylabs offers competitive proxy infrastructure and a structured Web Scraper API for SERP, e-commerce, and real estate data.

Key Takeaways

  • ScrapingBee charges $49/month for 150K API credits (1 basic request = 1 credit, 1 JS-rendered request = 5 credits, 1 premium proxy request = 25 credits).
  • Bright Data's Web Unlocker starts at $1.50/1,000 requests — the highest-reliability unlocker for heavily protected sites (Amazon, LinkedIn, Zillow).
  • Apify's free tier includes $5/month in compute credits, with paid plans starting at $49/month (25K compute units/month).
  • Oxylabs Web Scraper API starts at $49 for 17,500 results ($2.80/1,000) for SERP and e-commerce structured data extraction.
  • Residential proxies vs. data center proxies: Residential IPs appear to come from real users' ISPs — much harder to block but more expensive. Data center IPs are cheaper but blocked by sophisticated anti-bot systems.
  • Headless browser vs. plain HTTP: Headless browsers (Chrome) render JavaScript before returning HTML — required for SPA pages but 5-10x more expensive per request than plain HTTP.
  • Legal considerations: Web scraping legality varies by jurisdiction and target site's terms of service. Public data scraping is generally permitted; scraping behind login walls or accessing personal data has greater legal risk.

Pricing Comparison

| Platform | Free Tier | Paid Starting | Per 1,000 Requests |
| --- | --- | --- | --- |
| ScrapingBee | 1,000 credits trial | $49/month | ~$0.33-$24.50 (1 to 75 credits per request) |
| Bright Data Web Unlocker | Trial | $1.50/1,000 requests | $1.50 |
| Apify | $5 credits/month | $49/month | Varies by compute time |
| Oxylabs Web Scraper | No | $49/17.5K results | $2.80 |

ScrapingBee

Best for: Developers new to scraping, simple API integration, JavaScript rendering, no proxy management

ScrapingBee abstracts all scraping infrastructure behind a single REST API endpoint. You send a GET request with your target URL and parameters; ScrapingBee handles proxy rotation, browser rendering, and CAPTCHA solving, then returns HTML or JSON.

Pricing

| Plan | Cost | Credits/Month |
| --- | --- | --- |
| Freelance | $49/month | 150,000 |
| Startup | $99/month | 500,000 |
| Business | $249/month | 1,500,000 |
| Enterprise | $599/month | 5,000,000 |

Credit costs per request type:

  • Standard (no JS): 1 credit
  • JavaScript rendering: 5 credits
  • Premium proxies (residential): 25 credits
  • Stealth proxy (high-protection sites): 75 credits
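
Because every request type draws from the same credit pool, the effective price depends heavily on which type you use. The following is a small sketch of that arithmetic using only the plan and credit figures listed above; the function names are illustrative.

```python
# Plan price (USD/month) and monthly credits, as listed in the pricing table.
PLANS = {
    "Freelance": (49, 150_000),
    "Startup": (99, 500_000),
    "Business": (249, 1_500_000),
    "Enterprise": (599, 5_000_000),
}

# Credits consumed by one request of each type, as listed above.
CREDITS_PER_REQUEST = {"standard": 1, "js": 5, "premium": 25, "stealth": 75}


def requests_per_month(plan: str, request_type: str) -> int:
    """How many requests of one type a plan's monthly credits cover."""
    _, credits = PLANS[plan]
    return credits // CREDITS_PER_REQUEST[request_type]


def cost_per_thousand(plan: str, request_type: str) -> float:
    """Effective USD cost of 1,000 requests of one type on a plan."""
    price, credits = PLANS[plan]
    return price / credits * CREDITS_PER_REQUEST[request_type] * 1000


if __name__ == "__main__":
    for request_type in CREDITS_PER_REQUEST:
        count = requests_per_month("Freelance", request_type)
        cost = cost_per_thousand("Freelance", request_type)
        print(f"{request_type:>8}: {count:>7,} requests, ${cost:.2f} per 1,000")
```

On the Freelance plan this works out to 150,000 standard requests but only 2,000 stealth requests per month, so it is worth checking which request type a target site needs before picking a plan.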

API Integration

import requests

# Basic HTML scraping
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://example.com/products",
        "render_js": "false",       # No JavaScript rendering needed
    },
)
html = response.text

# JavaScript-rendered page
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://react-app.example.com/products",
        "render_js": "true",        # Render JavaScript
        "wait": "2000",             # Wait 2 seconds for JS to execute
        "wait_for": "#product-list",# Wait for element to appear
        "premium_proxy": "false",
    },
)
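
Since JS rendering and premium proxies cost more credits, one common pattern is to try the cheapest request type first and escalate only on failure. The sketch below illustrates that idea with the endpoint and parameters shown above. The `fetch_with_escalation` and `http_get` helpers are hypothetical names, the tier order is an assumption, and the standard library's `urllib` stands in for `requests` so the block has no third-party dependencies.

```python
import urllib.error
import urllib.parse
import urllib.request

API_ENDPOINT = "https://app.scrapingbee.com/api/v1/"

# Cheapest to most expensive; each tier only adds parameters shown above.
TIERS = [
    {"render_js": "false"},
    {"render_js": "true"},
    {"render_js": "true", "premium_proxy": "true"},
]


def http_get(url: str, params: dict) -> tuple[int, str]:
    """Send a GET request and return (status_code, body)."""
    query = urllib.parse.urlencode(params)
    try:
        with urllib.request.urlopen(f"{url}?{query}", timeout=60) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""


def fetch_with_escalation(target_url: str, api_key: str, send=http_get):
    """Try each tier in order and return (html, tier) for the first HTTP 200."""
    last_status = None
    for tier in TIERS:
        params = {"api_key": api_key, "url": target_url, **tier}
        status, body = send(API_ENDPOINT, params)
        if status == 200:
            return body, tier
        last_status = status
    raise RuntimeError(f"All tiers failed for {target_url} (last status: {last_status})")
```

The `send` parameter exists so the escalation logic can be exercised with a stub instead of real API calls.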

JavaScript Snippet Execution

# Execute custom JavaScript before capturing the page
response = requests.get(
    "https://app.scrapingbee.com/api/v1/",
    params={
        "api_key": "your-api-key",
        "url": "https://example.com/modal-page",
        "render_js": "true",
        # Execute JS to close modals before capture
        "js_snippet": "document.querySelector('.cookie-modal')?.remove(); document.querySelector('.signup-overlay')?.remove();",
    },
)

When to Choose ScrapingBee

Developers building a scraper for the first time who want to skip proxy and browser management, applications that need occasional scraping of public pages (product pricing, news articles, public listings), or teams that want a simple pay-per-credit model without the complexity of proxy pool management.

Bright Data

Best for: Enterprise infrastructure, heavily protected sites, largest residential proxy network

Bright Data operates the largest commercial proxy network — 72M+ residential IPs across 195 countries. The platform serves enterprise data collection teams that need to bypass sophisticated anti-bot measures on the most protected sites on the internet (Amazon, LinkedIn, Zillow, Google).

Products and Pricing

| Product | Use Case | Starting Price |
| --- | --- | --- |
| Web Unlocker | Bypass anti-bot, CAPTCHA solving | $1.50/1,000 requests |
| Scraping Browser | Full browser automation | $1.50-$10/1,000 requests |
| Residential Proxies | IP rotation network | $8.50/GB |
| SERP API | Structured Google results | $1.50/1,000 results |
| Datasets | Pre-collected data | Custom |

Web Unlocker API

import requests

# Web Unlocker — handles anti-bot, CAPTCHA, fingerprinting automatically
proxies = {
    "http": "http://brd-customer-hl_ACCOUNT_ID-zone-unlocker:PASSWORD@brd.superproxy.io:22225",
    "https": "http://brd-customer-hl_ACCOUNT_ID-zone-unlocker:PASSWORD@brd.superproxy.io:22225",
}

# Point any HTTP client at Bright Data's proxy
response = requests.get(
    "https://www.amazon.com/dp/B09XHVJ6Z3",
    proxies=proxies,
    verify=False,  # Skips TLS verification; alternatively pass the path to Bright Data's CA certificate via verify=
)

html = response.text
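
Hard-coding credentials in the proxy URL is easy to get wrong, especially when the password contains characters such as `@` or `:`. The sketch below builds the same proxy URL from separate values and percent-encodes them. The username layout, host, and port follow the example above; the helper names and the `BRIGHTDATA_ACCOUNT_ID` / `BRIGHTDATA_PASSWORD` environment variables are assumptions made for illustration.

```python
import os
from urllib.parse import quote


def build_proxy_url(
    account_id: str,
    zone: str,
    password: str,
    host: str = "brd.superproxy.io",
    port: int = 22225,
) -> str:
    """Build a proxy URL with percent-encoded credentials."""
    username = f"brd-customer-hl_{account_id}-zone-{zone}"
    # safe="" also encodes "/", "@", and ":" so they cannot break URL parsing.
    return f"http://{quote(username, safe='')}:{quote(password, safe='')}@{host}:{port}"


def proxies_from_env(zone: str = "unlocker") -> dict:
    """Read credentials from (hypothetical) environment variables."""
    url = build_proxy_url(
        os.environ["BRIGHTDATA_ACCOUNT_ID"],
        zone,
        os.environ["BRIGHTDATA_PASSWORD"],
    )
    # The same proxy URL is used for both schemes, as in the example above.
    return {"http": url, "https": url}
```

The returned dictionary can be passed as the `proxies` argument in the request shown above.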

Scraping Browser (Playwright)

import asyncio

from playwright.async_api import async_playwright

# Placeholders: substitute your own Bright Data account ID and zone password
ACCOUNT_ID = "ACCOUNT_ID"
PASSWORD = "PASSWORD"


async def main():
    async with async_playwright() as pw:
        # Connect Playwright to Bright Data's scraping browser
        # Bright Data handles fingerprinting, CAPTCHA, and anti-bot
        browser = await pw.chromium.connect_over_cdp(
            f"wss://brd-customer-hl_{ACCOUNT_ID}-zone-scraping_browser:{PASSWORD}@brd.superproxy.io:9222"
        )

        page = await browser.new_page()
        await page.goto("https://www.linkedin.com/jobs/", timeout=60000)

        # Interact with page as normal Playwright code
        jobs = await page.query_selector_all(".job-card-container")
        for job in jobs:
            title = await job.query_selector(".job-card-list__title")
            if title:  # Skip cards that have no title element
                print(await title.inner_text())

        await browser.close()


# "async with" is only valid inside a coroutine, so run everything through asyncio
asyncio.run(main())

When to Choose Bright Data

Enterprise data collection teams scraping heavily protected sites (LinkedIn, Amazon, Zillow), organizations that need the highest-quality residential proxy coverage with 72M+ IPs, or teams building production-scale scrapers where reliability on protected sites justifies the premium pricing.

Apify

Best for: Pre-built scrapers for specific sites, actor marketplace, full-stack scraping platform

Apify is not just a proxy service; it's a full-stack web scraping platform. The Apify Store contains hundreds of pre-built "actors" (scrapers) for specific sites, including Amazon products, Google Maps reviews, Instagram profiles, YouTube channels, and LinkedIn companies. You can run these actors without writing code, or build and host your own.

Pricing

| Plan | Cost | Compute Units/Month |
| --- | --- | --- |
| Free | $0 | $5 equivalent |
| Starter | $49/month | 25,000 CUs |
| Scale | $149/month | 100,000 CUs |
| Business | $499/month | 400,000 CUs |

Compute units are consumed based on memory × time. Apify defines one compute unit as 1 GB of memory used for one hour, so a 1 GB scraper running for 1 minute consumes about 0.017 CU (1/60).

Pre-Built Actors (No-Code)

# Run Amazon scraper via Apify CLI (log in once, then call the actor on the platform)
npx apify-cli login --token YOUR_APIFY_TOKEN
npx apify-cli call apify/amazon-scraper \
  --input='{"search": "bluetooth headphones", "maxResults": 100}'

# Or via API
curl -X POST "https://api.apify.com/v2/acts/apify~amazon-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "search": "bluetooth headphones",
    "maxResults": 100
  }'
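
The same API call can be made from Python. The sketch below mirrors the curl request above: it swaps the `/` in the actor name for `~` as the URL requires, and separates building the request from sending it. The `build_run_request` and `start_run` names are hypothetical, and the standard library is used in place of an Apify client so the block stays self-contained.

```python
import json
import urllib.request


def build_run_request(actor_id: str, token: str, actor_input: dict) -> dict:
    """Describe the POST request that starts an actor run."""
    # "apify/amazon-scraper" becomes "apify~amazon-scraper" in the URL path.
    api_actor_id = actor_id.replace("/", "~")
    return {
        "url": f"https://api.apify.com/v2/acts/{api_actor_id}/runs",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(actor_input).encode("utf-8"),
    }


def start_run(actor_id: str, token: str, actor_input: dict) -> dict:
    """Send the request and return the parsed JSON response."""
    spec = build_run_request(actor_id, token, actor_input)
    request = urllib.request.Request(
        spec["url"], data=spec["body"], headers=spec["headers"], method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Keeping the request construction in its own function makes it possible to check the URL and payload without touching the network.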

Custom Actor (Playwright)

// src/main.ts — Apify actor
import { Actor } from "apify";
import { PlaywrightCrawler } from "crawlee";

await Actor.init();

const crawler = new PlaywrightCrawler({
  async requestHandler({ page, request, pushData }) {
    const title = await page.title();

    // Extract data from page
    const products = await page.$$eval(".product-item", (items) =>
      items.map((item) => ({
        name: item.querySelector(".name")?.textContent?.trim(),
        price: item.querySelector(".price")?.textContent?.trim(),
        url: item.querySelector("a")?.href,
      }))
    );

    await pushData({ url: request.url, title, products });
  },
});

await crawler.run(["https://example.com/products"]);

await Actor.exit();

# Deploy your custom actor to Apify cloud
npx apify-cli push

# Run the deployed actor on demand (schedules are configured on the Apify platform)
npx apify-cli call --memory=1024

When to Choose Apify

Teams that need pre-built scrapers for specific popular sites (Amazon, Google Maps, LinkedIn, Instagram) without writing custom code, developers who want a managed platform for hosting and scheduling scrapers, or organizations that need a marketplace of ready-to-use data extraction actors.

Oxylabs

Best for: SERP and e-commerce structured data, residential proxy network, data center proxies

Oxylabs is Bright Data's main competitor at the enterprise level: a large residential proxy network alongside purpose-built Web Scraper APIs that return structured data (JSON) rather than raw HTML.

Products and Pricing

| Product | Starting Price | Output |
| --- | --- | --- |
| SERP API | $49/17,500 results | Structured JSON from Google |
| E-Commerce Scraper API | $49/17,500 results | Product data JSON |
| Web Scraper API | $49/17,500 results | General HTML or JSON |
| Residential Proxies | $9/GB | Raw proxy |
| Data Center Proxies | $1.30/GB | Raw proxy |

SERP API

import requests

# Oxylabs SERP API — structured Google results
payload = {
    "source": "google_search",
    "query": "python web scraping",
    "domain": "com",
    "geo_location": "United States",
    "locale": "en-us",
    "parse": True,   # Return structured JSON, not raw HTML
    "pages": 3,
}

response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("YOUR_USERNAME", "YOUR_PASSWORD"),
    json=payload,
)

data = response.json()
results = data["results"][0]["content"]["results"]["organic"]
for result in results:
    print(f"{result['pos']}. {result['title']}: {result['url']}")
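
The request above asks for three pages, but the loop reads only `data["results"][0]`. Assuming the response carries one `results` entry per requested page, each with the same `content` → `results` → `organic` nesting used above, a small helper can collect the organic results from every page. The `collect_organic` name is illustrative, and the response layout should be checked against a real payload.

```python
def collect_organic(data: dict) -> list[dict]:
    """Gather organic results from every page entry in a parsed response."""
    organic = []
    for page_result in data.get("results", []):
        content = page_result.get("content")
        # Content may not be a dict (for example raw HTML when parsing is off).
        if not isinstance(content, dict):
            continue
        organic.extend(content.get("results", {}).get("organic", []))
    return organic


if __name__ == "__main__":
    # Hand-written sample in the assumed shape, not real API output.
    sample = {
        "results": [
            {"content": {"results": {"organic": [{"pos": 1, "title": "A", "url": "https://a.example"}]}}},
            {"content": {"results": {"organic": [{"pos": 1, "title": "B", "url": "https://b.example"}]}}},
        ]
    }
    for item in collect_organic(sample):
        print(f"{item['pos']}. {item['title']}: {item['url']}")
```

Using `.get` with defaults means a page with no organic results is skipped instead of raising a `KeyError`.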

E-Commerce Scraper

import requests

# Extract structured product data from Amazon
payload = {
    "source": "amazon_product",
    "query": "B09XHVJ6Z3",  # ASIN
    "parse": True,
}

response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

product = response.json()["results"][0]["content"]
print(f"Title: {product['title']}")
print(f"Price: {product['price']}")
print(f"Rating: {product['rating']}")
print(f"Reviews: {product['reviews_count']}")

When to Choose Oxylabs

Teams that need structured data extraction (JSON output) from e-commerce and SERP sources without parsing HTML, organizations that need enterprise-grade proxy infrastructure as a Bright Data alternative, or projects that want residential proxies at a price comparable to Bright Data's ($9/GB vs $8.50/GB).

Choosing the Right Tool

| Scenario | Recommended |
| --- | --- |
| First-time scraper, simple pages | ScrapingBee |
| Pre-built Amazon/Google scraper | Apify |
| Custom scraper, managed infrastructure | Apify |
| Heavily protected sites (LinkedIn) | Bright Data Web Unlocker |
| Largest residential proxy pool | Bright Data |
| SERP data extraction (structured) | Oxylabs SERP API |
| E-commerce price monitoring | Oxylabs or ScrapingBee |
| Enterprise volume, full control | Bright Data |
| No-code scraping | Apify (pre-built actors) |

Self-Hosted Alternative

For teams with engineering capacity:

// Crawlee + Playwright — self-hosted scraper
import { PlaywrightCrawler, Dataset } from "crawlee";

const crawler = new PlaywrightCrawler({
  // Crawlee handles browser pool, request queue, retry logic
  maxConcurrency: 10,

  async requestHandler({ page, request, enqueueLinks, pushData }) {
    const title = await page.title();
    const data = await page.evaluate(() => {
      return Array.from(document.querySelectorAll(".product")).map(p => ({
        name: p.querySelector(".name")?.textContent,
        price: p.querySelector(".price")?.textContent,
      }));
    });

    await pushData({ url: request.url, title, data });
    await enqueueLinks({ selector: "a.pagination-next" }); // Follow pagination
  },
});

await crawler.run(["https://example.com/products"]);

Self-hosted costs: VPS for browser compute + residential proxy subscription. Viable for teams with engineering capacity; Crawlee (by Apify) is open-source.

Verdict

ScrapingBee is the right starting point for most development teams — one API, clear credit pricing, JavaScript rendering included, no proxy management.

Bright Data is the enterprise choice when you need to reliably scrape the most protected sites at scale. The 72M+ residential IP pool and the Scraping Browser give you capabilities that smaller networks can't match.

Apify is the choice when you want the scraping work already done — the actor marketplace provides production-ready scrapers for the most common sites, deployable without writing any code.

Oxylabs is the competitive alternative to Bright Data for e-commerce and SERP data, with structured JSON output APIs that skip the HTML parsing step.


