
Puppeteer Alternative for Screenshots — API vs DIY

If you're looking for a Puppeteer alternative for screenshots, the short answer is: a managed screenshot API produces the same PNG or JPEG of a rendered web page, without the operational overhead. The difference between self-hosted Puppeteer and an API is everything that happens around that output. One gives you total control and total responsibility. The other trades control for not having to think about Chrome processes, memory limits, or 3 AM OOM kills. Both are valid choices, and the right one depends on your situation, not on which blog post you read last.

I built screenshotrun on top of Playwright (Puppeteer's younger sibling), so I know exactly what running browser infrastructure costs in time and attention. I also know there are cases where self-hosting is the smarter move. This comparison of a screenshot API and self-hosted Puppeteer lays out the real tradeoffs so you can decide for yourself.

Week one is great, month three is not

Setting up Puppeteer takes an afternoon. You install the npm package, write a script, call page.screenshot(), and it works. The PNG comes back, the quality is fine, and you wonder why anyone would pay for an API when this took 40 lines of code.

Then the timeline starts:

Week 2-3: You notice some screenshots come back blank. The page uses client-side rendering and your script fires the capture before the content loads. You add waitUntil: 'networkidle0'. That fixes some pages and breaks others, because networkidle0 waits for zero in-flight requests, and on sites that hold connections open or poll constantly the navigation simply runs into its timeout. I wrote a whole post about blank white screenshots in Puppeteer after debugging this pattern across dozens of sites.
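One way around the network-idle problem is to wait for a content signal instead of a quiet network. The sketch below is a minimal illustration, not the approach from my other post: gotoWhenReady and readySelector are hypothetical names, and it assumes the target page has some element that reliably marks "content has rendered".

```javascript
// Sketch: navigate without relying on network idle alone.
// `page` is a Puppeteer Page; `readySelector` is a hypothetical selector
// that marks "content has rendered" on the target site.
async function gotoWhenReady(page, url, readySelector, timeoutMs = 30000) {
  // 'domcontentloaded' resolves even when the page keeps connections open.
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
  try {
    // Prefer an explicit content signal over a network heuristic.
    await page.waitForSelector(readySelector, { timeout: timeoutMs });
    return 'selector';
  } catch (err) {
    // The selector never appeared: let the caller decide whether to
    // capture anyway or treat this as a failure.
    return 'timeout';
  }
}
```

The return value lets a caller log or retry captures where the selector never showed up, rather than silently shipping a blank image.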

Month 2: Memory usage climbs. Each Chromium instance pulls 300-800 MB of RAM depending on the page complexity. If you're processing a queue, the old browser contexts don't always release cleanly. The process grows until the OS kills it. You add browser.close() in a finally block, then realize you also need to handle the case where browser.close() itself hangs.
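A hanging browser.close() can be handled by racing it against a timer and killing the Chromium process if the timer wins. This is a sketch under assumptions: closeBrowserSafely and the 5-second default are my own illustrative choices, and it relies on browser.process(), which returns null when the browser was attached with puppeteer.connect() rather than launched.

```javascript
// Sketch: close a Puppeteer browser, but never wait forever on it.
function forceKill(browser) {
  // browser.process() is the spawned Chromium child process, or null
  // when there is no local process to kill.
  const child = browser.process();
  if (child) {
    child.kill('SIGKILL');
    return 'killed';
  }
  return 'abandoned';
}

async function closeBrowserSafely(browser, timeoutMs = 5000) {
  let timer;
  const timedOut = new Promise(resolve => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    const outcome = await Promise.race([
      browser.close().then(() => 'closed'),
      timedOut,
    ]);
    return outcome === 'closed' ? 'closed' : forceKill(browser);
  } catch (err) {
    // close() rejected, for example because the connection already dropped.
    return forceKill(browser);
  } finally {
    clearTimeout(timer);
  }
}
```

Calling closeBrowserSafely(browser) in the finally block instead of a bare browser.close() keeps one stuck process from stalling the whole queue.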

Month 3: Cookie consent banners cover 40% of every European site you screenshot. You start maintaining a list of selectors to click "Accept" or dismiss the overlay. That list grows every week. GDPR banners use different frameworks (OneTrust, Cookiebot, custom implementations), and each one needs its own handling. I ended up building automatic cookie banner blocking into the API because maintaining those selectors by hand was eating hours every month.

Month 4+: You're spending 4-8 hours per month on maintenance. Not building features, not shipping product. Babysitting a Chrome process. One developer on DEV Community documented spending 40 hours and $2,700 in engineering time to save $116 on API costs. That's an extreme case, but the direction is right.

Same screenshot, different amount of code

The most honest comparison is code. Here's the same task in both approaches: take a full-page screenshot of a URL at a mobile viewport, block cookie banners, and wait for a specific element to load before capturing.

Puppeteer (self-hosted):

const puppeteer = require('puppeteer');

// Partial list of cookie-banner selectors; real sites need many more.
const COOKIE_BANNER_SELECTORS = [
  '#onetrust-banner-sdk',
  '.cc-banner',
  '#cookie-consent',
  '[class*="cookie-banner"]',
  '#CybotCookiebotDialog',
];

async function captureScreenshot(url) {
  const browser = await puppeteer.launch({
    // `true` works across Puppeteer versions; the transitional 'new'
    // value was only needed on some older releases.
    headless: true,
    args: [
      // Common in Docker, but disabling the sandbox weakens isolation.
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-gpu',
    ],
  });

  try {
    const page = await browser.newPage();
    // Mobile viewport
    await page.setViewport({ width: 390, height: 844 });

    // Remove cookie consent banners as soon as they are added to the DOM
    await page.evaluateOnNewDocument((selectors) => {
      const observer = new MutationObserver(() => {
        selectors.forEach(sel => {
          document.querySelectorAll(sel).forEach(el => el.remove());
        });
      });
      // Observe `document` itself: <html> may not exist yet when this runs.
      observer.observe(document, { childList: true, subtree: true });
    }, COOKIE_BANNER_SELECTORS);

    await page.goto(url, {
      waitUntil: 'networkidle2',
      timeout: 30000,
    });

    // Wait for specific element
    await page.waitForSelector('.main-content', { timeout: 10000 });

    // Extra delay for animations and lazy images
    await new Promise(r => setTimeout(r, 2000));

    const screenshot = await page.screenshot({
      fullPage: true,
      type: 'png',
    });

    return screenshot;
  } finally {
    // Can hang if Chromium is unresponsive; production code needs a timeout here.
    await browser.close();
  }
}

That's 45 lines, and it's still fragile. The cookie selector list is incomplete. The 2-second delay is a guess. There's no retry logic, no timeout handling for browser.close(), no concurrency management. In production, you'd add another 30-50 lines for error handling alone.
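To give a sense of what the missing retry logic adds, here is a minimal retry wrapper with exponential backoff. It is a sketch, not production code: withRetries, the three-attempt default, and the 500 ms base delay are illustrative assumptions, and it retries on every error rather than distinguishing permanent failures from transient ones.

```javascript
// Sketch: retry an async task with exponential backoff.
async function withRetries(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Pass the attempt number so the task can log or adjust timeouts.
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Delay doubles on each failure: 500 ms, 1000 ms, ...
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage with the function above:
//   const png = await withRetries(() => captureScreenshot(url));
```

Because captureScreenshot launches a fresh browser per call, each retry starts from a clean process, which is slow but avoids reusing a browser that is already in a bad state.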

Screenshot API (managed):

curl -X POST https://screenshotrun.com/api/v1/screenshots \
  -H "Authorization: Bearer sr_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "width": 390,
    "height": 844,
    "full_page": true,
    "format": "png",
    "block_cookies": true,
    "wait_for_selector": ".main-content",
    "delay": 2
  }'

One request with eight JSON fields replaces 45 lines of browser management code. Cookie blocking pulls from a maintained selector database instead of a hardcoded list. wait_for_selector handles the element wait, and the browser pool, memory management, and error recovery all happen on the API side.
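For a like-for-like comparison with the Puppeteer script, the same request can be made from Node.js. This is a sketch that mirrors the curl call above and assumes Node 18+ for the built-in fetch. The helper names are hypothetical, and since the response format isn't covered in this post, the function returns the raw Response for the caller to handle.

```javascript
const API_URL = 'https://screenshotrun.com/api/v1/screenshots';

// Build the fetch options; the fields mirror the curl example above.
function buildScreenshotRequest(apiKey, targetUrl) {
  return {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      url: targetUrl,
      width: 390,
      height: 844,
      full_page: true,
      format: 'png',
      block_cookies: true,
      wait_for_selector: '.main-content',
      delay: 2,
    }),
  };
}

async function requestScreenshot(apiKey, targetUrl) {
  const response = await fetch(API_URL, buildScreenshotRequest(apiKey, targetUrl));
  if (!response.ok) {
    throw new Error(`Screenshot request failed: HTTP ${response.status}`);
  }
  // How the image is delivered is not specified here, so return the raw Response.
  return response;
}
```

Keeping the request builder separate from the network call makes the payload easy to unit test without spending API credits.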

Headless Chrome vs screenshot API: feature comparison

| Capability | Self-hosted Puppeteer/Playwright | Managed screenshot API |
| --- | --- | --- |
| Viewport and device emulation | Full control, any size | Any size via parameters |
| Full-page capture | Built-in, but lazy images often blank | Handled with auto-scroll and wait logic |
| Cookie banner blocking | DIY selector list (you maintain it) | Built-in, maintained database |
| Wait conditions | waitForSelector, networkidle | wait_for_selector, delay |
| Login / authenticated pages | Full cookie and session control | Limited (API key auth only) |
| PDF generation | page.pdf() | Via format parameter |
| Custom JavaScript injection | Full page.evaluate() access | Not supported |
| Network interception | Full request/response control | Not supported |
| Geo-targeted captures | Requires proxy setup | Depends on API provider |
| Concurrency | Limited by RAM (3-5 per GB) | Handled by API infrastructure |
| Retries on failure | You build it | Built-in |
| Output formats | PNG, JPEG, WebP, PDF | PNG, JPEG, WebP, PDF |

The self-hosted column wins on flexibility. If you need to inject JavaScript, intercept network requests, or handle complex login flows with multi-step authentication, Puppeteer gives you access to the full browser API. A managed API can't match that level of control.

Total cost at different volumes

Cost comparisons that ignore engineering time are misleading. A $5/month server sounds cheap until you account for the 4-8 hours per month you spend keeping Chrome alive on it. I'm using $75/hour as a conservative engineering cost (adjust for your situation).

| Monthly volume | Self-hosted cost | API cost (typical) | Notes |
| --- | --- | --- | --- |
| 1,000 screenshots | $5 server + $300-600 eng. time | $0-15 | API wins. Self-hosting is all overhead at this volume. |
| 5,000 screenshots | $20 server + $300-600 eng. time | $25-50 | Roughly breakeven on dollars, but API saves the eng. hours. |
| 10,000 screenshots | $40 server + $300-600 eng. time | $50-100 | Self-hosting starts to make financial sense if you have DevOps capacity. |
| 50,000 screenshots | $100-200 server + $450-900 eng. time | $200-500 | Self-hosting wins on cost if your infra team already exists. |
| 100,000+ screenshots | $200-500 server + $600-1200 eng. time | $500-1000+ | At this scale, a dedicated rendering cluster pays for itself. |

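The engineering time doesn't scale linearly with volume, but it doesn't go to zero either. At 50,000+ captures per month, the server bill for self-hosting ($100-200) sits well below typical API pricing ($200-500), so self-hosting wins when the maintenance hours are absorbed by an infrastructure team you already pay for. If you have to count those hours at full price, the API stays competitive even at that volume. Below 10,000 per month, the math almost always favors an API.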
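To make the comparison concrete, the table can be turned into a small calculation that totals self-hosting two ways: with engineering time counted at $75/hour, and with it treated as already paid for. The dollar ranges come straight from the table. The hour ranges are back-calculated from the engineering cost divided by $75, and the "already paid for" view is a modelling assumption of this sketch.

```javascript
const HOURLY_RATE_USD = 75; // engineering rate used in the table above

// [low, high] ranges copied from the table; hours = eng. cost / $75.
const rows = [
  { volume: 1000,   server: [5, 5],     engHours: [4, 8],  api: [0, 15] },
  { volume: 5000,   server: [20, 20],   engHours: [4, 8],  api: [25, 50] },
  { volume: 10000,  server: [40, 40],   engHours: [4, 8],  api: [50, 100] },
  { volume: 50000,  server: [100, 200], engHours: [6, 12], api: [200, 500] },
  { volume: 100000, server: [200, 500], engHours: [8, 16], api: [500, 1000] },
];

// Monthly self-hosting cost as a [low, high] range in USD.
function selfHostedRange(row, countEngineering) {
  return row.server.map((serverUsd, i) =>
    serverUsd + (countEngineering ? row.engHours[i] * HOURLY_RATE_USD : 0));
}

function summarize(row) {
  return {
    volume: row.volume,
    selfHostedFull: selfHostedRange(row, true),        // server + eng. time
    selfHostedServerOnly: selfHostedRange(row, false), // eng. time already paid for
    api: row.api,
  };
}

for (const row of rows) {
  console.log(summarize(row));
}
```

For the 50,000 row this gives $550-1,100 with engineering time counted and $100-200 without it, against $200-500 for the API, which is why the "if your infra team already exists" condition in the notes column matters so much.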

I go deeper into the financial breakdown in my post on when to build and when to buy.

The serverless trap

Running Puppeteer on AWS Lambda or Vercel serverless functions sounds like the best of both worlds: no server to manage, pay-per-invocation pricing, automatic scaling. In practice, it's the worst of both.

Chromium doesn't fit comfortably in a serverless environment. The Docker image runs 600-900 MB. Cold starts take 2-4 seconds before your script even begins executing. The chrome-aws-lambda package that made this workable has been abandoned by its maintainer. Its successor, @sparticuz/chromium, works but requires careful version pinning and has its own set of quirks.

Lambda's default memory setting of 128 MB is nowhere near enough for Chromium. You'll bump it to 1-2 GB, which changes the pricing math entirely. And you still get the headless Chrome connection errors in Docker that plague containerized Chromium everywhere, plus Lambda-specific issues like /tmp (512 MB by default) running out of space when Chromium writes its cache.

If you want serverless-style pricing without running Chrome yourself, that's what a managed screenshot API offers over Puppeteer on Lambda: the same result without the container pain.

Security: the part nobody mentions

When you run Puppeteer on your server and accept URLs from users (or from your own queue that processes user-submitted data), you're running a full browser that can make requests to any address. That includes your internal network.

SSRF (Server-Side Request Forgery) through a headless browser is a real attack vector. Someone submits http://169.254.169.254/latest/meta-data/ as the URL, and your Puppeteer instance happily screenshots your AWS instance metadata, including IAM credentials. Or they submit a URL pointing to your internal admin panel.

Sandboxing this properly means URL validation, network-level restrictions, running Chrome in a separate network segment, and keeping up with Chrome's own sandbox escape CVEs. A managed API handles this isolation for you, which is one less attack surface to maintain.
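The URL validation step can be sketched as a pre-flight check that resolves the hostname and rejects private, loopback, and link-local addresses. This is a minimal illustration with hypothetical function names, and it is only the first layer: it does not cover redirects, DNS rebinding, or requests the page itself makes after loading, which is why the network-level restrictions above are still needed.

```javascript
const net = require('node:net');
const dns = require('node:dns/promises');

function isPrivateIPv4(ip) {
  const [a, b] = ip.split('.').map(Number);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 100 && b >= 64 && b <= 127) || // carrier-grade NAT
    (a === 169 && b === 254) ||           // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

function isBlockedAddress(ip) {
  if (net.isIPv4(ip)) return isPrivateIPv4(ip);
  if (net.isIPv6(ip)) {
    const lower = ip.toLowerCase();
    const mapped = lower.match(/^::ffff:(\d+\.\d+\.\d+\.\d+)$/);
    if (mapped) return isPrivateIPv4(mapped[1]); // IPv4-mapped IPv6
    // Loopback, unspecified, unique-local (fc00::/7), link-local (fe80::/10)
    return lower === '::1' || lower === '::' ||
      /^f[cd]/.test(lower) || /^fe[89ab]/.test(lower);
  }
  return true; // not an IP address at all: refuse
}

async function assertPublicUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error('Only http and https URLs are allowed');
  }
  const host = url.hostname.replace(/^\[|\]$/g, ''); // strip IPv6 brackets
  const addresses = net.isIP(host)
    ? [host]
    : (await dns.lookup(host, { all: true })).map(record => record.address);
  if (addresses.some(isBlockedAddress)) {
    throw new Error(`Refusing to capture internal address for ${host}`);
  }
  return url;
}
```

Calling await assertPublicUrl(url) before page.goto() stops the metadata-endpoint example above, but treat it as a filter in front of real network isolation rather than a replacement for it.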

When self-hosting is the right call

I'd choose self-hosted Puppeteer or Playwright in these situations:

Authenticated pages are the clearest case. If the target requires login cookies, session tokens, or multi-step authentication, you need full browser control. APIs can't handle your custom auth flow.

JavaScript injection and network interception fall into the same bucket. Testing flows that require page.evaluate() to modify the page, or intercepting requests to mock API responses, need the full Puppeteer/Playwright API. Visual regression testing in CI/CD pipelines often falls into this category, where you're running Playwright against your own staging environment.

At high volumes — 50,000+ captures per month — a dedicated rendering cluster with proper monitoring costs less than API pricing, provided you have the DevOps capacity to maintain it.

And at the other extreme, fewer than 100 screenshots per month for personal use, a quick script is simpler than signing up for anything.
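As an illustration of the network interception case, here is roughly what mocking an API response looks like in Puppeteer before taking a screenshot. It is a sketch: mockJsonRoutes and the routes fixture are hypothetical, and it matches on URL substrings only, which is cruder than most real test setups.

```javascript
// Sketch: answer selected requests with canned JSON, pass everything else through.
// `routes` maps a URL fragment to the payload to return (hypothetical fixture).
async function mockJsonRoutes(page, routes) {
  await page.setRequestInterception(true);
  page.on('request', request => {
    const match = Object.keys(routes)
      .find(fragment => request.url().includes(fragment));
    if (match) {
      // Fulfil the request locally instead of hitting the real backend.
      request.respond({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify(routes[match]),
      });
    } else {
      request.continue();
    }
  });
}

// Usage, before page.goto():
//   await mockJsonRoutes(page, { '/api/user': { name: 'Test User' } });
```

This is the kind of control a URL-in, image-out API doesn't expose, which is why test pipelines that depend on it stay self-hosted.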

When a Puppeteer alternative for screenshots makes more sense

I'd reach for an API in these cases:

Production features that serve end users are the strongest case. Link previews, social cards, PDF reports, thumbnail generation. These features need to work reliably at unpredictable volumes. A page that doesn't fully load before capture means a broken experience for your user, not just a failed test.

Small teams without dedicated DevOps benefit the most. If your team is two backend developers and a designer, spending 4-8 hours per month on Chrome infrastructure is time you can't afford. The API turns that into zero hours.

AI agent tooling is another natural fit. Agents need reliable, fast screenshot capture as a tool call. They don't benefit from the flexibility of page.evaluate(). They need a URL in, image out, every time. I wrote about this pattern in detail for feeding screenshots to AI agents.

Third-party sites you don't control are where self-hosting hurts the most. These sites change their markup, add new cookie frameworks, update their anti-bot detection. A managed API absorbs that maintenance while your code stays the same.

The hybrid approach: Puppeteer in dev, API in production

This is what I actually recommend for most teams, and it's what I don't see anyone else suggesting. Use both.

Run Playwright in your CI pipeline for visual regression tests against your own staging environment. You control the pages, you need JavaScript injection for test setup, and the volume is low (a few hundred captures per deploy). Playwright is the right tool here.

Use a screenshot API for your production features. Link previews, OG image generation, customer-facing reports, competitor monitoring dashboards, AI agent integrations. These need reliability at variable scale, and the API handles cookie banners, wait conditions, and rendering quirks across thousands of different sites you've never seen before.

The two approaches don't compete. They solve different problems. Your CI pipeline and your production screenshot feature have different requirements, and pretending one tool serves both is how you end up with a fragile Puppeteer cluster in production or an expensive API bill for your test suite.

Puppeteer and Playwright, not just Puppeteer

Most comparison pages only mention Puppeteer, but Playwright has largely replaced it for new projects. Playwright supports Chromium, Firefox, and WebKit from a single API. Its auto-wait mechanism reduces the timing issues that cause Puppeteer's "Target closed" protocol errors. And Microsoft's backing means it gets consistent updates.

If you do go the self-hosted route, start with Playwright, not Puppeteer. The API is cleaner, the documentation is better, and you'll spend less time fighting the browser lifecycle. Everything in the "self-hosting" sections above applies to both libraries. The tradeoffs against a managed API are the same regardless of which one you pick.

Most teams hit the same wall at month three

The honest answer is that there's no universal right choice. If you have the infrastructure team, the time budget, and requirements that need full browser control, self-hosting gives you flexibility that no API can match. If you want screenshots as a feature in your product without adopting Chrome as a dependency, an API removes an entire category of operational work.

Most projects I've seen land somewhere in the middle. They start with a quick Puppeteer script, hit the maintenance wall around month three, and either invest in proper infrastructure or replace Puppeteer with a screenshot API. Knowing that timeline in advance lets you make the choice deliberately instead of reactively.


Frequently asked questions

When is self-hosted Puppeteer or Playwright the better choice?

Self-hosted Puppeteer or Playwright is the better choice when you need to screenshot pages behind authentication (custom login flows, session cookies), when you need JavaScript injection or network interception for testing, when your volume exceeds 50,000 captures per month and you have DevOps capacity to maintain the infrastructure, or when you're capturing a handful of screenshots for personal use. In these cases, the full browser API gives you control that a managed service can't match.

How does the cost of self-hosted Puppeteer compare with a screenshot API?

At low volumes (under 5,000/month), self-hosted Puppeteer costs $5-20 in server fees plus $300-600 in monthly engineering time for maintenance: memory leaks, cookie banner updates, Chrome crashes. A screenshot API costs $0-50 for the same volume with zero maintenance. The breakeven point is around 10,000-50,000 captures per month, depending on whether you have existing DevOps capacity. Above 50,000/month, self-hosting can be cheaper on pure infrastructure cost.

Can I run Puppeteer on AWS Lambda or other serverless platforms?

You can, but it comes with significant friction. Chromium Docker images run 600-900 MB, cold starts add 2-4 seconds, and the widely used chrome-aws-lambda package has been abandoned. Lambda's default 128 MB memory setting is far too small for Chromium, so you'll need 1-2 GB allocations, which changes the pricing math. The successor package (@sparticuz/chromium) works but requires careful version pinning. For serverless-style pricing without managing Chrome, a screenshot API is a more practical fit.

Should I use Puppeteer or Playwright for screenshots?

For new projects, Playwright is the stronger choice. It supports Chromium, Firefox, and WebKit from a single API, has better auto-wait mechanisms that reduce timing-related blank screenshots, and gets consistent updates from Microsoft. Playwright's API is cleaner for screenshot work specifically. That said, the tradeoffs between self-hosting and using a managed API are identical for both libraries. The maintenance burden of running browser infrastructure is the same regardless of which library controls the browser.

What are the security risks of running Puppeteer on my own servers?

The main risk is SSRF (Server-Side Request Forgery). When your Puppeteer instance accepts URLs from users or user-influenced queues, it can be tricked into requesting internal network addresses, including AWS instance metadata endpoints that expose IAM credentials, or internal admin panels. Proper sandboxing requires URL validation, network-level restrictions, running Chrome in an isolated network segment, and staying current on Chrome sandbox escape CVEs. A managed screenshot API handles this isolation on its infrastructure, removing that attack surface from your servers.