
How to cache screenshots and stop paying for the same capture twice

About 30-40% of screenshot API requests are duplicates — same URL, same parameters, same result. Here's how I built caching into screenshotrun, plus three strategies you can use on your side to cut your API bill and speed up delivery: TTL-based caching, content hashing, and event-driven refresh via webhooks. Code examples in PHP/Laravel and Node.js included.


Every time I look at screenshotrun logs, same story: about 30-40% of requests are repeated captures of the same URLs. Same site, same parameters, same result. But each one spins up a new Playwright render, creates a new file, burns through more credits.

When I was building screenshotrun, caching was one of the first things I added. Not because it's hard to implement — but because without it, the API turns into an expensive toy for both me and my users.

In this article I'll walk through how I set up caching on the API side, what caching strategies actually work on the client side, and share code you can drop into your own project.

Why cache screenshots at all

Taking a website screenshot is not a cheap operation. Every request means launching a headless browser, loading the page, waiting for the render to finish, converting to PNG or JPEG. That takes anywhere from 2 to 10 seconds depending on the page.

If your app shows website previews in a link directory or generates OG images, every visitor triggers that whole process from scratch. A hundred visitors per hour means a hundred identical renders. Your API bill goes up, response times get worse, and the output is the same every time.

Caching fixes this on multiple fronts: it saves API credits, brings response times down from seconds to milliseconds, and takes pressure off the entire chain.

Two levels of cache: API-side and client-side

I think of screenshot caching as two separate layers, and both are worth using.

API-side means the screenshot API itself stores the result and returns it again if the parameters match. In screenshotrun, I built this with a cache_ttl parameter. You pass a number of seconds, and during that window any repeat request with the same parameters returns the cached file — no new render.

Here's a request with a 24-hour cache:

curl "https://screenshotrun.com/api/v1/screenshots/capture?url=https://example.com&format=png&cache_ttl=86400&response_type=image" \
  -H "Authorization: Bearer sk_live_your_key"

The first call renders the screenshot and stores it. Every call after that within the next 24 hours returns the cached version. Same file, but the response comes back in 100-200ms instead of 3-5 seconds.

Client-side means you save the screenshot yourself after the first request and stop calling the API entirely. This gives you full control — you decide when to refresh, where to store, and how to serve files to your users.

Both layers work well together. API cache protects against duplicates, while your own cache removes the API from the chain completely for screenshots you already have.

Strategy 1: TTL cache (the simplest approach)

TTL stands for time to live. The idea is straightforward: a screenshot lives for N seconds, then it's considered stale. For most use cases, this is enough.

Here's how I'd implement this on the client side with a database:

// Laravel: check if we have a fresh cache entry
$cached = ScreenshotCache::where('url', $url)
    ->where('params_hash', md5(json_encode($params)))
    ->where('expires_at', '>', now())
    ->first();
 
if ($cached) {
    return $cached->file_path; // serve from storage
}
 
// No cache or expired — make the API call
$response = Http::withToken(config('services.screenshotrun.key'))
    ->get('https://screenshotrun.com/api/v1/screenshots/capture', [
        'url' => $url,
        'format' => 'png',
        'width' => 1280,
        'response_type' => 'image',
    ]);
 
// Save the file
$path = "screenshots/" . md5($url . json_encode($params)) . ".png";
Storage::disk('s3')->put($path, $response->body());
 
// Write to the cache table
ScreenshotCache::updateOrCreate(
    ['url' => $url, 'params_hash' => md5(json_encode($params))],
    ['file_path' => $path, 'expires_at' => now()->addHours(24)]
);
 
return $path;

The important part here is params_hash. Screenshots of the same URL at width 1280 and width 1920 are two different screenshots. Hashing the parameters makes sure the cache only hits when everything matches exactly.
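In Node.js, the same key-building logic might look like this — a sketch, with one extra precaution the Laravel snippet skips: sorting parameter keys first, since two objects with the same keys in a different order describe the same screenshot but JSON.stringify serializes them differently.

```javascript
import crypto from 'crypto';

// Build a cache key from the URL plus the capture parameters.
// Keys are sorted first so {width: 1280, format: 'png'} and
// {format: 'png', width: 1280} produce the same hash.
function cacheKey(url, params = {}) {
  const sorted = Object.fromEntries(
    Object.entries(params).sort(([a], [b]) => a.localeCompare(b))
  );
  return crypto
    .createHash('md5')
    .update(url + JSON.stringify(sorted))
    .digest('hex');
}
```

Same URL at a different width produces a different key, so a 1280px and a 1920px capture never collide in the cache.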

The migration is a simple table:

Schema::create('screenshot_caches', function (Blueprint $table) {
    $table->id();
    $table->string('url', 2048);
    $table->string('params_hash', 32)->index();
    $table->string('content_hash', 32)->nullable(); // used by the content-hash strategy below
    $table->string('file_path');
    $table->timestamp('expires_at')->index();
    $table->timestamps();
 
    // Note: on MySQL with utf8mb4, a unique index over a 2048-char column
    // exceeds InnoDB's key length limit — index a hash of the URL instead
    // if you run into that.
    $table->unique(['url', 'params_hash']);
});

The index on expires_at comes in handy later when you need to clean up stale entries on a schedule.

Strategy 2: Content hash — only re-capture when the site changes

TTL works well, but sometimes you want to be more precise. Why re-render a screenshot after 24 hours if the site hasn't changed in a week?

I tried a different approach: check whether the page content actually changed before triggering a new render. The idea is to hash the HTML and compare it against what you have stored.

// Get the current content hash
$html = Http::get($url)->body();
$contentHash = md5($html);
 
$cached = ScreenshotCache::where('url', $url)
    ->where('params_hash', md5(json_encode($params)))
    ->first();
 
// Content hasn't changed — cache is still good
if ($cached && $cached->content_hash === $contentHash) {
    return $cached->file_path;
}
 
// Content changed — take a new screenshot
$screenshot = $this->captureScreenshot($url, $params);
 
ScreenshotCache::updateOrCreate(
    ['url' => $url, 'params_hash' => md5(json_encode($params))],
    [
        'file_path' => $screenshot->path,
        'content_hash' => $contentHash,
        'expires_at' => now()->addDays(30),
    ]
);

To be honest, this approach has a weak spot. The HTTP request to fetch the page isn't a full render, but it's still a network call. And HTML hashing doesn't always reflect visual changes — the CSS might update, or a different banner could load via JavaScript. In those cases the HTML stays the same, but the screenshot looks different.

I use content hashing together with TTL: I check the content hash no more than once per hour, and I set the TTL to 7 days as a safety net. That gives me a reasonable balance between freshness and savings.
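That hybrid policy is easy to express as a small decision function. Here's a Node.js sketch — the cache entry shape ({ contentHash, capturedAt, lastChecked }) and both intervals are my assumptions, not anything screenshotrun prescribes:

```javascript
import crypto from 'crypto';

const HASH_CHECK_INTERVAL = 60 * 60 * 1000;  // re-check content at most hourly
const HARD_TTL = 7 * 24 * 60 * 60 * 1000;    // 7-day safety net

function md5(s) {
  return crypto.createHash('md5').update(s).digest('hex');
}

// Decide what to do with a cache entry given the current page HTML
// (pass null for html when the fetch was skipped).
function cacheDecision(entry, html, now = Date.now()) {
  if (!entry) return 'capture';                               // nothing cached yet
  if (now - entry.capturedAt > HARD_TTL) return 'capture';    // TTL safety net
  if (now - entry.lastChecked < HASH_CHECK_INTERVAL) return 'serve'; // checked recently
  if (html !== null && md5(html) !== entry.contentHash) return 'capture'; // content changed
  return 'serve';                                             // content unchanged
}
```

The order matters: the TTL check runs before the hash check, so even a page whose HTML never changes gets re-captured once a week.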

Strategy 3: Event-driven — refresh on webhook

If you control the site you're screenshotting, there's an even simpler option: refresh the screenshot on a specific event. You deploy a new version, fire a webhook, the screenshot gets re-captured.

// routes/api.php
Route::post('/webhooks/screenshot-refresh', function (Request $request) {
    $url = $request->input('url');
 
    // Invalidate the cache
    ScreenshotCache::where('url', $url)->delete();
 
    // Re-capture in the background
    CaptureScreenshotJob::dispatch($url);
 
    return response()->json(['status' => 'queued']);
});

On the CI/CD side, it's a single curl call after deploy:

# GitHub Actions
- name: Refresh screenshot
  run: |
    curl -X POST https://your-app.com/webhooks/screenshot-refresh \
      -H "Content-Type: application/json" \
      -d '{"url": "https://your-site.com"}'

This works great for OG images of your own site or for situations where you know exactly when the content was updated. For directories full of third-party sites — not so much, since you don't control their deploy schedule.

Where to store cached screenshots

The local filesystem is fine to start with, but once you need to scale, object storage is the better choice. I went with Hetzner Object Storage — it's S3-compatible, costs almost nothing, and the latency from Helsinki works for my setup.

// config/filesystems.php
'screenshot_cache' => [
    'driver' => 's3',
    'key' => env('HETZNER_S3_KEY'),
    'secret' => env('HETZNER_S3_SECRET'),
    'region' => 'eu-central',
    'bucket' => env('HETZNER_S3_BUCKET'),
    'endpoint' => env('HETZNER_S3_ENDPOINT'),
    'use_path_style_endpoint' => true,
],

If you're serving screenshots directly to users (say, as OG images), put a CDN in front of your storage. Cloudflare caches static assets on edge servers for free — and that way even requests to your object storage drop to near zero.

The chain looks like this: user → CDN (Cloudflare) → Object Storage (Hetzner) → Screenshot API (screenshotrun). Each layer only fires when the one before it misses.
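The fall-through behavior of that chain can be sketched as a generic lookup loop. The layer functions below are placeholders for whatever clients you'd actually use for Cloudflare, your bucket, and the screenshot API:

```javascript
// Try each cache layer in order; a layer returns null on a miss.
// Only the layers before the first hit ever run.
async function resolveScreenshot(url, layers) {
  for (const layer of layers) {
    const hit = await layer(url);
    if (hit !== null) return hit;
  }
  throw new Error(`no layer produced a screenshot for ${url}`);
}

// Example wiring: CDN first, then object storage, then a fresh capture.
// These stubs stand in for real clients.
const layers = [
  async (url) => null,            // cdnLookup: miss
  async (url) => null,            // storageLookup: miss
  async (url) => 'fresh-bytes',   // captureViaApi: always produces a result
];
```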

Cleaning up expired cache

A cache without cleanup turns into a pile of junk. I run a simple artisan command on a cron schedule:

// app/Console/Commands/CleanExpiredScreenshots.php
class CleanExpiredScreenshots extends Command
{
    protected $signature = 'screenshots:clean';
 
    public function handle()
    {
        $expired = ScreenshotCache::where('expires_at', '<', now())->get();
 
        foreach ($expired as $cache) {
            Storage::disk('screenshot_cache')->delete($cache->file_path);
            $cache->delete();
        }
 
        $this->info("Cleaned {$expired->count()} expired screenshots.");
    }
}

# crontab
0 * * * * cd /var/www/app && php artisan screenshots:clean

Once per hour is plenty. If you end up with millions of records, add chunk() and cap the number of deletions per run so you don't hammer the database.
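If your cache is file-based, like the Node.js version later in this post, the equivalent cleanup with a per-run cap fits in a few lines. The cap value here is an arbitrary assumption to tune:

```javascript
import fs from 'fs';
import path from 'path';

// Batched cleanup for a file-based cache: delete files older than the TTL,
// but cap how many we remove per run so one sweep can't hog the disk.
function cleanExpired(cacheDir, ttlMs, maxDeletes = 1000) {
  const now = Date.now();
  let deleted = 0;
  for (const name of fs.readdirSync(cacheDir)) {
    if (deleted >= maxDeletes) break;          // cap per run
    const file = path.join(cacheDir, name);
    if (now - fs.statSync(file).mtimeMs > ttlMs) {
      fs.unlinkSync(file);
      deleted++;
    }
  }
  return deleted;                              // how many files were removed
}
```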

Node.js version: same idea, file-based cache

If you're not on Laravel, here's a minimal cache implementation in Node.js using the filesystem:

import fs from 'fs/promises';
import crypto from 'crypto';
import path from 'path';
 
const CACHE_DIR = './screenshot-cache';
const CACHE_TTL = 24 * 60 * 60 * 1000; // 24 hours in ms
 
async function getScreenshot(url, params = {}) {
  const cacheKey = crypto
    .createHash('md5')
    .update(url + JSON.stringify(params))
    .digest('hex');
  const cachePath = path.join(CACHE_DIR, `${cacheKey}.png`);
 
  // Check cache
  try {
    const stat = await fs.stat(cachePath);
    if (Date.now() - stat.mtimeMs < CACHE_TTL) {
      return cachePath; // cache is fresh
    }
  } catch {
    // file doesn't exist — move on
  }
 
  // Request the screenshot
  const searchParams = new URLSearchParams({
    url,
    format: 'png',
    response_type: 'image',
    ...params,
  });
 
  const response = await fetch(
    `https://screenshotrun.com/api/v1/screenshots/capture?${searchParams}`,
    { headers: { Authorization: `Bearer ${process.env.SCREENSHOTRUN_KEY}` } }
  );
 
  // Save to cache
  await fs.mkdir(CACHE_DIR, { recursive: true });
  const buffer = Buffer.from(await response.arrayBuffer());
  await fs.writeFile(cachePath, buffer);
 
  return cachePath;
}

Same principle: hash the URL and parameters, check file modification time, create the directory lazily. For production you'd want to swap local files for something like S3.

How to pick the right TTL

There's no universal answer, but here's what I've found works in practice.

5-15 minutes makes sense for dashboards and pages with live data. Stock tickers, real-time stats, live scores. The cache here is just to prevent the same screenshot from being rendered 50 times per minute.

1-6 hours fits news sites, feeds, and blogs. Content updates a few times a day, but minute-level freshness doesn't matter for a screenshot.

24 hours is the sweet spot for most use cases. Link directories, OG images, in-app previews. Sites don't change as often as you'd think.

7 days works for stable pages. Documentation, landing pages, corporate sites. If you know the content updates once a week, there's no point screenshotting more often.

In screenshotrun, the default cache duration is 7 days. I landed on that number after looking at the logs — the average site on the internet changes its visually meaningful content roughly once a week.
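If you capture many kinds of pages, it can help to encode these tiers as data instead of scattering magic numbers through the code. A hypothetical lookup — the category names are mine:

```javascript
// TTL tiers in seconds, matching the guidance above.
const TTL_SECONDS = {
  'live-data': 10 * 60,           // dashboards, tickers: minutes
  'news':      3 * 60 * 60,       // feeds, blogs: a few hours
  'default':   24 * 60 * 60,      // directories, OG images: 24 hours
  'stable':    7 * 24 * 60 * 60,  // docs, landing pages: 7 days
};

function ttlFor(category) {
  return TTL_SECONDS[category] ?? TTL_SECONDS.default;
}
```

Unknown categories fall back to the 24-hour default, which matches the "sweet spot for most use cases" above.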

A mistake I made: caching errors

One thing that tripped me up early on — I accidentally cached error responses. A site was down, the API returned an empty response, and I happily stored that in the cache for 24 hours.

Always check the response status before caching:

$response = Http::withToken($apiKey)
    ->get('https://screenshotrun.com/api/v1/screenshots/capture', $params);
 
// Don't cache errors
if (!$response->successful()) {
    Log::warning("Screenshot failed for {$url}: {$response->status()}");
    return null;
}
 
// Make sure the file isn't suspiciously small
if (strlen($response->body()) < 1000) {
    Log::warning("Screenshot suspiciously small for {$url}");
    return null;
}
 
// Now it's safe to cache
Storage::disk('screenshot_cache')->put($path, $response->body());

The size check is a quick way to catch broken screenshots. A normal PNG screenshot of a website weighs at least a few tens of kilobytes. If the file is under a kilobyte, something went wrong.
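The same guardrails translate to Node.js, assuming you have the status code, the Content-Type header, and the body as a Buffer. The 1000-byte floor mirrors the heuristic above — tune it to your typical screenshot sizes:

```javascript
// Decide whether a screenshot API response is safe to cache.
function isCacheable(status, contentType, body) {
  if (status < 200 || status >= 300) return false;              // don't cache errors
  if (!contentType || !contentType.startsWith('image/')) return false; // e.g. a JSON error body
  if (body.length < 1000) return false;                         // suspiciously small
  return true;
}
```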

Wrapping up

Caching screenshots isn't complicated, but without it you keep paying for the same work over and over. The short version: turn on API-side caching with cache_ttl in screenshotrun (zero code required), add your own cache layer with S3 or local storage for full control, pick a TTL that matches how often your target sites actually change, and don't forget to clean up expired entries and validate responses before storing them.

If you don't feel like building your own caching system, screenshotrun has it built in — just pass cache_ttl and the API handles the rest. The free tier is enough to try it out.

Good luck with your screenshots — and may your API bill stay as small as your TTL for live dashboards.
