April 13, 2026

How to take website screenshots with Ruby — Selenium, Ferrum, and API

Ruby doesn't ship with a browser rendering engine, so taking website screenshots requires an external tool. This article covers three approaches — Selenium WebDriver with headless Chrome, Ferrum via DevTools Protocol, and the Screenshotrun API — with working code and a production comparison.

Ruby doesn't ship with a built-in browser rendering engine, so taking a website screenshot in a single line of code isn't going to happen. In practice you have three working options: spin up headless Chrome through Selenium WebDriver, use the lighter Ferrum gem (it talks to Chrome directly via the DevTools Protocol), or skip browser infrastructure entirely and call a screenshot API over HTTP. I've put together working code for all three so you can pick the one that fits your project.

Option 1: Selenium WebDriver with headless Chrome

Selenium is the most widely known browser automation tool in the Ruby ecosystem. It controls a real Chrome instance through ChromeDriver, and the rendering comes out accurate. The downside is that you need to install ChromeDriver and keep its version in sync with the Chrome you have installed.

How to install Selenium dependencies

gem install selenium-webdriver webdrivers

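The webdrivers gem downloads the right ChromeDriver version and matches it to your installed Chrome automatically, but Chrome itself still has to be on the machine. Note that selenium-webdriver 4.11 and later bundles Selenium Manager, which resolves the driver on its own; on those versions the webdrivers gem (and the require 'webdrivers' line) can be dropped.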

Basic screenshot with Selenium WebDriver

require 'selenium-webdriver'
require 'webdrivers'

options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--window-size=1280,800')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

driver = Selenium::WebDriver.for :chrome, options: options
driver.navigate.to 'https://example.com'

sleep 2  # give JavaScript time to finish rendering

driver.save_screenshot('screenshot.png')
driver.quit

puts 'Saved: screenshot.png'

That --disable-dev-shm-usage flag matters in containers like Docker, where shared memory is limited and Chrome can crash without it. And sleep 2 is a rough wait. For pages with heavy JavaScript you may need more time, or a smarter waiting strategy altogether.
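
One possible "smarter waiting strategy" is to poll the page until document.readyState reports complete instead of sleeping for a fixed time. This is a minimal sketch, not a Selenium API: wait_for_page_ready is a hypothetical helper name, and it only assumes the driver responds to execute_script. Selenium also ships its own Selenium::WebDriver::Wait class if you prefer the built-in route.

```ruby
# Hypothetical helper: polls document.readyState instead of using a fixed sleep.
# Works with any object that responds to #execute_script (e.g. a Selenium driver).
def wait_for_page_ready(driver, timeout: 10, interval: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if driver.execute_script('return document.readyState') == 'complete'
    raise "Page not ready after #{timeout}s" if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Usage with the driver from the example above:
#   driver.navigate.to 'https://example.com'
#   wait_for_page_ready(driver)
#   driver.save_screenshot('screenshot.png')
```

Keep in mind that readyState only covers the initial load. For content that JavaScript renders later, waiting for a specific element to appear is usually the safer check.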

Full-page screenshot in Selenium (with a workaround)

By default Selenium captures only the visible viewport. There's no built-in option for the full page, so the standard workaround is to resize the browser window to the full document height before taking the shot:

require 'selenium-webdriver'
require 'webdrivers'

options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')

driver = Selenium::WebDriver.for :chrome, options: options
driver.navigate.to 'https://example.com'
sleep 2

total_height = driver.execute_script('return document.body.scrollHeight')
driver.manage.window.resize_to(1280, total_height)

sleep 1  # wait for reflow
driver.save_screenshot('fullpage.png')
driver.quit

This works fine on most pages, but it can break on sites with sticky headers, fixed-position elements, or content that loads dynamically as you scroll.
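
One source of breakage you can reduce is the height measurement itself: document.body.scrollHeight can under-report on some layouts. The sketch below takes the largest of several common measurements and applies a cap so an endlessly growing page doesn't produce an enormous window. resize_to_full_page, FULL_HEIGHT_JS, and the 20,000 px cap are my own illustrative choices, not anything Selenium provides, and this does nothing for sticky headers or lazy-loaded content.

```ruby
# JavaScript that returns the largest of the common document height measurements.
FULL_HEIGHT_JS = <<~JS
  return Math.max(
    document.body.scrollHeight,
    document.body.offsetHeight,
    document.documentElement.scrollHeight,
    document.documentElement.offsetHeight,
    document.documentElement.clientHeight
  );
JS

# Hypothetical helper: resizes the window to the full document height.
# `driver` is assumed to be a Selenium driver (anything with the same methods works).
def resize_to_full_page(driver, width: 1280, max_height: 20_000)
  height = driver.execute_script(FULL_HEIGHT_JS).to_i
  height = [height, max_height].min # arbitrary safety cap for endless pages
  driver.manage.window.resize_to(width, height)
  height
end

# Usage:
#   resize_to_full_page(driver)
#   driver.save_screenshot('fullpage.png')
```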

Option 2: Ferrum, headless Chrome without the extra dependencies

Ferrum is a pure Ruby gem that talks to Chrome directly through the Chrome DevTools Protocol. No Selenium, no ChromeDriver binary, no Java dependency. It's simpler to set up, and when you need it, it gives you more precise control over the browser.

Installing Ferrum

gem install ferrum

Chrome needs to be installed on the machine. Ferrum finds it automatically on macOS, Linux, and Windows without any extra configuration.

Basic screenshot with Ferrum

require 'ferrum'

browser = Ferrum::Browser.new(
  headless: true,
  window_size: [1280, 800]
)

browser.go_to('https://example.com')
browser.network.wait_for_idle

browser.screenshot(path: 'screenshot.png')
browser.quit

puts 'Saved: screenshot.png'

The network.wait_for_idle method waits until the network goes quiet, which is much more reliable than an arbitrary sleep, especially on JavaScript-heavy pages. It tracks pending requests through the DevTools Network domain and holds until they all settle.

Full-page screenshot in Ferrum (one line does it)

require 'ferrum'

browser = Ferrum::Browser.new(headless: true, window_size: [1280, 800])
browser.go_to('https://example.com')
browser.network.wait_for_idle

browser.screenshot(path: 'fullpage.png', full: true)
browser.quit

The full: true flag is built into Ferrum. It captures the entire scrollable page automatically, without the window-resizing tricks Selenium needs. This is one of the clear advantages Ferrum has for screenshot-oriented work.

Getting a screenshot as a Base64 string

If you need the image in memory rather than saved to disk (say, to upload it straight to cloud storage), Ferrum can return the result as a Base64 string:

require 'ferrum'
require 'base64'

browser = Ferrum::Browser.new(headless: true, window_size: [1280, 800])
browser.go_to('https://example.com')
browser.network.wait_for_idle

base64_image = browser.screenshot(encoding: :base64)
image_data   = Base64.decode64(base64_image)

# upload_to_storage(image_data)

browser.quit
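
Before handing decoded bytes to storage, it can be worth confirming they actually look like a PNG. Here is a small sketch using only the standard library; decode_png! is a hypothetical helper name, and the check covers just the 8-byte PNG signature, not the full file structure.

```ruby
require 'base64'

# The fixed 8-byte signature every PNG file starts with.
PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b.freeze

# Hypothetical helper: decodes a Base64 screenshot and checks the PNG signature
# before the bytes are passed on to storage.
def decode_png!(base64_image)
  bytes = Base64.decode64(base64_image)
  raise ArgumentError, 'decoded data is not a PNG' unless bytes.start_with?(PNG_SIGNATURE)

  bytes
end

# Usage with the Ferrum example above:
#   image_data = decode_png!(browser.screenshot(encoding: :base64))
#   upload_to_storage(image_data)
```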

Option 3: Screenshot API with no browser on your server

Both Selenium and Ferrum need Chrome running somewhere: locally, on your server, or in a container. In production that means extra memory (Chrome easily eats 300–500 MB per instance), keeping browser and driver versions in sync, and dealing with rendering problems that only show up under real conditions. I built Screenshotrun specifically to remove all of that overhead. I wrote up the full reasoning in a separate post on when to build your own screenshot stack vs when to use an API.

Screenshotrun renders pages with Playwright on its own servers and returns the result over HTTP. Your Ruby code just makes an API call. No browser process, no infrastructure to maintain.

Using net/http (no extra dependencies)

Here's a complete working example using only Ruby's standard library:

require 'net/http'
require 'json'
require 'uri'

API_KEY    = 'YOUR_API_KEY'
TARGET_URL = 'https://example.com'
BASE_URL   = 'https://api.screenshotrun.com/v1'

def request_headers
  {
    'Authorization' => "Bearer #{API_KEY}",
    'Content-Type'  => 'application/json'
  }
end

def http_for(url)
  uri  = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  [http, uri]
end

# Step 1: Create the screenshot job
def create_screenshot(target_url)
  http, uri = http_for("#{BASE_URL}/screenshots")
  req = Net::HTTP::Post.new(uri.path, request_headers)
  req.body = JSON.generate(
    url:             target_url,
    format:          'png',
    viewport_width:  1280,
    viewport_height: 800
  )
  JSON.parse(http.request(req).body)
end

# Step 2: Poll until the status is completed
def wait_for_screenshot(id, max_attempts: 20, interval: 2)
  max_attempts.times do
    http, uri = http_for("#{BASE_URL}/screenshots/#{id}")
    req  = Net::HTTP::Get.new(uri.path, request_headers)
    data = JSON.parse(http.request(req).body)['data']

    return data if data['status'] == 'completed'
    raise "Screenshot failed: #{data.dig('error', 'message')}" if data['status'] == 'failed'

    sleep interval
  end
  raise 'Timed out waiting for screenshot'
end

# Step 3: Download the image
def download_image(image_url, output_path)
  http, uri = http_for(image_url)
  req = Net::HTTP::Get.new(uri.request_uri, request_headers)
  File.binwrite(output_path, http.request(req).body)
end

# Run it
result    = create_screenshot(TARGET_URL)
id        = result['data']['id']
puts "Queued: #{id}"

completed = wait_for_screenshot(id)
download_image(completed['links']['image'], 'screenshot.png')
puts "Saved: screenshot.png (#{completed['width']}x#{completed['height']})"
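
The example above parses every response body as JSON without looking at the HTTP status, so a 401 or 500 would surface as a confusing parse or nil error. A minimal sketch of a stricter parser follows; parse_api_response is a hypothetical helper, and treating any non-2xx status as a failure is my assumption rather than documented API behavior.

```ruby
require 'json'

# Hypothetical helper: turns a raw HTTP status code and body into parsed JSON,
# raising a readable error for non-2xx responses or malformed JSON.
def parse_api_response(code, body)
  status = code.to_i
  raise "API returned HTTP #{status}: #{body.to_s[0, 200]}" unless (200..299).cover?(status)

  JSON.parse(body)
rescue JSON::ParserError => e
  raise "API returned invalid JSON: #{e.message}"
end

# Usage inside create_screenshot from the example above:
#   res = http.request(req)
#   parse_api_response(res.code, res.body)
```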

Same example with HTTParty (more compact syntax)

If httparty is already in your project, the same flow is shorter:

gem install httparty

require 'httparty'

API_KEY  = 'YOUR_API_KEY'
BASE_URL = 'https://api.screenshotrun.com/v1'
HEADERS  = {
  'Authorization' => "Bearer #{API_KEY}",
  'Content-Type'  => 'application/json'
}

# Create screenshot
res = HTTParty.post(
  "#{BASE_URL}/screenshots",
  headers: HEADERS,
  body: { url: 'https://example.com', format: 'png', viewport_width: 1280, viewport_height: 800 }.to_json
)

id = res.parsed_response['data']['id']
puts "Queued: #{id}"

# Poll until ready
completed = nil
20.times do
  data = HTTParty.get("#{BASE_URL}/screenshots/#{id}", headers: HEADERS).parsed_response['data']
  break completed = data if data['status'] == 'completed'
  raise "Failed: #{data.dig('error', 'message')}" if data['status'] == 'failed'
  sleep 2
end

raise 'Timed out' unless completed

# Download
image = HTTParty.get(completed['links']['image'], headers: HEADERS)
File.binwrite('screenshot.png', image.body)
puts "Saved: screenshot.png"

Additional API parameters

body: {
  url:             'https://example.com',
  format:          'png',       # png, jpeg, webp, pdf
  viewport_width:  1280,
  viewport_height: 800,
  full_page:       true,        # capture the full scrollable page
  dark_mode:       true,        # emulate prefers-color-scheme: dark
  block_ads:       true,        # block ads and trackers
  delay:           2000,        # wait N ms after page load (milliseconds)
  selector:        '#main',     # screenshot a specific CSS selector
}.to_json
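
If you assemble these options dynamically, a small builder helps keep typos out of the request body. This is only a sketch: screenshot_payload and ALLOWED_OPTIONS are hypothetical names, and the option list is copied from the parameters shown above rather than from an authoritative API schema.

```ruby
require 'json'

# Option names taken from the parameter list above; anything else is rejected.
ALLOWED_OPTIONS = %i[
  format viewport_width viewport_height full_page dark_mode block_ads delay selector
].freeze

# Hypothetical helper: builds the JSON request body, dropping nil values
# and failing fast on misspelled option names.
def screenshot_payload(url, **options)
  unknown = options.keys - ALLOWED_OPTIONS
  raise ArgumentError, "Unknown option(s): #{unknown.join(', ')}" unless unknown.empty?

  { url: url }.merge(options.compact).to_json
end

# Usage:
#   body = screenshot_payload('https://example.com', full_page: true, delay: 2000)
```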

A few of these parameters are covered in more depth in their own articles: element screenshots via selector, dark mode capture via dark_mode, and PDF output via format: "pdf".

For production workloads, consider using webhooks instead of polling. Webhooks push the completed result to your server, which is more efficient than repeatedly checking the status.

Comparing Selenium, Ferrum, and a screenshot API

Feature	Selenium	Ferrum	Screenshotrun API
Setup complexity	Medium (Chrome + ChromeDriver)	Low (Chrome only)	Minimal (API key only)
Ruby dependencies	selenium-webdriver, webdrivers	ferrum	net/http (stdlib) or httparty
Full-page screenshots	Manual workaround	Built-in (`full: true`)	Built-in (`full_page: true`)
Server memory usage	High (Chrome process)	High (Chrome process)	None (rendering is server-side)
Works in serverless / PaaS	Needs custom buildpack	Needs custom buildpack	Yes, out of the box
PDF export	Complex CDP setup	Built-in (`browser.pdf`)	Built-in (`format: "pdf"`)
Dark mode / Retina	Manual Chrome flags	Partial support	Native parameters
JS-heavy SPA rendering	Good (with manual waits)	Good (network idle wait)	Excellent (Playwright-powered)
Ad / cookie banner blocking	Requires extension setup	Manual JS injection	Native (`block_ads: true`)
Free tier	Unlimited (self-hosted)	Unlimited (self-hosted)	200 screenshots/month, no card

Running Chrome in Docker and on Heroku: what to expect

If you're deploying Selenium or Ferrum in a containerized environment, getting Chrome to run takes extra steps. In Docker you need to install Chrome and all its system dependencies right in your image:

FROM ruby:3.3

RUN apt-get update && apt-get install -y \
  chromium \
  chromium-driver \
  fonts-liberation \
  libasound2 \
  libatk-bridge2.0-0 \
  libgtk-3-0 \
  libnss3 \
  libxss1 \
  && rm -rf /var/lib/apt/lists/*
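
On the Ruby side, the browser usually needs to be told where the binary lives and to run without the sandbox inside the container. Below is a rough sketch of Ferrum options for an image like the one above. container_browser_options is a hypothetical helper, and /usr/bin/chromium is an assumed install path that you should verify in your own image; browser_path and browser_options are the Ferrum settings for the binary location and extra Chrome flags.

```ruby
# Hypothetical helper: builds Ferrum options suited to a container.
# The binary path is an assumption; adjust it to wherever your image installs Chromium.
def container_browser_options(env = ENV)
  {
    headless: true,
    window_size: [1280, 800],
    browser_path: env.fetch('BROWSER_PATH', '/usr/bin/chromium'),
    # Flags without a value are passed as nil.
    browser_options: { 'no-sandbox' => nil, 'disable-dev-shm-usage' => nil }
  }
end

# Usage:
#   browser = Ferrum::Browser.new(**container_browser_options)
```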

On Heroku you'll need separate buildpacks for Chrome and ChromeDriver. And on serverless platforms things get even messier because of binary size limits and read-only filesystems. With the API none of that matters. Your Ruby code makes HTTP requests, and Chrome lives on someone else's server.

Concurrency: why parallel screenshots get expensive fast

Every Selenium or Ferrum instance is a separate Chrome process. Two parallel screenshots mean two processes at 300–500 MB of RAM each. Ten of them and you're already burning several gigabytes. If your app needs screenshots for many users at once, managing a browser pool turns into its own engineering problem. With the API, concurrent requests are just concurrent HTTP calls, and your server's memory stays untouched.
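
If you do self-host, the simplest guard is a hard cap on how many captures run at once. With the 300 to 500 MB figure above, a cap of 4 workers means budgeting roughly 1.2 to 2 GB for Chrome alone. The sketch below is a bare-bones worker pool built on Ruby's Queue; capture_all is a hypothetical helper, and the block is where your actual Ferrum or Selenium capture would go.

```ruby
# Hypothetical helper: processes URLs with at most `concurrency` workers at a time.
# The block does the actual capture and its return value is stored per URL.
def capture_all(urls, concurrency: 2)
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # pop returns nil once the queue is drained

  results = {}
  lock = Mutex.new

  workers = Array.new(concurrency) do
    Thread.new do
      while (url = queue.pop)
        outcome = begin
          yield(url)
        rescue StandardError => e
          e # keep the error as the result instead of killing the worker
        end
        lock.synchronize { results[url] = outcome }
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage:
#   capture_all(urls, concurrency: 4) { |url| take_screenshot(url) }
```

Here take_screenshot stands in for whatever capture code you use. Failed URLs come back as exception objects, so one bad page doesn't stop the rest of the batch.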

Error handling: the difference between self-hosted and API

Pages that never finish loading, JavaScript errors, and DNS failures all need explicit handling when you run your own browser. With the API you get a clear status: "failed" response with an error code and message, without learning exception hierarchies or maintaining piles of rescue blocks.
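
For the self-hosted route, that explicit handling could be wrapped in one place. The following is a sketch under my own assumptions: capture_with_retries is a hypothetical helper, the retry count and time limit are arbitrary defaults, and the returned hash only loosely imitates the completed/failed shape of the API response.

```ruby
require 'timeout'

# Hypothetical wrapper: runs a capture block with a hard time limit and a few retries,
# then returns a status hash instead of letting exceptions escape.
def capture_with_retries(attempts: 3, time_limit: 30, pause: 1)
  last_error = nil
  attempts.times do |i|
    begin
      result = Timeout.timeout(time_limit) { yield }
      return { status: 'completed', result: result, attempts: i + 1 }
    rescue StandardError => e # Timeout::Error is a StandardError subclass
      last_error = e
      sleep pause if i < attempts - 1
    end
  end
  { status: 'failed', error: "#{last_error.class}: #{last_error.message}", attempts: attempts }
end

# Usage:
#   outcome = capture_with_retries { take_screenshot('https://example.com') }
#   warn outcome[:error] if outcome[:status] == 'failed'
```

Timeout.timeout is a blunt tool, so where your browser library offers its own timeout settings, prefer those and keep this wrapper as a last line of defense.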

Which approach to choose for your project

Selenium makes sense if you're already using it for integration tests and want to reuse that same infrastructure for occasional screenshots. The ecosystem is mature, most Ruby developers are familiar with it, and for one-off tasks it usually does the job.

If you're building something from scratch and screenshots are the main focus, I'd go with Ferrum. It's lighter, doesn't need a separate driver binary, and native full-page capture plus network idle waiting work out of the box. For screenshot-oriented projects it's more convenient.

And when you don't want to think about Chrome on your server at all, especially in cloud or serverless environments, a screenshot API handles it over plain HTTP calls. Dark mode, ad blocking, element screenshots, and PDF export are all built in. The same approach works in other languages too: I've written parallel guides for Node.js, PHP, Python, and Go. The flow is the same everywhere, only the HTTP client changes.
