Puppeteer

Puppeteer is a Node.js library by Google that controls Chrome and Chromium through the DevTools Protocol. It is popular for scraping JavaScript-rendered pages and automating Chrome tasks.

Last updated June 8, 2026

Definition

Puppeteer is an open-source browser automation library maintained by Google's Chrome team. It drives Chrome and Chromium via the Chrome DevTools Protocol and can also drive Firefox via WebDriver BiDi. It is one of the most widely used tools for scraping dynamic websites and automating browser workflows.

Language support

Puppeteer is primarily a Node.js (JavaScript/TypeScript) library. A community port called Pyppeteer exists for Python, but the JavaScript ecosystem is the most mature and actively maintained.

How it works and why it matters

Puppeteer launches a full headless browser that runs JavaScript and renders the DOM, so it can scrape content that plain HTTP requests cannot reach. It supports clicking, typing, navigation, screenshots, and PDF generation.
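The flow described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function name capturePage, the outBase parameter, and the output file names are assumptions made for this example, and the puppeteer module is passed in as an argument so the sketch has no load-time dependency.

```javascript
// Sketch: render a JavaScript-heavy page, then capture its title,
// a full-page screenshot, and a PDF. `capturePage` and `outBase` are
// hypothetical names chosen for this example.
async function capturePage(puppeteer, url, outBase) {
  const browser = await puppeteer.launch(); // headless by default
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so client-side rendering can finish.
    await page.goto(url, { waitUntil: 'networkidle2' });
    const title = await page.title();
    await page.screenshot({ path: `${outBase}.png`, fullPage: true });
    await page.pdf({ path: `${outBase}.pdf`, format: 'A4' });
    return title;
  } finally {
    // Always close the browser, even if navigation or capture fails.
    await browser.close();
  }
}
```

Assuming Puppeteer has been installed with npm install puppeteer, a call might look like capturePage(require('puppeteer'), 'https://example.com', 'example'), which would write example.png and example.pdf to the working directory.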

  • Proxy support: Pass --proxy-server=host:port in launch args; proxy authentication is handled with page.authenticate().
  • Stealth: The puppeteer-extra-plugin-stealth package helps reduce automation fingerprints and bot detection.

Routing Puppeteer through rotating residential proxies lets you scrape at scale while avoiding IP-based rate limiting and bans.
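As a rough sketch of the proxy setup described above: the helper names proxyLaunchOptions and fetchViaProxy, and the shape of the proxy object (server, username, password), are assumptions made for this example. Only the --proxy-server launch argument and page.authenticate() come from Puppeteer itself. Rotation is not shown; it would typically be handled by the proxy provider or by choosing a different server value per launch.

```javascript
// Build launch options that route all browser traffic through a proxy.
// `proxyLaunchOptions` is a hypothetical helper, not part of Puppeteer.
function proxyLaunchOptions(proxyServer) {
  return { args: [`--proxy-server=${proxyServer}`] };
}

// Fetch the rendered HTML of `url` through the given proxy.
// `proxy` is an assumed shape: { server, username?, password? }.
async function fetchViaProxy(puppeteer, url, proxy) {
  const browser = await puppeteer.launch(proxyLaunchOptions(proxy.server));
  try {
    const page = await browser.newPage();
    if (proxy.username) {
      // Credentials are supplied per page, before the first navigation.
      await page.authenticate({
        username: proxy.username,
        password: proxy.password,
      });
    }
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    return await page.content(); // serialized HTML of the loaded page
  } finally {
    await browser.close();
  }
}
```

The puppeteer module is passed in as a parameter here so the helpers can be exercised with a stub; in a real script it would come from require('puppeteer').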

Examples

1. Launching with a proxy: puppeteer.launch({ args: ['--proxy-server=http://proxy:8000'] })
2. Authenticating a proxy via page.authenticate({ username, password })
3. Using puppeteer-extra-plugin-stealth to evade bot detection while scraping
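The stealth example can be sketched as below. This assumes the puppeteer-extra wrapper package is installed alongside puppeteer-extra-plugin-stealth; the helper names enableStealth and readWebdriverFlag are hypothetical, and both modules are passed in as arguments so the sketch has no load-time dependency. It is an illustration of how the plugin is registered, not a guarantee that any particular detection system is bypassed.

```javascript
// Register the stealth plugin on a puppeteer-extra instance.
// `enableStealth` is a hypothetical helper; `createStealthPlugin` is the
// factory exported by puppeteer-extra-plugin-stealth.
function enableStealth(puppeteerExtra, createStealthPlugin) {
  puppeteerExtra.use(createStealthPlugin());
  return puppeteerExtra;
}

// Read the navigator.webdriver flag, one common automation signal,
// from a page opened by the given (optionally stealth-enabled) launcher.
async function readWebdriverFlag(puppeteerExtra, url) {
  const browser = await puppeteerExtra.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // This callback runs inside the browser page, not in Node.js.
    return await page.evaluate(() => navigator.webdriver);
  } finally {
    await browser.close();
  }
}
```

In a real script the arguments would typically be require('puppeteer-extra') and require('puppeteer-extra-plugin-stealth'). Comparing the flag with and without the plugin registered is one simple way to see what it changes.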

Common Use Cases

Scraping JavaScript-rendered content from dynamic sites
Generating PDFs and screenshots of web pages
Automating Chrome-based workflows and form submissions
Pre-rendering single-page apps for SEO
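To make the form-submission use case concrete, here is a minimal sketch of filling in and submitting a login form. The function name submitLoginForm and the selectors #username, #password, and button[type="submit"] are assumptions for illustration; a real page would need its own selectors.

```javascript
// Fill in and submit a login form on an already-open Puppeteer page.
// The selectors below are hypothetical and must match the target site.
async function submitLoginForm(page, { username, password }) {
  await page.type('#username', username);
  await page.type('#password', password);
  // Start waiting for the navigation before clicking, so the
  // navigation triggered by the click is not missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click('button[type="submit"]'),
  ]);
  return page.url(); // URL of the page reached after submission
}
```

The function expects a page that has already navigated to the form, for example one created with browser.newPage() followed by page.goto().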

Frequently Asked Questions

What languages does Puppeteer support?
Puppeteer is a Node.js library, so it is used with JavaScript or TypeScript. A community Python port called Pyppeteer exists but is less actively maintained.

How do I use a proxy with Puppeteer?
Pass --proxy-server=host:port in the launch args, then call page.authenticate({ username, password }) if the proxy requires credentials.

Does Puppeteer work with browsers other than Chrome?
Puppeteer is built primarily for Chrome and Chromium and also supports Firefox through WebDriver BiDi. For broad multi-browser coverage, Playwright or Selenium is a better choice.