Puppeteer

Puppeteer is a Node.js library by Google that controls Chrome and Chromium through the DevTools Protocol. It is popular for scraping JavaScript-rendered pages and automating Chrome tasks.

Last updated June 8, 2026

Definition

Puppeteer is an open-source browser automation library maintained by Google's Chrome team. It drives Chrome and Chromium via the Chrome DevTools Protocol and, since version 23, also supports Firefox through WebDriver BiDi. It is one of the most widely used tools for scraping dynamic sites and automating browser workflows.

Language support

Puppeteer is primarily a Node.js (JavaScript/TypeScript) library. A community port called Pyppeteer exists for Python, but the JavaScript ecosystem is the most mature and actively maintained.

How it works and why it matters

Puppeteer launches a real browser (headless by default) that executes JavaScript and renders the DOM, so it can reach content that plain HTTP requests never see. Its API covers clicking, typing, navigation, screenshots, and PDF generation.
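
As a rough illustration of that flow, here is a minimal sketch that renders a page and reads text from the live DOM. It assumes Puppeteer is installed (npm install puppeteer); the URL, the h1 selector, and the helper names readText and main are placeholders chosen for this example.

```javascript
// Sketch: render a page in Chrome and read text from the rendered DOM.
// The URL, selector, and function names are illustrative assumptions.

async function readText(browser, url, selector) {
  const page = await browser.newPage();
  try {
    // Wait for the network to go mostly idle so client-side rendering can finish.
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.waitForSelector(selector);
    // $eval runs the callback inside the page against the first matching element.
    return await page.$eval(selector, (el) => el.textContent.trim());
  } finally {
    await page.close();
  }
}

async function main() {
  // Loaded lazily so this file can be imported without starting a browser.
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.launch(); // headless by default
  try {
    console.log(await readText(browser, 'https://example.com', 'h1'));
  } finally {
    await browser.close();
  }
}

// Run with: main().catch(console.error);
```

Passing the browser into readText keeps the scraping step separate from the launch step, so one browser can be reused across many pages.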

Proxy support: Pass --proxy-server=host:port in launch args; proxy authentication is handled with page.authenticate().
Stealth: The puppeteer-extra-plugin-stealth package, a plugin for puppeteer-extra, masks common automation fingerprints to make bot detection less likely.

Routing Puppeteer through rotating residential proxies spreads requests across many IP addresses, which lets you scrape at scale with a lower risk of IP-based rate limiting and bans.
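
The proxy setup described above might be sketched as follows. The proxy hosts, port, and credentials are placeholders, and the helper names (proxyLaunchOptions, roundRobin, fetchHtmlViaProxy) are hypothetical; only --proxy-server, puppeteer.launch(), and page.authenticate() come from Puppeteer and Chrome themselves. Many rotating proxy services expose a single gateway that rotates IPs for you, in which case the round-robin helper is unnecessary.

```javascript
// Sketch: launch Chrome behind a proxy and rotate through a small pool.
// Hosts, ports, credentials, and helper names are illustrative assumptions.

function proxyLaunchOptions(proxyUrl) {
  // --proxy-server is a Chromium flag; credentials are supplied separately.
  return { args: [`--proxy-server=${proxyUrl}`] };
}

function roundRobin(items) {
  let i = 0;
  // Each call returns the next item, wrapping around at the end of the list.
  return () => items[i++ % items.length];
}

async function fetchHtmlViaProxy(puppeteer, proxyUrl, credentials, targetUrl) {
  // The flag applies to the whole browser, so this sketch launches one browser per proxy.
  const browser = await puppeteer.launch(proxyLaunchOptions(proxyUrl));
  try {
    const page = await browser.newPage();
    if (credentials) {
      // Set before navigating so the proxy's auth challenge can be answered.
      await page.authenticate(credentials);
    }
    await page.goto(targetUrl, { waitUntil: 'domcontentloaded' });
    return await page.content();
  } finally {
    await browser.close();
  }
}

// Usage (placeholder values):
// const { default: puppeteer } = await import('puppeteer');
// const nextProxy = roundRobin(['http://proxy-a:8000', 'http://proxy-b:8000']);
// const html = await fetchHtmlViaProxy(
//   puppeteer, nextProxy(), { username: 'user', password: 'pass' }, 'https://example.com');
```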

Examples

Launching with a proxy: puppeteer.launch({ args: ['--proxy-server=http://proxy:8000'] })

Authenticating a proxy via page.authenticate({ username, password })

Using puppeteer-extra-plugin-stealth to evade bot detection while scraping
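
For the stealth example, a minimal sketch could look like the following. It assumes puppeteer, puppeteer-extra, and puppeteer-extra-plugin-stealth are all installed; the function name launchStealthBrowser is hypothetical. The plugin lowers the odds of detection but does not guarantee a site will treat the browser as a regular user.

```javascript
// Sketch: register the stealth plugin with puppeteer-extra, then launch.
// Assumes: npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth
// The function name is an illustrative assumption.

async function launchStealthBrowser(launchOptions = {}) {
  // puppeteer-extra is a drop-in wrapper around Puppeteer that accepts plugins.
  const { default: puppeteer } = await import('puppeteer-extra');
  const { default: StealthPlugin } = await import('puppeteer-extra-plugin-stealth');
  // Register the plugin before launching; call this function once per process
  // so the plugin is not registered repeatedly.
  puppeteer.use(StealthPlugin());
  return puppeteer.launch(launchOptions);
}

// Usage:
// const browser = await launchStealthBrowser({ args: ['--proxy-server=http://proxy:8000'] });
// const page = await browser.newPage();
```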

Common Use Cases

Scraping JavaScript-rendered content from dynamic sites

Generating PDFs and screenshots of web pages

Automating Chrome-based workflows and form submissions

Pre-rendering single-page apps for SEO
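
Several of these use cases share the same few calls. The sketch below captures a screenshot, a PDF, and the rendered HTML (the basis of pre-rendering) from one page load. The function name capturePage and the output paths are placeholders; page.screenshot(), page.pdf(), and page.content() are Puppeteer page methods.

```javascript
// Sketch: one page load, three outputs (PNG, PDF, rendered HTML).
// The function name and file paths are illustrative assumptions.

async function capturePage(browser, url, basePath) {
  const page = await browser.newPage();
  try {
    await page.goto(url, { waitUntil: 'networkidle2' });
    // fullPage captures the whole scrollable page, not just the viewport.
    await page.screenshot({ path: `${basePath}.png`, fullPage: true });
    // printBackground keeps CSS backgrounds that print styles usually drop.
    await page.pdf({ path: `${basePath}.pdf`, format: 'A4', printBackground: true });
    // Serialized HTML after JavaScript has run; this is what a pre-renderer would serve.
    return await page.content();
  } finally {
    await page.close();
  }
}

// Usage:
// const { default: puppeteer } = await import('puppeteer');
// const browser = await puppeteer.launch();
// const html = await capturePage(browser, 'https://example.com', './out/example');
// await browser.close();
```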

Frequently Asked Questions

What languages does Puppeteer support?

Puppeteer is a Node.js library, so it is used with JavaScript or TypeScript. A community Python port called Pyppeteer exists but is less actively maintained.

How do I use a proxy with Puppeteer?

Pass --proxy-server=host:port in the launch args, then call page.authenticate({ username, password }) before navigating if the proxy requires credentials.

Which browsers does Puppeteer support?

Puppeteer is built primarily for Chrome and Chromium and, since version 23, also supports Firefox through WebDriver BiDi. For broader coverage, including Safari or WebKit, Playwright or Selenium is a better choice.

Keep Learning

Web Scraping

Web scraping is the automated extraction of data from websites — fetching pages programmatically and parsing their content into structured data.

Anti-Detect Browser

An anti-detect browser lets you run many isolated browser profiles, each with its own fingerprint, cookies and proxy, so sites see them as separate, genuine users.

Rate Limiting

Rate limiting restricts how many requests a client can make in a given time, and it is one of the most common defenses scrapers must work around.

Headless Browser

A headless browser is a real browser that runs without a visible interface, controlled by code — the workhorse for scraping JavaScript-heavy sites and automation.
