Puppeteer Proxy Integration
Configure Databay residential and datacenter proxies in Puppeteer using --proxy-server and page.authenticate, then add rotation, error handling and timeout tuning. This guide covers working code for every step, the proxy-auth quirks specific to Chromium, and the failure modes you will actually hit in production: 407 challenges, tunnel failures, TLS surprises and slow navigations.

What Is Puppeteer
Puppeteer is a Node.js library maintained by the Chrome team that drives Chrome or Chromium over the DevTools Protocol. It is the default choice for JavaScript developers who need a real browser for scraping, testing, PDF generation or screenshot pipelines: pages execute their full JavaScript, so content that never appears in raw HTML responses is reachable. The trade-off is that a real browser announces itself loudly: every request leaves from your machine's IP, and a burst of headless traffic from one address is one of the easiest patterns for a target site to spot. Routing Puppeteer through rotating proxies separates your scraping identity from your infrastructure, spreads requests across many exit IPs, and lets you pick the country a page is rendered from.
Connecting Puppeteer to Databay Proxies
Databay exposes a single gateway endpoint, gw.databay.co:8888, and you select the proxy pool, country and session behaviour through flags appended to your username. That means the Puppeteer side is always the same two steps: pass the gateway to Chromium with --proxy-server, then supply credentials with page.authenticate(). The split matters because Chromium does not accept credentials embedded in the --proxy-server URL. If you write --proxy-server=http://user:pass@host:port, Chromium silently strips the credentials and you get a 407 on the first navigation. Authentication must happen through the DevTools protocol, which is exactly what page.authenticate() does.
Residential Proxy Setup
Residential proxies exit through real household connections, which makes them the right pool for targets that aggressively filter datacenter IP ranges. The zone is selected with -zone-residential in the username:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--proxy-server=http://gw.databay.co:8888']
  });
  const page = await browser.newPage();
  await page.authenticate({
    username: 'USER-zone-residential',
    password: 'PASS'
  });
  await page.goto('https://httpbin.org/ip', { waitUntil: 'domcontentloaded' });
  console.log(await page.evaluate(() => document.body.innerText));
  await browser.close();
})();
Two details are load-bearing here. First, page.authenticate() must be called before the first page.goto(); it registers a handler for the proxy's 407 challenge, so calling it after navigation has already failed does nothing. Second, the --proxy-server argument applies to the whole browser process, not just one page; every page you open in this browser uses the gateway.
Datacenter Proxy Setup
Datacenter proxies are faster and cheaper per gigabyte than residential IPs, and they are the sensible default for targets that do not score IP reputation: internal tools, APIs without aggressive bot filtering, and high-volume jobs against permissive sites. Only the username changes:
// Same launch as in the residential setup; only the zone in the username differs.
const page = await browser.newPage();
await page.authenticate({
  username: 'USER-zone-datacenter',
  password: 'PASS'
});
A practical pattern is to start each new target on datacenter IPs and move that target to the residential zone only when you observe blocks or CAPTCHAs. Because the zone lives in the username, switching pools is a one-line change with no other code differences.
Country Targeting and Sticky Sessions
Geo-targeting and session control are also username flags. Appending -countryCode-us pins the exit IP to the United States; appending -sessionId-abc123 keeps the same exit IP across requests for as long as the session stays alive, which is what you want when a flow spans multiple pages behind a login or a cart:
// US exit IP, same IP for the whole browsing session
await page.authenticate({
  username: 'USER-zone-residential-countryCode-us-sessionId-abc123',
  password: 'PASS'
});
The session id is an arbitrary string you choose. Reusing the same id pins the IP; generating a new id releases it. This is the foundation of every rotation pattern in the next section.
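Because every routing choice lives in the username, it can help to compose that string in one place rather than by hand. The following is a minimal sketch, not a Databay API: `buildProxyUsername` is a hypothetical helper name, the flag order simply mirrors the example above, and lowercasing the country code is an assumption based on the `-countryCode-us` form shown in this guide.

```javascript
// Hypothetical helper: composes a gateway username from the flags shown above.
// The flag order (zone, countryCode, sessionId) mirrors this guide's example;
// whether the gateway accepts other orders is not assumed here.
function buildProxyUsername({ user, zone, countryCode, sessionId }) {
  if (!user || !zone) {
    throw new Error('user and zone are required');
  }
  const parts = [user, `zone-${zone}`];
  // Assumption: country codes are written in lowercase, as in "-countryCode-us".
  if (countryCode) parts.push(`countryCode-${countryCode.toLowerCase()}`);
  if (sessionId) parts.push(`sessionId-${sessionId}`);
  return parts.join('-');
}

// Illustrative use with placeholder credentials.
const credentials = {
  username: buildProxyUsername({
    user: 'USER',
    zone: 'residential',
    countryCode: 'US',
    sessionId: 'abc123',
  }),
  password: 'PASS',
};
console.log(credentials.username);
// USER-zone-residential-countryCode-us-sessionId-abc123
```

The resulting object has the shape page.authenticate() expects, so it can be passed to it directly.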
Proxy Rotation Patterns
A rotating gateway assigns exit IPs per connection, but a browser is not a polite single-connection client: one page load can open dozens of parallel connections for HTML, scripts, images and XHR. If those connections each rotate, a single page render arrives at the target from several IPs at once, which looks stranger than no proxy at all. The rule of thumb with any real browser is therefore: one sticky session per logical browsing identity, and rotate between identities, not within them.
The simplest reliable pattern is one browser per session. Launch, authenticate with a fresh sessionId, do the work, close:
const puppeteer = require('puppeteer');
// Build credentials with a new random session id, which requests a new exit IP.
function freshSession() {
  const id = Math.random().toString(36).slice(2, 10);
  return {
    username: `USER-zone-residential-sessionId-${id}`,
    password: 'PASS'
  };
}
// One browser per URL: each launch authenticates with its own session.
async function scrapeAll(urls) {
  for (const url of urls) {
    const browser = await puppeteer.launch({
      args: ['--proxy-server=http://gw.databay.co:8888']
    });
    try {
      const page = await browser.newPage();
      await page.authenticate(freshSession());
      await page.goto(url, { waitUntil: 'domcontentloaded' });
      // ... extract ...
    } finally {
      // Close the browser even if navigation throws.
      await browser.close();
    }
  }
}
Launching a browser per task is heavy, so recent Puppeteer versions offer a lighter alternative: per-context proxies. browser.createBrowserContext() accepts a proxyServer option, giving each context its own proxy and its own cookies, cache and storage:
const browser = await puppeteer.launch();
const context = await browser.createBrowserContext({
  proxyServer: 'http://gw.databay.co:8888'
});
const page = await context.newPage();
await page.authenticate(freshSession());
await page.goto('https://httpbin.org/ip');
// ...
await context.close();
Contexts are cheap to create and destroy, so a pool of a few contexts, each pinned to its own sticky session, gives you parallel identities inside one browser process. One honest caveat: Chromium caches proxy credentials at the network layer, so swapping page.authenticate() values between pages of the same context does not always retrigger the 407 challenge on reused tunnels. If you see one session's IP bleed into another, isolate the sessions in separate contexts or separate browsers rather than fighting the credential cache. For background on when rotating beats sticky and vice versa, see static vs rotating proxies.
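The one-context-per-identity idea can be wrapped in a small helper so a context is always closed, even when the task throws. This is a sketch under stated assumptions: `withProxySession` is a hypothetical name, `browser` is a Puppeteer Browser launched elsewhere, and the caller supplies the gateway URL and credentials (for instance from freshSession() above).

```javascript
// Hypothetical helper: runs one task inside its own browser context, so the
// proxy session, cookies and cache are not shared with another identity.
// `browser` is a Puppeteer Browser launched elsewhere by the caller.
async function withProxySession(browser, proxyServer, credentials, task) {
  const context = await browser.createBrowserContext({ proxyServer });
  try {
    const page = await context.newPage();
    // Authenticate before the first navigation on this page.
    await page.authenticate(credentials);
    return await task(page);
  } finally {
    // Closing the context also closes its pages and discards its state.
    await context.close();
  }
}
```

A call might look like `await withProxySession(browser, 'http://gw.databay.co:8888', freshSession(), (page) => page.goto(url))`. Running several such calls through Promise.all gives parallel identities in one browser process, each in its own context.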
Common Errors and Fixes
Proxy problems in Puppeteer surface as a handful of recognisable error strings. The four below cover the overwhelming majority of real-world failures.
HTTP 407 Proxy Authentication Required
A 407 means the gateway never received valid credentials. In Puppeteer the usual causes are, in order of likelihood: page.authenticate() was never called on this page (it is per-page, and a page created with browser.newPage() after the first one does not inherit it); it was called after page.goto() instead of before; or credentials were embedded in the --proxy-server URL, which Chromium strips. The fix is mechanical: call page.authenticate({ username, password }) on every page you create, before its first navigation. If credentials are definitely flowing and you still get 407, verify them outside the browser:
curl -x http://USER-zone-residential:PASS@gw.databay.co:8888 https://httpbin.org/ip
If curl succeeds and Puppeteer fails, the problem is in how the browser is wired up, not the credentials.
ERR_TUNNEL_CONNECTION_FAILED
Chromium raises net::ERR_TUNNEL_CONNECTION_FAILED when the CONNECT tunnel to an HTTPS site cannot be established through the proxy. Check three things. First, the endpoint: it must be exactly gw.databay.co:8888; a typo in host or port fails at the tunnel stage rather than at DNS. Second, the username flags: a misspelled zone or an invalid flag combination (for example a malformed countryCode value) can cause the gateway to refuse the tunnel rather than return a clean 407. Third, the target: if curl through the same proxy reaches https://httpbin.org/ip but not your target, the target is refusing the exit IP, so rotate the session and retry rather than debugging your own configuration.
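The two failures covered so far can be told apart in code before deciding what to do next. The sketch below is illustrative only: `triageProxyFailure` and `gotoWithTriage` are hypothetical names, the verdict strings are made up for this example, and the matching assumes the failure reaches your code either as a 407 response status or as a navigation error whose message contains the Chromium error string named above.

```javascript
// Hypothetical triage helper. It recognises only the two failures discussed
// so far: a 407 response status and Chromium's tunnel error string.
function triageProxyFailure({ status, error } = {}) {
  const message = error && error.message ? error.message : '';
  if (status === 407) {
    // authenticate() missing, called too late, or wrong credentials.
    return 'check-credentials';
  }
  if (message.includes('ERR_TUNNEL_CONNECTION_FAILED')) {
    // Wrong endpoint, bad username flags, or the target refusing the exit IP.
    return 'check-endpoint-flags-or-rotate';
  }
  return 'unknown';
}

// Sketch of use around a navigation; `page` is an authenticated Puppeteer page.
async function gotoWithTriage(page, url) {
  try {
    const response = await page.goto(url, { waitUntil: 'domcontentloaded' });
    const status = response ? response.status() : undefined;
    const verdict = status === 407 ? triageProxyFailure({ status }) : 'ok';
    return { response, verdict };
  } catch (error) {
    return { response: null, verdict: triageProxyFailure({ error }) };
  }
}
```

Keeping the triage separate from the navigation makes it easy to log the verdict and to extend the mapping as other error strings show up in your own runs.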
TLS and Certificate Errors
HTTPS traffic through the gateway travels in a CONNECT tunnel, end-to-end encrypted; the proxy does not terminate or re-sign TLS. So a certificate error inside Puppeteer is almost never caused by the proxy itself. The usual suspects are corporate middleboxes or antivirus software intercepting TLS on your own network, a genuinely misconfigured target site, or an outdated Chromium bundle. Puppeteer's escape hatch is puppeteer.launch({ acceptInsecureCerts: true }) (older Puppeteer releases call this option ignoreHTTPSErrors). It is fine for hitting a staging server with a self-signed certificate, but do not run it as a blanket default for scraping: it silences exactly the warning that would tell you someone is interfering with your traffic.
Timeouts and Slow Pages
Residential exits add real latency: the request hops through an actual household connection, and pages that load in two seconds directly may take ten or more. Puppeteer's default 30-second navigation timeout is frequently too tight. Three adjustments help:
page.setDefaultNavigationTimeout(60000);
// Skip images, media and fonts: faster loads, less proxy bandwidth.
// Interception must be enabled before navigating for it to affect that page load.
await page.setRequestInterception(true);
page.on('request', (req) => {
  if (['image', 'media', 'font'].includes(req.resourceType())) {
    req.abort();
  } else {
    req.continue();
  }
});
await page.goto(url, { waitUntil: 'domcontentloaded' });
Waiting for domcontentloaded instead of networkidle0 avoids stalling on analytics beacons and long-polling connections that never go idle. Request interception does double duty: pages finish sooner and you stop paying proxy bandwidth for images you never look at. If a page times out repeatedly on one session, treat it as a degraded exit IP and rotate instead of raising the timeout further.
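The advice to rotate on repeated timeouts rather than raise the limit can be sketched as a small retry loop. Everything named here is hypothetical: `gotoWithRotation` is an illustrative helper, and `openPage(sessionId)` is an assumed caller-supplied function that returns `{ page, close }` for a page already authenticated with that session id. The check on `error.name` relies on Puppeteer rejecting timed-out navigations with its TimeoutError class.

```javascript
// Hypothetical retry loop: on a navigation timeout, discard the session and
// try again on a fresh one instead of raising the timeout.
// `openPage(sessionId)` is an assumed caller-supplied function that returns
// { page, close } for a page already authenticated with that session id.
async function gotoWithRotation(openPage, url, { attempts = 3, timeoutMs = 60000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    // A new id per attempt asks the gateway for a new exit IP.
    const sessionId = Math.random().toString(36).slice(2, 10);
    const { page, close } = await openPage(sessionId);
    try {
      await page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs });
      return await page.content();
    } catch (error) {
      // Only timeouts trigger rotation; any other error is rethrown as-is.
      if (error.name !== 'TimeoutError') throw error;
      lastError = error;
    } finally {
      // Release the page (and its context or browser) on every path.
      await close();
    }
  }
  throw lastError;
}
```

Capping the number of attempts keeps a target that is genuinely down from consuming sessions indefinitely; the last timeout is rethrown so the caller can log it.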
Best Practices for Puppeteer with Proxies
- One identity, one session. Give each logical browsing identity its own sticky sessionId and its own browser context. Rotate between identities, never mid-pageload.
- Rotate on signal, not on schedule. A 403, a 429 or a CAPTCHA page is the signal to switch to a fresh session id. Rotating after every successful request wastes the trust an unblocked IP has built.
- Block what you do not need. Request interception that drops images, media and fonts typically cuts page weight dramatically, which on metered residential bandwidth is money.
- Look like a browser someone uses. Set a realistic viewport and user agent, and remember that IP rotation does not hide browser automation itself; how that detection works is covered in how sites detect headless browsers.
- Verify the exit before the run. A quick page.goto('https://httpbin.org/ip') at startup confirms the proxy is wired correctly and logs which exit IP the session received.
- Throttle per target. Concurrency limits and per-domain delays keep individual exit IPs under the rate thresholds that trigger blocks in the first place.
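The "rotate on signal" rule from the list above can be expressed as a small piece of state that keeps one sticky id until a block appears. This is a rough sketch: `shouldRotate` and `createSessionTracker` are hypothetical names, the status codes are the ones named in the list, and the CAPTCHA check is a naive text match that is an assumption, since real challenge pages differ from target to target.

```javascript
// Hypothetical block detector. The status codes come from the list above;
// the CAPTCHA check is a naive text match and only an assumption.
function shouldRotate(status, bodyText = '') {
  if (status === 403 || status === 429) return true;
  return /captcha/i.test(bodyText);
}

// Hypothetical session holder: keeps one sticky id until a block is reported.
// `makeId` can be swapped out, for example to make tests deterministic.
function createSessionTracker(makeId = () => Math.random().toString(36).slice(2, 10)) {
  let current = makeId();
  return {
    // The session id to put in the proxy username for the next request.
    get id() {
      return current;
    },
    // Report the outcome of a request; returns the id to use afterwards.
    report(status, bodyText) {
      if (shouldRotate(status, bodyText)) current = makeId();
      return current;
    },
  };
}
```

After each navigation, passing the response status and page text to `report()` leaves the id unchanged on success and replaces it only when the target signals a block, so an unblocked IP keeps being reused.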
For the wider scraping picture beyond browser wiring, the web scraping with proxies guide covers strategy, target selection and block handling in depth.
Frequently Asked Questions
Why must page.authenticate() be called before page.goto()?
Can different pages in the same browser use different proxies?
Do Databay proxies work with puppeteer-extra and the stealth plugin?
Should I use residential or datacenter proxies with Puppeteer?
How do I keep the same IP across a multi-page login flow?