Benchmarking Proxy Performance: Speed, Uptime, Success Rates

Maria Kovacs · 15 min read

Learn how to benchmark proxy performance with rigorous methodology. Measure success rates, latency, uptime, and IP quality across datacenter, residential, and mobile proxies.

Why Benchmarks Matter and Why Most Are Useless

Proxy performance benchmarks are everywhere — and most of them are worthless. Provider-published benchmarks use ideal conditions against fast targets with no anti-bot protection. Third-party reviews test 50 requests and draw sweeping conclusions. Neither tells you how a proxy will perform in your production environment against your targets at your scale.

A useful benchmark answers a specific operational question: will this proxy deliver the success rate and speed I need for my workload? Answering that question requires controlled methodology, sufficient sample sizes, realistic test targets, and honest analysis that accounts for the inherent variability of proxy networks. A benchmark that tells you a proxy has 150ms average latency is meaningless without knowing the target, the geographic configuration, the time of day, and the sample size.

Good proxy performance benchmarks are expensive to run. They require thousands of requests across hours or days, against multiple targets, from multiple geographic configurations. This investment pays for itself by preventing the much larger cost of choosing the wrong provider for a production workload — discovering poor performance after you have built your pipeline around a provider means migration costs, delayed projects, and lost data.

What to Benchmark: The Core Metrics

A complete proxy benchmark suite measures six dimensions. Skipping any of them leaves gaps that surface as production surprises.

  • Success rate per 1,000 requests — The percentage of requests that return HTTP 200 with valid content. This is the single most important metric. Measure it per target domain because success rates vary dramatically across sites. A 95% success rate against one target and 60% against another with the same proxy are both valid data points.
  • Average response time (TTFB) — Time from request sent to first byte received. Includes proxy overhead, IP assignment, connection to target, and target processing time. Report median and mean separately — mean is skewed by outliers, median represents typical experience.
  • P95 and P99 latency — The tail latency that your slowest requests experience. P95 is what 95% of requests complete within. P99 captures extreme outliers. Your timeout settings should be based on P95 or P99, not average. A proxy with 200ms median but 5,000ms P99 will cause frequent timeouts if your timeout is set to 2 seconds.
  • Uptime — Percentage of time the proxy service accepts and processes connections. Measure over 7+ days. Some providers quote 99.9% uptime but experience multi-minute outages during peak hours that 99.9% annual uptime math conveniently averages away.
  • IP diversity — Number of unique IPs observed across your test requests. Measures actual pool depth, not advertised pool size. A provider claiming 10 million IPs but rotating you through the same 5,000 is effectively a 5,000-IP pool for your workload.
  • Geo-accuracy — Percentage of requests where the assigned IP's actual geographic location matches the requested location. Test by requesting specific countries or cities and verifying the IP's geo-data against an IP geolocation database.
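The latency metrics above can be computed from raw TTFB samples with a few lines of code. The sketch below is illustrative (the sample values are invented, and the nearest-rank percentile method is one of several valid definitions); it shows why median and mean must be reported separately and why P95/P99 drive timeout settings.

```python
import statistics

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the value that p% of samples fall at or below."""
    ranked = sorted(samples)
    idx = min(len(ranked) - 1, max(0, round(p / 100 * len(ranked)) - 1))
    return ranked[idx]

def summarize_latencies(ttfb_ms: list[float]) -> dict:
    """Report mean and median separately, plus the P95/P99 tail."""
    return {
        "mean": statistics.mean(ttfb_ms),
        "median": statistics.median(ttfb_ms),
        "p95": percentile(ttfb_ms, 95),
        "p99": percentile(ttfb_ms, 99),
    }

# Hypothetical run: 97 fast requests plus 3 slow outliers.
# The outliers skew the mean well above the median and dominate P99.
samples = [200.0] * 97 + [5000.0] * 3
stats = summarize_latencies(samples)
```

On this sample the median stays at 200ms while the mean jumps to 344ms and P99 lands at 5,000ms, which is exactly the "200ms median, 5,000ms P99" timeout trap described above.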

Designing Fair and Reproducible Benchmarks

A benchmark is only as valid as its methodology. Uncontrolled variables make results unreproducible and comparisons meaningless. Follow these principles to produce benchmarks you can trust and defend.

Control your variables. The only variable that should change between test runs is the proxy provider or configuration being tested. Everything else — target URLs, request rate, concurrency level, geographic targeting, time window, client machine, and network conditions — must remain constant. If you test Provider A on Monday morning and Provider B on Wednesday evening, you are comparing days and times, not providers.

Use sufficient sample sizes. Minimum 1,000 requests per test configuration. Statistical confidence requires volume. At 1,000 requests, your median and P95 values stabilize enough for meaningful comparison. For per-domain success rate measurements, 500 requests per domain is the minimum. Below these thresholds, random variance dominates and your conclusions are noise.

Test against realistic targets. Benchmarking against httpbin.org or a static HTML page tells you the proxy's best-case latency. It tells you nothing about how the proxy performs against Cloudflare-protected e-commerce sites, JavaScript-heavy SPAs, or aggressive anti-bot platforms. Include at least three target categories in your benchmark: an unprotected static site (baseline), a moderately protected site (CDN with basic bot detection), and a heavily protected site (advanced anti-bot like Akamai or PerimeterX).

Document everything. Record test date, time, duration, client location, client specs, proxy configuration, target URLs, concurrency, and any other parameter that could affect results. Future you — or anyone trying to reproduce your results — needs this context.
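One lightweight way to enforce the "document everything" rule is to serialize the run parameters alongside the raw results. This is a minimal sketch, not a prescribed schema; the field names and example values are assumptions.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass
class BenchmarkRun:
    """Everything needed to reproduce a run. Only `provider` should
    change between runs; all other fields are held constant."""
    provider: str
    target_urls: list
    concurrency: int
    requests_per_target: int
    geo: str
    client_location: str
    started_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

run = BenchmarkRun(
    provider="provider-a",                  # the variable under test
    target_urls=["https://example.com/"],   # held constant across providers
    concurrency=10,
    requests_per_target=1000,
    geo="US",
    client_location="us-east-1",
)
record = json.dumps(asdict(run))  # persist next to the raw result data
```

Storing this record with every run makes later comparisons defensible: if two runs differ in any field other than `provider`, the comparison is invalid by construction.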

Expected Performance Ranges by Proxy Type

Each proxy type has inherent performance characteristics determined by its network architecture. These ranges represent what well-optimized providers deliver under normal conditions and give you a baseline for evaluating your benchmark results.

Datacenter proxies:
  • Median TTFB: 50-200ms
  • P95 TTFB: 150-500ms
  • Success rate (unprotected sites): 95-99%
  • Success rate (protected sites): 60-85%
  • IP assignment time: 5-20ms

Datacenter proxies are the fastest because both ends of the connection run on high-bandwidth server infrastructure. Their weakness is on protected targets where datacenter IP ranges are flagged by default in anti-bot databases.

Residential proxies:
  • Median TTFB: 200-800ms
  • P95 TTFB: 800-2,500ms
  • Success rate (unprotected sites): 97-99%
  • Success rate (protected sites): 85-95%
  • IP assignment time: 20-100ms

Residential IPs trade speed for trust. The higher latency comes from routing through consumer ISP infrastructure. The higher success rate on protected sites comes from the inherent legitimacy of residential IP addresses.

Mobile proxies:
  • Median TTFB: 300-1,200ms
  • P95 TTFB: 1,200-4,000ms
  • Success rate (unprotected sites): 97-99%
  • Success rate (protected sites): 90-98%
  • IP assignment time: 50-200ms

Mobile IPs carry the highest trust scores because mobile carrier networks use CGNAT, meaning thousands of legitimate users share the same IP. Blocking a mobile IP risks blocking thousands of real users, so sites are reluctant to ban them. The speed trade-off is significant — mobile network routing adds substantial latency.

Success Rate Benchmarks: The Metric That Pays the Bills

Success rate is where proxy selection becomes a financial decision. Every failed request costs bandwidth, time, and potentially missed data. A proxy with 90% success rate requires 11% more requests than one with 100% to collect the same data — and those extra requests add cost, time, and detection risk.

Benchmark success rates per target category, not as a single aggregate number. A provider might deliver 98% success on shopping sites but 70% on social media platforms. If your workload is social media scraping, the 98% number is irrelevant.

How to measure success rate correctly:

  • Count only valid responses. HTTP 200 is not enough — verify the response contains expected content. A 200 response with a CAPTCHA page, an empty body, or a soft-block page is a failure, not a success. Define content validation rules per target: check for a known HTML element, a minimum response size, or the absence of block indicators.
  • Separate first-attempt success from retry success. If a request fails and succeeds on retry with a new IP, the first-attempt success rate is what matters for throughput planning. Retry success rate matters for data completeness. Report both.
  • Track success rate over time. A provider might start at 95% success rate but degrade to 80% over a week as target sites detect and ban their IP ranges. A one-hour benchmark misses this entirely. Run at least a 48-hour benchmark for success rate evaluation.
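The three rules above translate directly into code. The sketch below is a simplified illustration (the block markers, minimum body size, and sample log are invented); it validates content rather than trusting HTTP 200, and reports first-attempt and eventual success rates separately.

```python
def is_success(status: int, body: str, min_size: int = 500,
               block_markers=("captcha", "access denied")) -> bool:
    """HTTP 200 alone is not success: validate the content too."""
    if status != 200 or len(body) < min_size:
        return False
    lowered = body.lower()
    return not any(marker in lowered for marker in block_markers)

def success_rates(attempts: list) -> dict:
    """attempts: one list of per-attempt outcomes per request,
    e.g. [[False, True]] is a request that failed once then succeeded
    on retry. First-attempt rate drives throughput planning; eventual
    rate drives data completeness."""
    total = len(attempts)
    first = sum(1 for a in attempts if a and a[0]) / total
    eventual = sum(1 for a in attempts if any(a)) / total
    return {"first_attempt": first, "eventual": eventual}

# Hypothetical log: 8 first-try successes, 1 retry success, 1 hard failure
log = [[True]] * 8 + [[False, True]] + [[False, False]]
rates = success_rates(log)
```

Here the eventual success rate is 90% but the first-attempt rate is 80%, and only the latter tells you how much request volume to provision.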


Realistic success rate targets for production: 90%+ first-attempt success on moderately protected sites with residential proxies. Below 85%, the retry overhead becomes operationally expensive. Below 75%, you should evaluate whether the proxy type matches the target's protection level.

How Pool Size and Diversity Affect Benchmarks

Advertised pool size and actual available pool size are different numbers. A provider claiming 50 million residential IPs might have 50 million IPs in their total network, but only 2 million are online and available at any given moment. The number that matters for your benchmark is how many unique IPs you actually receive during testing.

Measure effective pool size by tracking unique IPs across your benchmark requests. In 10,000 requests with random rotation, how many unique IPs appeared? If you see 8,000 unique IPs, the effective pool depth for your configuration is at least 8,000 — likely much larger since random sampling does not exhaust the pool. If you see only 500 unique IPs across 10,000 requests, the pool is shallow regardless of what the provider advertises.

Pool diversity matters as much as pool size. 10,000 IPs from a single ISP in one city are less effective than 5,000 IPs spread across 50 ISPs in 20 cities. Anti-bot systems track traffic patterns at the subnet and ASN level. A flood of requests from one ASN is suspicious even if individual IPs are unique. Measure the number of unique subnets (/24 blocks) and ASNs in your benchmark to assess diversity.
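Effective pool depth and subnet diversity can both be measured from the IPs observed during a benchmark. The following sketch uses the standard-library `ipaddress` module; the sample addresses are documentation ranges, and an ASN count would additionally require an external lookup database, which is omitted here.

```python
import ipaddress

def diversity(observed_ips: list) -> dict:
    """Unique IPs measure effective pool depth; unique /24 subnets
    measure how spread out those IPs are."""
    unique_ips = set(observed_ips)
    subnets = {
        ipaddress.ip_network(f"{ip}/24", strict=False) for ip in unique_ips
    }
    return {
        "requests": len(observed_ips),
        "unique_ips": len(unique_ips),
        "unique_subnets": len(subnets),
        "reuse_ratio": len(observed_ips) / len(unique_ips),
    }

# Hypothetical sample: 6 requests, 4 unique IPs across only 2 /24 blocks
ips = ["203.0.113.5", "203.0.113.9", "203.0.113.5",
       "198.51.100.7", "198.51.100.8", "203.0.113.9"]
report = diversity(ips)
```

A high `reuse_ratio` or a low `unique_subnets` count relative to `unique_ips` is the signature of a shallow or concentrated pool, regardless of the advertised size.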

Pool freshness is the hidden dimension. IPs that have been in the proxy pool for months accumulate reputation damage across target sites. Fresh IPs that recently joined the pool carry clean reputations. Some providers actively rotate their pool, retiring damaged IPs and onboarding fresh ones. You can indirectly measure this by tracking whether your success rate improves or degrades when the provider expands their pool — fresh IPs improve aggregate success rates.

Time-of-Day and Day-of-Week Performance Variation

Proxy performance is not constant throughout the day. Both the proxy infrastructure and target sites experience load patterns that create predictable performance windows. A benchmark that runs only during off-peak hours produces optimistic results that will not hold during your production peak.

Proxy-side variation: Shared proxy pools experience higher contention during business hours in major markets. A US residential pool is busiest between 9 AM and 6 PM Eastern Time, when the most users are competing for IPs. During these windows, IP assignment may be slower, less-optimal IPs may be served as premium IPs are claimed by other users, and gateway processing latency may increase under load. Weekend and nighttime performance typically improves as contention drops.

Target-side variation: Target websites also have peak hours. E-commerce sites are slowest during lunch hours and evening shopping peaks. Their servers respond slower, their CDNs are under heavier load, and their anti-bot systems may become more aggressive during traffic spikes. A proxy benchmark against an e-commerce target at 3 AM will show faster response times and higher success rates than the same test at 7 PM.

The correct benchmarking approach is to run continuous tests across a full 24-hour cycle (minimum) and segment results by hour. This produces a performance heatmap showing your best and worst windows. Design your production schedules around these patterns — schedule heavy scraping during off-peak hours when both proxy and target performance is optimal, and reserve peak hours for time-sensitive tasks that cannot be deferred.
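The hourly segmentation step can be sketched as a simple bucketing pass over timestamped results. The latency and success values below are invented for illustration; a real run would feed in 24+ hours of continuous measurements.

```python
from collections import defaultdict
from statistics import median

def hourly_heatmap(results: list) -> dict:
    """results: (hour_utc, ttfb_ms, ok) tuples from a continuous run.
    Segmenting by hour exposes peak and off-peak windows."""
    buckets = defaultdict(list)
    for hour, ttfb, ok in results:
        buckets[hour].append((ttfb, ok))
    heatmap = {}
    for hour, rows in sorted(buckets.items()):
        latencies = [t for t, _ in rows]
        heatmap[hour] = {
            "median_ttfb": median(latencies),
            "success_rate": sum(ok for _, ok in rows) / len(rows),
        }
    return heatmap

# Hypothetical data: off-peak hour 3 UTC vs peak hour 19 UTC
samples = ([(3, 250, True)] * 9 + [(3, 300, False)]
           + [(19, 700, True)] * 8 + [(19, 900, False)] * 2)
hm = hourly_heatmap(samples)
```

In this fabricated sample, hour 3 shows 250ms median TTFB at 90% success while hour 19 shows 700ms at 80%, which is the kind of gap that justifies scheduling heavy scraping off-peak.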

Geographic Performance Differences

Geographic configuration has a larger impact on proxy performance than most users realize. The same provider can deliver 150ms TTFB for US targets from US proxies and 900ms TTFB for Japanese targets from Japanese proxies — not because the Japanese pool is worse, but because the underlying infrastructure differs from region to region.

Factors that create geographic performance gaps:

  • Pool depth per region — Providers have vastly different IP counts per country. A provider with 5 million US IPs might have only 50,000 in Brazil and 10,000 in Poland. Smaller pools mean more IP reuse, higher detection rates, and slower rotation as the gateway searches for available IPs.
  • Gateway proximity — If the provider operates gateways only in the US and Europe, requests targeting Asian sites must traverse additional network hops. Each hop adds latency and potential points of failure.
  • ISP infrastructure quality — Residential proxy performance depends on the underlying ISP infrastructure. Proxies routed through fiber-optic ISPs in South Korea will consistently outperform those routed through congested DSL networks in developing regions.
  • Regional anti-bot posture — Sites in some regions deploy more aggressive anti-bot measures than others. Japanese e-commerce platforms are notoriously protective. Success rates benchmarked against these sites will be lower than benchmarks against comparable sites in less-protected markets.


Benchmark every geographic region you plan to use independently. Do not extrapolate US performance to other regions — the only way to know is to test each one.

Continuous Monitoring vs One-Time Testing

One-time benchmarks are snapshots. They tell you how a proxy performed on a specific day under specific conditions. They do not tell you how it will perform next week, next month, or during a traffic spike on your target site. Continuous monitoring closes this gap by treating benchmarking as an ongoing process rather than a one-time evaluation.

Implement continuous monitoring by running a standardized test suite at regular intervals — hourly, daily, or weekly depending on how critical proxy performance is to your operations. The test suite should be identical each time: same targets, same request volume, same geo-targeting, same metrics collection. Consistency in methodology is what makes longitudinal data comparable.

Set up alerts on key thresholds. When success rate drops below 90%, when P95 latency exceeds 3,000ms, or when uptime dips below 99% over a rolling 24-hour window, you want to know immediately — not when your data pipeline fails three hours later. These alerts give you time to investigate and switch providers or adjust configurations before the degradation affects production output.
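The three alert thresholds named above (90% success, 3,000ms P95, 99% uptime) can be checked with a small function that a scheduler runs against each rolling window. This is a sketch, not a monitoring product; the window dictionary shape is an assumption.

```python
def check_thresholds(window: dict, min_success: float = 0.90,
                     max_p95_ms: int = 3000,
                     min_uptime: float = 0.99) -> list:
    """Return an alert message for each breached threshold in a
    rolling measurement window."""
    alerts = []
    if window["success_rate"] < min_success:
        alerts.append(
            f"success rate {window['success_rate']:.1%} below {min_success:.0%}"
        )
    if window["p95_ms"] > max_p95_ms:
        alerts.append(f"P95 {window['p95_ms']}ms above {max_p95_ms}ms")
    if window["uptime"] < min_uptime:
        alerts.append(f"uptime {window['uptime']:.2%} below {min_uptime:.0%}")
    return alerts

# Hypothetical window: latency and uptime are fine, success rate is not
alerts = check_thresholds(
    {"success_rate": 0.87, "p95_ms": 2400, "uptime": 0.995}
)
```

Wiring the returned messages into whatever notification channel you already use is deliberately left out; the point is that the thresholds are explicit, versioned values rather than judgment calls made during an incident.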

Continuous monitoring also provides leverage in provider negotiations. When you have six months of performance data showing that success rates dropped from 94% to 87% over a quarter, you have concrete evidence to request SLA credits, demand infrastructure improvements, or justify switching to a competitor. Providers respond differently to customers who bring data versus customers who bring complaints.

Using Benchmarks to Negotiate SLAs and Compare Providers

Benchmark data transforms proxy procurement from a trust-based decision to a data-driven one. Instead of choosing a provider based on marketing claims or subjective reviews, you choose based on measured performance against your actual requirements.

Building a provider comparison matrix:

Test three to five providers under identical conditions. For each provider, record: success rate per target category, median and P95 TTFB, IP diversity (unique IPs per 1,000 requests), geo-accuracy, and cost per successful request. The last metric — cost per successful request — is the most revealing. Calculate it as: (total proxy cost for the test period) divided by (number of successful requests). A cheap provider with low success rates often has a higher cost per successful request than a premium provider with high success rates.
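The cost-per-successful-request arithmetic is worth making explicit, because it routinely reverses the ranking that sticker price suggests. The prices and success rates below are invented for illustration only.

```python
def cost_per_successful_request(total_cost: float,
                                total_requests: int,
                                success_rate: float) -> float:
    """Total spend divided by successful requests: the metric that
    makes cheap-but-unreliable providers comparable to premium ones."""
    successful = total_requests * success_rate
    return total_cost / successful

# Hypothetical comparison over a 100,000-request test period:
cheap = cost_per_successful_request(50.0, 100_000, 0.60)    # $50, 60% success
premium = cost_per_successful_request(70.0, 100_000, 0.96)  # $70, 96% success
```

With these made-up numbers, the $50 provider costs about $0.00083 per successful request while the $70 provider costs about $0.00073, so the "expensive" option is actually cheaper per unit of collected data.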

Negotiating SLAs with data:

Most proxy providers offer generic SLAs that guarantee 99% uptime and vague performance commitments. Your benchmark data lets you negotiate specific, measurable terms:

  • Minimum success rate per target category (e.g., 92%+ on protected e-commerce sites)
  • Maximum P95 latency for specified geographic configurations
  • Minimum unique IPs per 10,000 requests
  • Credit or refund triggers when performance drops below agreed thresholds


Providers with confidence in their infrastructure will agree to specific metrics. Providers who resist measurable commitments are telling you something about their consistency. The benchmark data you bring to the negotiation is your strongest tool — it demonstrates that you measure performance rigorously and will hold the provider accountable to concrete standards.

Frequently Asked Questions

How many requests do I need for a statistically valid proxy benchmark?
A minimum of 1,000 requests per test configuration gives you stable median and P95 values. For per-domain success rate measurements, 500 requests per domain is the minimum. Below these thresholds, random variance in proxy assignment and target behavior dominates your results, making comparisons unreliable. For production-critical decisions, 5,000+ requests per configuration over 24+ hours provides high-confidence results that account for time-of-day variation.
Why does my residential proxy benchmark show much higher latency than the provider advertises?
Providers typically advertise gateway latency — the time for a request to reach their servers — not end-to-end TTFB including the residential exit node and target server. Residential IPs route through consumer ISP infrastructure, which adds 100-500ms compared to datacenter connections. Additionally, your benchmark includes target server processing time, which the provider's advertised numbers exclude. Always compare your benchmark TTFB to realistic residential ranges (200-800ms median), not to provider marketing figures.
Should I benchmark proxy performance against my actual target sites?
Yes, absolutely. Benchmarking against generic speed test sites measures best-case proxy latency, which is irrelevant if your actual targets are Cloudflare-protected e-commerce platforms. Your target sites have specific anti-bot configurations, server response characteristics, and geographic hosting that fundamentally affect proxy performance. Benchmark against a mix of your actual targets and one baseline unprotected site. The baseline isolates proxy overhead, while target benchmarks show real-world performance.
What is a good success rate for residential proxies on protected websites?
For moderately protected sites (CDN with basic bot detection), expect 85-95% first-attempt success rate from quality residential proxies. For heavily protected sites running advanced anti-bot platforms like Akamai or PerimeterX, 75-90% is realistic. Below 75% on protected sites, evaluate whether your request fingerprint, headers, and behavior patterns need improvement — the proxy IPs may be fine, but your request configuration may be triggering detection independent of the IP.
How often should I re-run proxy performance benchmarks?
For mission-critical workloads, run automated benchmark suites weekly. For standard workloads, monthly is sufficient. Always re-benchmark after any significant change: switching proxy providers, targeting new websites, changing geographic configurations, or after the provider announces infrastructure changes. Proxy performance is dynamic — IP pools refresh, target sites update anti-bot systems, and provider infrastructure evolves. Stale benchmarks lead to stale decisions.

Start Collecting Data Today

35M+ IPs across 200+ countries. Pay as you go, starting at $0.50/GB.
