Case Study: E-commerce Price Monitoring Agency Scales to 5M Daily SKU Scrapes With Residential Proxies

Sophie Marchand · 7 min read

How a competitive price intelligence agency migrated to Databay residential proxies to scrape 5 million product SKUs daily across 40 retailers, cutting infrastructure cost 58% while improving success rate to 99.1%. Names anonymized by mutual agreement.

The Customer

The customer is an Amsterdam-based competitive intelligence agency that provides price monitoring and assortment tracking services to consumer brands in Europe and North America. The agency's platform continuously monitors product availability, pricing, and promotional activity across 40+ major online retailers, including general marketplaces and specialty retailers in the apparel, electronics, and grocery verticals. Their core deliverable is a daily-updated dashboard that brand clients use to adjust pricing, identify unauthorized resellers, and detect stockout risk.

Customer name and specific retailer targets are anonymized at the agency's request. Figures below are rounded to the nearest reportable range.

The Challenge

The agency's original scraping stack relied on a mix of free rotating proxy lists and mid-tier datacenter proxies. As they scaled from covering 500,000 SKUs to 5 million SKUs daily across 40 retailers, three operational problems emerged:

  • Success rate collapse. Datacenter IPs were being blocked at increasing rates on the major retail targets, particularly on platforms running Akamai Bot Manager and Cloudflare's advanced bot protection. The effective success rate dropped from 92% to roughly 67% over six months.
  • Quality degradation on price data. Datacenter IPs were frequently served generic catalog prices rather than the region-specific localized prices shown to real consumer IPs. This corrupted the core data deliverable to clients.
  • Cost inflation from retries. High failure rates meant the agency was retrying each target 3-5 times, tripling their effective bandwidth cost and doubling infrastructure spend on worker instances.

The agency's engineering team concluded that ISP-registered residential IPs were the only viable path to sustainable scaling at their target volume, but was hesitant about the headline per-GB cost of residential compared to their existing datacenter baseline.

The Evaluation

The agency evaluated five commercial residential proxy providers over a 4-week pilot period. The evaluation criteria:

  • Success rate on anti-bot protected targets. Tested against five retailer platforms running Akamai, Cloudflare Bot Management, DataDome, and PerimeterX.
  • City-level geo-targeting accuracy. Localized pricing requires real consumer IPs in specific metropolitan areas, not just country-level.
  • Session continuity. Multi-step checkout scraping (add-to-cart to observe final price with taxes and shipping) requires sticky sessions of 15-30 minutes.
  • Per-GB effective cost including retries. Raw per-GB rate matters less than the cost per successfully-collected SKU.
  • API and usage analytics. Production monitoring, team access management, and usage forecasting.
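
The "cost per successfully-collected SKU" criterion above can be sketched as a small calculation. This is an illustration only: it assumes every attempt succeeds independently with the same probability and that a failed attempt consumes as much bandwidth as a successful one, neither of which the agency's evaluation is known to have assumed. The function name and the 200 KB page size are placeholders.

```python
KB_PER_GB = 1_048_576  # binary convention: 1 GB = 1024 * 1024 KB


def cost_per_successful_sku(price_per_gb, avg_page_kb, attempt_success_rate, max_attempts):
    """Expected proxy bandwidth cost per successfully collected SKU.

    Assumptions (not from the case study): every attempt succeeds
    independently with the same probability, and a failed attempt
    consumes as much bandwidth as a successful one.
    """
    if not 0 < attempt_success_rate <= 1:
        raise ValueError("attempt_success_rate must be in (0, 1]")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    fail = 1 - attempt_success_rate
    # Expected attempts per SKU when retrying until success or max_attempts.
    expected_attempts = sum(fail ** k for k in range(max_attempts))
    # Probability that at least one of the attempts succeeds.
    eventual_success = 1 - fail ** max_attempts
    cost_per_sku_attempted = expected_attempts * (avg_page_kb / KB_PER_GB) * price_per_gb
    return cost_per_sku_attempted / eventual_success


# Enterprise rate and success rate come from the case study;
# the 200 KB average page size is a placeholder.
residential = cost_per_successful_sku(0.65, 200, 0.991, max_attempts=5)
print(f"residential: ${residential:.6f} per successful SKU")
```

Under these assumptions the result reduces to the per-attempt cost divided by the per-attempt success rate, which is why a cheaper proxy with a low success rate can end up costing more per usable record.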

Databay was one of two finalists. During pilot testing against the five target retailers, Databay's residential proxy network returned a 99.1% success rate over 50,000 pilot requests, and city-level targeting was verified as accurate for 97% of requests. It also offered native support for sticky sessions of up to 120 minutes, comfortably longer than the 30-minute maximum the scraping workflows needed.

The Implementation

Migration from the previous provider to Databay was scheduled across three weekly rollout stages:

  • Week 1: Route 10% of production traffic through Databay gateway for two lower-risk retailer targets. Monitor success rate, latency, and data accuracy relative to the production baseline.
  • Week 2: Expand to 50% of production traffic across all 40 retailers. Begin decommissioning previous proxy contracts.
  • Week 3: Complete migration to 100% Databay traffic. Full cut-over.
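
One way to implement a percentage-based rollout like the stages above is deterministic hash bucketing, so that a given SKU is routed the same way on every run. The sketch below is an assumption about how such a split could work, not the agency's actual routing code; `use_new_gateway`, `pilot_retailers`, and the retailer names are hypothetical.

```python
import hashlib


def use_new_gateway(sku_id, retailer, rollout_percent, pilot_retailers=None):
    """Decide whether a request should go through the new proxy gateway.

    Routing is deterministic: the same (retailer, sku_id) pair always lands
    in the same bucket, so raising rollout_percent only ever adds traffic.
    """
    # Optional allowlist, e.g. the two lower-risk retailers in week 1.
    if pilot_retailers is not None and retailer not in pilot_retailers:
        return False
    digest = hashlib.sha256(f"{retailer}:{sku_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return bucket < rollout_percent


# Week 1: 10% of traffic, limited to two pilot retailers (placeholder names).
week1 = use_new_gateway("sku-123", "retailer-a", 10, pilot_retailers={"retailer-a", "retailer-b"})
# Week 2: 50% of traffic across all retailers.
week2 = use_new_gateway("sku-123", "retailer-a", 50)
```

Because the bucket depends only on the key, any SKU routed to the new gateway at 10% stays on it at 50% and 100%, which keeps week-over-week comparisons against the baseline clean.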

Technical integration was straightforward because both providers expose a similar backconnect gateway with geo-targeting in the proxy username. The agency's existing worker code (Python Scrapy with custom middleware) required changes in one configuration file - essentially two environment variable updates. Sticky session logic was adjusted to use Databay's session-id parameter syntax.
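
A configuration change of this kind might look like the sketch below, which builds a proxy URL from environment variables and encodes geo-targeting and a sticky session in the username. The environment variable names, the gateway host, and the `country-…-city-…-session-…` username layout are all placeholders chosen for illustration; the real parameter syntax must be taken from the provider's documentation.

```python
import os
from urllib.parse import quote


def build_proxy_url(country, city=None, session_id=None, env=os.environ):
    """Build a backconnect proxy URL with targeting encoded in the username.

    The username layout here is hypothetical, not a documented provider format.
    """
    gateway = env["PROXY_GATEWAY"]    # hypothetical, e.g. "gateway.example.com:8000"
    user = env["PROXY_USERNAME"]      # hypothetical variable name
    password = env["PROXY_PASSWORD"]  # hypothetical variable name

    parts = [user, f"country-{country}"]
    if city:
        parts.append(f"city-{city}")
    if session_id:
        # Reusing the same session id is what keeps the same exit IP (sticky session).
        parts.append(f"session-{session_id}")
    username = "-".join(parts)

    # Percent-encode credentials so characters like "@" or ":" cannot break the URL.
    return f"http://{quote(username, safe='')}:{quote(password, safe='')}@{gateway}"


example_env = {
    "PROXY_GATEWAY": "gateway.example.com:8000",
    "PROXY_USERNAME": "customer123",
    "PROXY_PASSWORD": "s3cret",
}
print(build_proxy_url("nl", city="amsterdam", session_id="cart42", env=example_env))
```

In a Scrapy project, a downloader middleware would typically assign a URL like this to `request.meta["proxy"]`, so switching providers comes down to changing the environment values and the username format.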

The agency opted for the Databay Enterprise tier (1 TB+ commitment) at $0.65/GB, reflecting their projected 4-5 TB monthly residential bandwidth consumption.

The Results

Measured against the prior 90-day baseline:

  • Success rate improved from 67% to 99.1%. First-attempt success rate on anti-bot protected targets increased dramatically. Retry overhead dropped by roughly 80%.
  • Infrastructure cost decreased 58% net. Higher per-GB residential cost was more than offset by eliminated retry overhead and reduced worker instance count. Total monthly proxy + compute spend dropped from approximately $42,000 to $17,500.
  • Localized pricing accuracy improved from 78% to 94%. City-level residential IPs surfaced region-specific pricing that datacenter IPs were missing, improving the core data product quality.
  • SKU coverage capacity grew from 5 million to 11 million daily within the same budget. The cost efficiency gain allowed the agency to expand coverage to new retailers without additional spending.
  • Client satisfaction score rose from 7.8 to 9.2 (agency's internal quarterly client survey, 10-point scale).

The lead engineer noted that the primary driver of savings was not the proxy rate itself but the cascading efficiency gains from higher success rates - fewer retries, fewer worker instances, less human QA time catching bad data, and measurably better data quality for clients.

Key Takeaways

The lessons from this migration translate to most high-volume e-commerce scraping workloads:

  • Total cost matters more than per-GB rate. A higher-quality proxy that eliminates retries and improves data accuracy often pays for itself several times over in reduced compute and QA overhead.
  • City-level geo-targeting materially affects e-commerce data quality. Many retailers show region-specific pricing that country-level proxies miss. For competitive intelligence specifically, this quality gap matters more than scraping volume.
  • Sticky sessions are essential for checkout-flow price observation. Static pricing scraping can use rotating proxies. Observing final prices with taxes, shipping, and regional adjustments requires session continuity.
  • Pilot against specific targets. Residential proxy pool quality varies by provider; always pilot against your actual production targets before committing to a volume tier.
  • Measure first-attempt success rate, not just overall success. A 67% first-attempt rate with 92% eventual success is operationally very different from a 99% first-attempt rate with 99.5% eventual success.
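
To make the last takeaway concrete, the sketch below computes both metrics from a request log. The log format, a sequence of `(sku_id, attempt_number, succeeded)` tuples with attempts numbered from 1, is an assumption for illustration and not the agency's schema.

```python
def success_rates(attempt_log):
    """Return (first_attempt_rate, eventual_rate) over all SKUs in the log.

    attempt_log: iterable of (sku_id, attempt_number, succeeded) tuples,
    where attempt_number starts at 1. This log shape is hypothetical.
    """
    seen, first_ok, any_ok = set(), set(), set()
    for sku_id, attempt_number, succeeded in attempt_log:
        seen.add(sku_id)
        if succeeded:
            any_ok.add(sku_id)
            if attempt_number == 1:
                first_ok.add(sku_id)
    if not seen:
        return 0.0, 0.0
    return len(first_ok) / len(seen), len(any_ok) / len(seen)


log = [
    ("sku-a", 1, True),   # succeeded on the first attempt
    ("sku-b", 1, False),
    ("sku-b", 2, True),   # succeeded only after a retry
    ("sku-c", 1, False),
    ("sku-c", 2, False),  # never succeeded
]
first_rate, eventual_rate = success_rates(log)
print(f"first-attempt: {first_rate:.1%}, eventual: {eventual_rate:.1%}")
```

The gap between the two numbers is a direct measure of retry overhead, which is where the case study locates most of its savings.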

About Databay Residential Proxies

Databay operates 34M+ ethically sourced residential proxy IPs across 200+ countries and 50,000+ cities. Features relevant to e-commerce price monitoring workloads include:

  • City-level, ZIP-code, GPS-coordinate, and ASN targeting at no extra charge
  • Rotating sessions for broad IP diversity
  • Sticky sessions up to 120 minutes for checkout-flow observation
  • HTTP, HTTPS, and SOCKS5 protocols
  • Unlimited concurrent connections and no bandwidth throttling
  • Pay-as-you-go pricing from $2.75/GB, with an Enterprise tier at $0.65/GB for 1 TB+ commitments


Case study results are specific to this customer's workload, targets, and implementation. Your results may vary depending on your retail targets, scraping architecture, and geo-targeting mix.

Frequently Asked Questions

What kind of price monitoring can Databay residential proxies support?
Databay residential proxies support competitive pricing, promotional monitoring, assortment tracking, stockout detection, MAP enforcement, and dynamic pricing observation across e-commerce retailers. City-level targeting captures region-specific pricing, and sticky sessions up to 120 minutes support multi-step checkout-flow observation.
How many SKUs can I scrape per day with Databay residential proxies?
There is no per-account SKU limit. Throughput depends on your worker architecture and bandwidth budget. The agency in the case study above scrapes 5 million SKUs daily at a 99.1% success rate, projected 4-5 TB of monthly residential bandwidth when choosing its tier, and has since grown its capacity to 11 million SKUs daily. A typical simple product page is 100-500 KB per scrape.
Does Databay support the Akamai, Cloudflare, and DataDome targets common in e-commerce?
Databay residential proxies provide the ISP-registered IP foundation that allows requests to pass initial anti-bot classification filters on Akamai, Cloudflare, DataDome, and PerimeterX. Additional scraping-side measures - proper request headers, browser fingerprinting, rate control, and retry logic - are still required for end-to-end success.
How do I estimate monthly bandwidth for price monitoring?
Monthly GB = (SKUs scraped per day) × (average response size in KB) × 30 days / 1,048,576, where 1,048,576 is the number of KB in a GB. Typical e-commerce product pages are 200 KB - 2 MB depending on whether you load images and assets. Disabling image loading in your scraper can reduce bandwidth by 70-90%.
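
The bandwidth formula above translates directly into a few lines of Python. The function name and the example inputs are illustrative; substitute the average response size measured on your own targets.

```python
KB_PER_GB = 1_048_576  # 1024 * 1024, matching the divisor in the formula


def monthly_bandwidth_gb(skus_per_day, avg_response_kb, days=30):
    """Estimate monthly proxy bandwidth in GB, excluding retries."""
    return skus_per_day * avg_response_kb * days / KB_PER_GB


# Example inputs (placeholders): 1 million SKUs per day at 200 KB per response.
estimate = monthly_bandwidth_gb(1_000_000, 200)
print(f"{estimate:,.0f} GB per month")  # roughly 5,722 GB
```

The estimate excludes retries, so divide it by your first-attempt success rate for a rough upper bound when failed requests still transfer a full page.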
