Case Study: E-commerce Price Monitoring Agency Scales to 5M Daily SKU Scrapes With Residential Proxies

Sophie Marchand · 7 min read

How a competitive price intelligence agency migrated to Databay residential proxies to scrape 5 million product SKUs daily across 40 retailers, cutting infrastructure cost 58% while improving success rate to 99.1%. Names anonymized by mutual agreement.

The Customer

The customer is an Amsterdam-based competitive intelligence agency that provides price monitoring and assortment tracking services to consumer brands across Europe and North America. Its platform watches product availability, pricing, and promotional activity across 40+ major online retailers, a mix of general marketplaces and specialty retailers in the apparel, electronics, and grocery verticals. The core deliverable is a dashboard, updated daily, that brand clients use to adjust pricing, spot unauthorized resellers, and detect stockout risk.

Customer name and specific retailer targets are anonymized at the agency's request. Figures below are rounded to the nearest reportable range.

The Challenge

The agency's original scraping stack was a mix of free rotating proxy lists and mid-tier datacenter proxies. As they scaled from covering 500,000 SKUs to 5 million SKUs daily across 40 retailers, three operational problems surfaced:

  • Success rate collapse. Datacenter IPs were being blocked at increasing rates on the major retail targets, especially on platforms running Akamai Bot Manager and Cloudflare's advanced bot protection. The effective success rate dropped from 92% to roughly 67% over six months.
  • Quality degradation on price data. Datacenter IPs were frequently served different dynamic pricing than real consumer IPs: the scrapers received generic catalog prices instead of region-specific localised pricing. That corrupted the core data deliverable to clients.
  • Cost inflation from retries. High failure rates meant the agency was retrying each target 3-5 times. That tripled effective bandwidth cost and doubled infrastructure spend on worker instances.

The engineering team concluded that ISP-registered residential IPs were the only viable path to sustainable scaling at their target volume. They were still hesitant about the headline per-GB cost of residential compared to their existing datacenter baseline.

The Evaluation

The agency tested five commercial residential proxy providers over a 4-week pilot. The evaluation criteria:

  • Success rate on anti-bot protected targets. Tested against five retailer platforms running Akamai, Cloudflare Bot Management, DataDome, and PerimeterX.
  • City-level geo-targeting accuracy. Localised pricing needs real consumer IPs in specific metropolitan areas, not just country-level.
  • Session continuity. Multi-step checkout scraping (add-to-cart to observe final price with taxes and shipping) needs sticky sessions of 15-30 minutes.
  • Per-GB effective cost including retries. Raw per-GB rate matters less than the cost per successfully-collected SKU.
  • API and usage analytics. Production monitoring, team access management, and usage forecasting.
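The "cost per successfully-collected SKU" criterion can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each attempt succeeds independently with the same probability, that a failed attempt consumes as much bandwidth as a successful one, and that only bandwidth cost is counted. The function names and the sample inputs (including the $0.50/GB comparison rate and the 200 KB response size) are hypothetical and are not figures from the agency's evaluation.

```python
KB_PER_GB = 1_048_576  # binary units: 1 GB = 1024 * 1024 KB


def expected_attempts(p_success: float, max_attempts: int) -> float:
    """Average attempts per SKU when retrying until success or max_attempts."""
    # Attempt k+1 only happens if the first k attempts all failed.
    return sum((1 - p_success) ** k for k in range(max_attempts))


def eventual_success(p_success: float, max_attempts: int) -> float:
    """Probability that at least one of max_attempts attempts succeeds."""
    return 1 - (1 - p_success) ** max_attempts


def cost_per_successful_sku(rate_per_gb: float, kb_per_attempt: float,
                            p_success: float, max_attempts: int) -> float:
    """Bandwidth spend divided by the share of SKUs actually collected."""
    gb_per_attempt = kb_per_attempt / KB_PER_GB
    spend_per_sku = rate_per_gb * gb_per_attempt * expected_attempts(p_success, max_attempts)
    return spend_per_sku / eventual_success(p_success, max_attempts)


# Illustrative comparison with made-up inputs: a higher per-GB rate at 99.1%
# first-attempt success versus a cheaper hypothetical rate at 67%.
residential = cost_per_successful_sku(0.65, 200, 0.991, 3)
cheaper = cost_per_successful_sku(0.50, 200, 0.67, 3)
print(f"residential: ${residential:.6f} per SKU, cheaper rate: ${cheaper:.6f} per SKU")
```

Under these assumptions the result simplifies to rate × GB per attempt / first-attempt success rate, which is why a lower headline rate can still lose once the success rate is factored in. Compute and QA costs, which the case study reports as significant, are outside this sketch.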

Databay was one of two finalists. During pilot testing against the five target retailers, Databay's residential network returned a 99.1% success rate over 50,000 pilot requests. City-level targeting was verified as accurate on 97% of requests. Sticky sessions were natively supported for up to 120 minutes, comfortably longer than the 30-minute maximum the scraping workflows needed.

The Implementation

Migration from the previous provider to Databay ran across three weekly rollout stages:

  • Week 1: Route 10% of production traffic through Databay gateway for two lower-risk retailer targets. Monitor success rate, latency, and data accuracy against the production baseline.
  • Week 2: Expand to 50% of production traffic across all 40 retailers. Begin decommissioning previous proxy contracts.
  • Week 3: Complete migration to 100% Databay traffic. Full cut-over.
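A staged rollout like the one above needs some way to decide which requests go through the new gateway at each percentage. The source does not say how the agency split traffic; one common approach, sketched here as an assumption, is deterministic hash bucketing so that a given SKU stays on the same side of the split from run to run. The function names and the optional retailer allowlist (for the Week 1 restriction to lower-risk targets) are hypothetical.

```python
import hashlib
from typing import Optional, Set


def rollout_bucket(key: str) -> int:
    """Map a stable key to a bucket in the range 0..99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_new_gateway(retailer: str, sku: str, rollout_percent: int,
                    allowed_retailers: Optional[Set[str]] = None) -> bool:
    """Return True if this request should be routed through the new gateway."""
    # Optional allowlist: limit the rollout to specific retailers (e.g. Week 1).
    if allowed_retailers is not None and retailer not in allowed_retailers:
        return False
    # A SKU in the 10% cohort is also in the 50% and 100% cohorts.
    return rollout_bucket(f"{retailer}:{sku}") < rollout_percent


# Hypothetical usage: 10% of traffic, two retailers only.
week1 = use_new_gateway("retailer-a", "SKU-12345", 10, {"retailer-a", "retailer-b"})
print(week1)
```

Because the bucket depends only on the key, raising the percentage only ever adds SKUs to the new path, which keeps before/after comparisons against the production baseline stable.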

Technical integration was clean because both providers expose a similar backconnect gateway with geo-targeting in the proxy username. The agency's existing worker code (Python Scrapy with custom middleware) needed changes in one configuration file: two environment variable updates. Sticky session logic was adjusted to use Databay's session-id parameter syntax.
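To illustrate what "geo-targeting in the proxy username" and a session-id parameter can look like in worker code, here is a minimal sketch. Everything specific in it is an assumption: the environment variable names (PROXY_GATEWAY, PROXY_CREDENTIALS), the gateway host, and especially the username format are hypothetical and are not Databay's documented syntax, which should be taken from the provider's own documentation.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def build_proxy_url(country: Optional[str] = None,
                    city: Optional[str] = None,
                    session_id: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Build a backconnect proxy URL with targeting encoded in the username."""
    gateway = env["PROXY_GATEWAY"]  # hypothetical, e.g. "gateway.example.com:8000"
    username, password = env["PROXY_CREDENTIALS"].split(":", 1)  # hypothetical "user:pass"

    # HYPOTHETICAL username syntax: user-country-xx-city-name-session-id.
    parts = [username]
    if country:
        parts += ["country", country.lower()]
    if city:
        parts += ["city", city.lower().replace(" ", "_")]
    if session_id:
        # Reusing the same session id is what would keep the same exit IP.
        parts += ["session", session_id]

    user = quote("-".join(parts), safe="")
    return f"http://{user}:{quote(password, safe='')}@{gateway}"


# Example with an explicit mapping instead of real environment variables.
demo_env = {"PROXY_GATEWAY": "gateway.example.com:8000", "PROXY_CREDENTIALS": "acct:s3cret"}
print(build_proxy_url("NL", "Amsterdam", "cart42", env=demo_env))
```

In a Scrapy project, a URL of this shape could be assigned to request.meta["proxy"] inside a downloader middleware, since Scrapy's built-in HttpProxyMiddleware reads that key. Keeping the gateway and credentials in environment variables is what would make a provider switch a configuration change rather than a code change.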

The agency opted for the Databay Enterprise tier (1 TB+ commitment) at $0.65/GB, reflecting their projected 4-5 TB monthly residential bandwidth consumption.

The Results

Measured against the prior 90-day baseline:

  • Success rate improved from 67% to 99.1%. First-attempt success rate on anti-bot protected targets rose sharply. Retry overhead dropped by roughly 80%.
  • Infrastructure cost decreased 58% net. The higher per-GB residential cost was more than offset by eliminated retry overhead and reduced worker instance count. Total monthly proxy + compute spend dropped from roughly $42,000 to $17,500.
  • Localised pricing accuracy improved from 78% to 94%. City-level residential IPs surfaced region-specific pricing that datacenter IPs were missing, improving core data product quality.
  • SKU coverage capacity grew from 5 million to 11 million daily within the same budget. The cost efficiency gain let the agency expand coverage to new retailers without extra spend.
  • Client satisfaction score rose from 7.8 to 9.2 (agency's internal quarterly client survey, 10-point scale).

The lead engineer noted that the main driver of savings was not the proxy rate itself. It was the cascading efficiency gains from higher success rates: fewer retries, fewer worker instances, less human QA time catching bad data, and measurably better data quality for clients.

Key Takeaways

The lessons from this migration generalise to most high-volume e-commerce scraping workloads:

  • Total cost matters more than per-GB rate. A higher-quality proxy that eliminates retries and improves data accuracy often pays for itself several times over in reduced compute and QA overhead.
  • City-level geo-targeting materially affects e-commerce data quality. Many retailers show region-specific pricing that country-level proxies miss. For competitive intelligence specifically, this quality gap matters more than scraping volume.
  • Sticky sessions are essential for checkout-flow price observation. Static pricing scraping can use rotating proxies. Observing final prices with taxes, shipping, and regional adjustments needs session continuity.
  • Pilot against specific targets. Residential proxy pool quality varies by provider. Always pilot against your actual production targets before committing to a volume tier.
  • Measure first-attempt success rate, not just overall success. A 67% first-attempt rate with 92% eventual success is operationally very different from a 99% first-attempt rate with 99.5% eventual success, because every retry consumes extra bandwidth and worker time.
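The last takeaway is easy to act on if the scraper logs every attempt. The sketch below assumes a hypothetical log format of (sku_id, attempt_number, succeeded) tuples with attempt numbers starting at 1; the agency's actual logging is not described in the source.

```python
from typing import Iterable, Tuple

Attempt = Tuple[str, int, bool]  # (sku_id, attempt_number, succeeded)


def success_rates(attempts: Iterable[Attempt]) -> Tuple[float, float]:
    """Return (first_attempt_rate, eventual_rate) over all SKUs seen in the log."""
    seen = set()
    first_ok = set()
    any_ok = set()
    for sku_id, attempt_number, succeeded in attempts:
        seen.add(sku_id)
        if succeeded:
            any_ok.add(sku_id)
            if attempt_number == 1:
                first_ok.add(sku_id)
    if not seen:
        return 0.0, 0.0
    return len(first_ok) / len(seen), len(any_ok) / len(seen)


# Small made-up log: "a" and "d" succeed first time, "b" on retry, "c" never.
log = [
    ("a", 1, True),
    ("b", 1, False), ("b", 2, True),
    ("c", 1, False), ("c", 2, False),
    ("d", 1, True),
]
first_rate, eventual_rate = success_rates(log)
print(first_rate, eventual_rate)
```

Tracking both numbers per retailer makes it visible when a healthy-looking eventual success rate is being propped up by retries.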

About Databay Residential Proxies

Databay operates 34M+ ethically sourced residential proxy IPs across 200+ countries and 50,000+ cities. Features relevant to e-commerce price monitoring workloads include:

  • City-level, ZIP-code, GPS-coordinate, and ASN targeting at no extra charge
  • Rotating sessions for broad IP diversity
  • Sticky sessions up to 120 minutes for checkout-flow observation
  • HTTP, HTTPS, and SOCKS5 protocols
  • Unlimited concurrent connections and no bandwidth throttling
  • Pay-as-you-go pricing from $2.75/GB, with an Enterprise tier at $0.65/GB for 1 TB+ commitments

Learn more: Residential Proxies · Residential Pricing · Complete E-commerce Price Monitoring Guide

Case study results are specific to this customer's workload, targets, and implementation. Your mileage may vary depending on specific retail targets, scraping architecture, and geo-targeting mix. All customer names and specific retailer targets are anonymized at the customer's request.

Frequently Asked Questions

What kind of price monitoring can Databay residential proxies support?
Databay residential proxies support competitive pricing, promotional monitoring, assortment tracking, stockout detection, MAP enforcement, and dynamic pricing observation across e-commerce retailers. City-level targeting captures region-specific pricing, and sticky sessions up to 120 minutes support multi-step checkout-flow observation.

How many SKUs can I scrape per day with Databay residential proxies?
There is no per-account SKU limit. Throughput depends on your worker architecture and bandwidth budget. The agency in the case study above scaled to a capacity of 11 million SKUs daily at a 99.1% success rate, having projected 4-5 TB of monthly residential bandwidth when it selected its tier. A typical simple product page is 100-500 KB per scrape.

Does Databay support the Akamai, Cloudflare, and DataDome targets common in e-commerce?
Databay residential proxies provide the ISP-registered IP foundation that allows requests to pass initial anti-bot classification filters on Akamai, Cloudflare, DataDome, and PerimeterX. Additional scraping-side measures (proper request headers, browser fingerprinting, rate control, and retry logic) are still required for end-to-end success.

How do I estimate monthly bandwidth for price monitoring?
Monthly GB = (SKUs scraped per day) × (average response size in KB) × 30 days / 1,048,576, where 1,048,576 is the number of KB in one GB. Typical e-commerce product pages are 200 KB to 2 MB depending on whether you load images and assets. Disabling image loading in your scraper can cut bandwidth by 70-90%.
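The bandwidth formula in the last answer translates directly into a few lines of code. This is a minimal sketch; the function name and the sample workload (100,000 SKUs per day at 250 KB each) are illustrative assumptions, not figures from the case study.

```python
KB_PER_GB = 1_048_576  # 1024 * 1024 KB in one GB


def monthly_bandwidth_gb(skus_per_day: float, avg_response_kb: float, days: int = 30) -> float:
    """Estimate monthly bandwidth in GB from daily volume and response size."""
    return skus_per_day * avg_response_kb * days / KB_PER_GB


# Illustrative workload: 100,000 SKUs per day at 250 KB per response.
estimate = monthly_bandwidth_gb(100_000, 250)
print(f"{estimate:.1f} GB per month")  # about 715.3 GB
```

Retries are not included here, so the average response size should be multiplied by the expected number of attempts per SKU if the first-attempt success rate is well below 100%.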
