You’re paying for every noisy API call now. That stings during Prime Day or the holiday crush, when traffic jumps 10x and your integration decides to poll like it’s 2012.
Here’s the fix: design your stack to call less and deliver more. Amazon’s guidance is clear — use event-driven patterns, batch the heavy stuff, harden retries, and stop wasting requests on stale data. That combo trims your Selling Partner API (SP-API) bill and makes your ops feel faster.
Looking for tooling that helps eliminate noisy polls and consolidate requests? Explore Requery.
If you’re a large-scale seller or integrator, this isn’t optional. Under the new SP-API cost framework, unnecessary calls hit margins and increase latency risks right when you need reliability. The move is obvious: fewer, smarter calls.
Below is the playbook. No fluff. You’ll get the best practices Amazon expects, how to wire them, and where teams usually overspend. You’ll go from reactive polling to a durable, event-driven engine that scales cleanly during peaks and sleeps quietly off-peak.
Polling is the tax you pay for not listening. With SP-API Notifications routed through Amazon EventBridge or SQS, you flip the model: Amazon tells you when something changes, and you react. That means fewer wasted calls and fresher data.
Amazon CTO Werner Vogels said it best: “Everything fails, all the time.” Events localize failures. If orders spike, you scale your consumers, not your pollers. If a marketplace is quiet, your system stays quiet.
Events also shrink your “freshness window.” If you poll every 5 minutes, your worst-case delay is five minutes plus queueing and processing. With events, you’re usually working within seconds. That’s the difference between hitting a service-level goal and chasing it.
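If it helps to see the shape of it, here’s a minimal consumer sketch in Python, assuming you’ve already created an SP-API notification destination and subscription pointing at an SQS queue (the queue URL and the handler below are placeholders):

```python
import json
import boto3

# Placeholder: the SQS queue your SP-API notification destination points at.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/sp-api-notifications"

sqs = boto3.client("sqs")

def handle_notification(notification: dict) -> None:
    """Placeholder: route the parsed notification to your own order/feed/report logic."""
    print(notification.get("notificationType"), notification.get("eventTime"))

def consume_forever() -> None:
    while True:
        # Long polling: one request waits up to 20s and returns up to 10 messages,
        # so an idle queue costs almost nothing.
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for msg in resp.get("Messages", []):
            handle_notification(json.loads(msg["Body"]))
            # Delete only after successful processing; failures become redeliveries.
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

if __name__ == "__main__":
    consume_forever()
```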
A simple migration path works:
Start with the changes that drive money or operations:
The pattern: subscribe only to events you act on. Don’t chase vanity signals.
Noise-control checklist:
Operational hardening that pays off:
Example: You’re polling the Orders API every 5 minutes across 7 marketplaces. That’s 288 calls/day/marketplace — 2,016 calls/day — just to check 'anything new?' Switch to order events and your polling shrinks to near zero while your order processing gets faster.
The missing safety net: run a scheduled reconciliation (e.g., hourly or daily) using a recent Order report to find any records you may have missed due to downstream issues. When you detect drift, queue corrective actions and update your last-known checkpoints.
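Here’s a rough sketch of that reconciliation pass; the report-reading, datastore, and queueing helpers are all hypothetical stand-ins for your own plumbing:

```python
from datetime import datetime, timedelta, timezone

def reconcile_orders(lookback_hours: int = 24) -> None:
    since = datetime.now(timezone.utc) - timedelta(hours=lookback_hours)

    # Hypothetical helpers: one reads order IDs out of the latest order report,
    # the other reads what your own system believes it has processed.
    reported = set(order_ids_from_latest_report(since))
    processed = set(order_ids_from_internal_store(since))

    missed = reported - processed    # events you never saw (or dropped downstream)
    unknown = processed - reported   # usually fine; the report may simply be older

    for order_id in missed:
        enqueue_corrective_fetch(order_id)  # hypothetical: queue a targeted fetch and repair

    advance_checkpoint(since)               # hypothetical: record the last reconciled window
```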
If you’re fetching inventory, pricing, or order summaries object-by-object, you’re burning requests. The Reports API exists to give you a point-in-time snapshot in one go. Generate the report, pick up the file, and hydrate your internal store.
What to move to Reports first:
Best practice: Schedule recurring reports at the lowest frequency that still meets your business needs (often every 15–60 minutes for analytics) and stop making per-item reads inside that window. Use the report as your cache-of-record until the next run.
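As a sketch of the report loop, assuming a thin `sp_post`/`sp_get` wrapper that handles auth and signing for you (the wrapper is hypothetical; the paths and fields follow the Reports API 2021-06-30 shape, so check them against your report types):

```python
import gzip
import time
import requests

def run_report(report_type: str, marketplace_ids: list[str]) -> str:
    """Create a report, wait for it, and return its decoded contents."""
    created = sp_post("/reports/2021-06-30/reports", {
        "reportType": report_type,
        "marketplaceIds": marketplace_ids,
    })
    report_id = created["reportId"]

    # Poll the report status at a gentle interval; reports are async by design.
    while True:
        report = sp_get(f"/reports/2021-06-30/reports/{report_id}")
        status = report["processingStatus"]
        if status == "DONE":
            break
        if status in ("CANCELLED", "FATAL"):
            raise RuntimeError(f"report {report_id} ended as {status}")
        time.sleep(30)

    # The document record holds a pre-signed URL (and sometimes GZIP compression).
    doc = sp_get(f"/reports/2021-06-30/documents/{report['reportDocumentId']}")
    raw = requests.get(doc["url"], timeout=60).content
    if doc.get("compressionAlgorithm") == "GZIP":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")
```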
Picking cadences that stick:
Implementation tips that save rework:
Updating hundreds or thousands of SKUs? Feeds are your friend. Build deltas and send a batch update rather than firing thousands of one-off calls. You get fewer requests and more predictable throughput.
Practical flow:
Example: Instead of 10,000 individual price updates (10,000 calls), you push two feed files with 5,000 updates each, which costs a handful of requests to create, upload, and submit the feed documents. Call volume drops by orders of magnitude, and your failure surface shrinks.
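A sketch of that delta-and-batch step, assuming you keep the last synced price per SKU somewhere cheap to read (the feed submission helper at the end is a placeholder for your own create-upload-submit flow):

```python
BATCH_SIZE = 5000  # tune to your feed processing times and error rates

def build_price_deltas(current: dict[str, float], last_synced: dict[str, float]) -> dict[str, float]:
    """Only SKUs whose price actually changed belong in the next feed."""
    return {
        sku: price
        for sku, price in current.items()
        if last_synced.get(sku) != price
    }

def chunk(items: list, size: int):
    for i in range(0, len(items), size):
        yield items[i:i + size]

def submit_price_feeds(current: dict[str, float], last_synced: dict[str, float]) -> None:
    deltas = build_price_deltas(current, last_synced)
    for batch in chunk(sorted(deltas.items()), BATCH_SIZE):
        # Placeholder: build the feed document for this batch and submit it
        # (create document, upload, create feed), then record the feed ID.
        submit_pricing_feed(dict(batch))
```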
Feed patterns that work:
When to go real-time instead of feeds:
Even then, cap the per-minute updates and fold everything else back into your next feed cycle.
When SP-API returns 429 (throttled) or 5xx (transient), you should retry — slowly and randomly. Exponential backoff with jitter is the standard: double the delay each time and add randomness to avoid retry storms. Respect any Retry-After header if provided.
Simple pattern, sketched below:
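Here it is as a minimal Python sketch; `call_api` stands in for whatever client you use, and `ApiError` is a hypothetical exception that exposes the HTTP status and any Retry-After header:

```python
import random
import time

class ApiError(Exception):
    """Hypothetical error your HTTP client raises, carrying status and Retry-After."""
    def __init__(self, status, retry_after=None):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.retry_after = retry_after

MAX_ATTEMPTS = 6
BASE_DELAY = 1.0    # seconds
MAX_DELAY = 60.0    # cap so a long outage doesn't become hour-long sleeps

def call_with_backoff(call_api, *args, **kwargs):
    for attempt in range(MAX_ATTEMPTS):
        try:
            return call_api(*args, **kwargs)
        except ApiError as err:
            retryable = err.status == 429 or 500 <= err.status < 600
            if not retryable or attempt == MAX_ATTEMPTS - 1:
                raise
            if err.retry_after:                    # server hint wins when present
                delay = float(err.retry_after)
            else:                                  # exponential backoff with full jitter
                delay = random.uniform(0, min(MAX_DELAY, BASE_DELAY * (2 ** attempt)))
            time.sleep(delay)
```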
AWS guidance specifically recommends jitter to avoid synchronized retries that amplify outages. Treat that as table stakes.
A practical recipe:
Tie retries to context:
Your integration must tolerate duplicates and partial failures. Generate idempotency keys for write operations (e.g., deterministic batch IDs per feed, or stable correlation IDs in your queue). Store processed IDs to drop duplicates. Make your internal state machines re-entrant.
Idempotency toolbox:
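One concrete piece of that toolbox, sketched: derive a deterministic key per write and drop anything you’ve already applied (the in-memory `seen` set stands in for a durable table):

```python
import hashlib
import json

def idempotency_key(operation: str, payload: dict) -> str:
    """Same operation plus same payload always yields the same key."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{operation}:{body}".encode()).hexdigest()

seen: set[str] = set()  # in real life: a durable store with a TTL, not process memory

def apply_once(operation: str, payload: dict, apply) -> bool:
    """Returns True if the write ran, False if it was a duplicate."""
    key = idempotency_key(operation, payload)
    if key in seen:
        return False
    apply(payload)      # your actual write (feed submission, order ack, ...)
    seen.add(key)       # record only after success so retries can still land
    return True
```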
Rate limits are per-operation. Expose them to your scheduler. Keep a per-operation token bucket and budget retries against it. When you hit the wall, degrade gracefully: queue work, run reports, or delay non-critical jobs.
Example: You hit a 429 on getCatalogItem. Old you would retry immediately and fail harder. New you backs off with jitter, queues the request behind a token bucket, and uses report data to answer non-urgent queries in the meantime.
Token bucket quick-start:
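A minimal sketch of a per-operation bucket; the rates below are placeholders, since real limits vary by operation and account:

```python
import threading
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        """Block until a token is available, then consume it."""
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            time.sleep(wait)

# One bucket per operation, so a noisy job can't starve the others.
buckets = {
    "getOrders": TokenBucket(rate_per_sec=0.0167, burst=20),    # placeholder numbers
    "getCatalogItem": TokenBucket(rate_per_sec=2.0, burst=2),   # placeholder numbers
}

buckets["getOrders"].acquire()  # call before every getOrders request, retries included
```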
Observability signals to watch:
Yes, caching still pays off: high-fanout reads (catalog attributes, ASIN metadata, brand info) turn hundreds of calls into one. Pick TTLs that match business needs: 15–60 minutes for slow-moving catalog data; seconds to minutes for pricing if you also have events.
Tactics that work:
Design a two-level cache:
Avoid cache stampedes:
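A sketch of a small in-process TTL cache with single-flight loading, so one cold key triggers one upstream call instead of a stampede (for multi-process fleets, put Redis or similar behind the same idea):

```python
import threading
import time

class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.values: dict[str, tuple[float, object]] = {}
        self.locks: dict[str, threading.Lock] = {}
        self.guard = threading.Lock()

    def get_or_load(self, key: str, loader):
        entry = self.values.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]                       # fresh hit, no upstream call

        with self.guard:                          # one lock object per key
            lock = self.locks.setdefault(key, threading.Lock())

        with lock:                                # single flight: only one loader runs
            entry = self.values.get(key)
            if entry and entry[0] > time.monotonic():
                return entry[1]                   # someone else loaded it while we waited
            value = loader()
            self.values[key] = (time.monotonic() + self.ttl, value)
            return value

# Example: catalog attributes change slowly, so a 30-minute TTL is plenty.
catalog_cache = TTLCache(ttl_seconds=1800)
# item = catalog_cache.get_or_load(asin, lambda: fetch_catalog_item(asin))  # fetch_catalog_item is yours
```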
Most SP-API reads support filters (createdSince, updatedAfter, statuses) and pagination. Use them. Fetch only what changed since your last checkpoint. Never request 'everything' when you only need 'what’s new'.
Paging safety:
Checkpoint design:
Example: Instead of listing all orders for a day (which could be thousands), query createdSince=last_run and only page through new/updated orders. Then cache the result for downstream services so they don’t trigger their own SP-API reads.
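A sketch of that checkpointed loop; `sp_get` is the same hypothetical authorized helper as before, the checkpoint and upsert helpers are yours, and the parameter names follow the Orders API getOrders shape:

```python
from datetime import datetime, timezone

def sync_new_orders(marketplace_id: str) -> None:
    checkpoint = load_checkpoint(marketplace_id)   # hypothetical: last successful run time
    run_started = datetime.now(timezone.utc)

    params = {
        "MarketplaceIds": marketplace_id,
        "CreatedAfter": checkpoint.isoformat(),
    }
    while True:
        page = sp_get("/orders/v0/orders", params=params)
        for order in page["payload"].get("Orders", []):
            upsert_order(order)                    # hypothetical: write to your own store
        next_token = page["payload"].get("NextToken")
        if not next_token:
            break
        params = {"MarketplaceIds": marketplace_id, "NextToken": next_token}

    # Advance the checkpoint only after the whole window succeeded,
    # so a crash mid-run re-fetches instead of silently skipping orders.
    save_checkpoint(marketplace_id, run_started)
```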
Peak behavior isn’t a surprise — the dates are on your calendar. Two weeks out, schedule more frequent reports for high-change datasets and increase queue depth. One week out, raise concurrency for event consumers and pre-scale your worker autoscaling targets.
Make it concrete:
When the world gets hot, your system should choose what to drop. If you hit sustained throttling, pause non-critical syncs (e.g., catalog enrichments) and reserve your budget for revenue-critical flows (orders, fulfillment). Surface graceful fallbacks in dashboards and alert on backlogs, not just error counts.
Playbook during trouble:
Model worst-case traffic. Example: if your median day does 25k orders and Prime Day does 8x, you’ll see ~200k order events. Can your consumers handle that with at-least-once delivery, duplicates included? Can your queues buffer a few hours at peak? Does your rate budget cover retries plus scheduled syncs? Answer those now, not during the spike.
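A back-of-the-envelope version of that check, with every number a placeholder to swap for your own:

```python
MEDIAN_DAILY_ORDERS = 25_000      # placeholder: your normal day
PEAK_MULTIPLIER = 8               # placeholder: what Prime Day does to you
PEAK_HOUR_SHARE = 0.12            # placeholder: fraction of the day landing in the busiest hour

peak_orders = MEDIAN_DAILY_ORDERS * PEAK_MULTIPLIER                 # ~200,000 events/day
peak_hour_events = peak_orders * PEAK_HOUR_SHARE                    # ~24,000 in the worst hour
events_per_second = peak_hour_events / 3600                         # ~6.7/s sustained

CONSUMER_THROUGHPUT = 2.0         # placeholder: events one worker handles per second
workers_needed = -(-peak_hour_events // (CONSUMER_THROUGHPUT * 3600))  # ceiling division

BUFFER_HOURS = 3                  # how long the queue must absorb a downstream outage
queue_depth_needed = int(peak_hour_events * BUFFER_HOURS)

print(f"{events_per_second:.1f} events/s peak, {int(workers_needed)} workers, "
      f"buffer for {queue_depth_needed} messages")
```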
A quick capacity checklist:
And because it will come up: as SP-API usage moves to more explicit cost models (and yes, you’ve seen posts like 'an update on SP-API fees' and the chatter on 'amazon sp-api 2026 fees: how to optimize your …'), the playbook above is the difference between scalable margins and surprise bills.
Switch from polling to event-driven processing. Subscribe to order, feed, and report events and process from queues. Then move heavy reads to scheduled Reports and heavy writes to Feeds. That two-step change typically slashes call volume while improving freshness.
Respect 429s with exponential backoff and jitter, cap retry attempts, and budget retries using a token-bucket per operation. Deprioritize non-critical jobs when you approach rate ceilings and lean on your cached/report data to answer non-urgent queries.
Yes. Reports give you bulk, point-in-time truth; caching serves that truth quickly to internal services without extra SP-API calls. Invalidate caches via events and scheduled report checkpoints so you stay fresh without spamming the API.
For large batches, yes. Feeds reduce the number of requests and provide predictable processing. For tiny updates that must be real-time, single calls can make sense — but measure the impact on rate limits and costs.
Two weeks out, increase report frequency, queue depth, and autoscaling targets. One week out, run failover tests and add circuit breakers. During the event, pause non-critical syncs to protect your rate budget for orders and fulfillment.
You’ll see discussions like 'diving into amazon sp-api hot topics! 🔥👋 #4270' in dev communities. For authoritative guidance, start with Amazon’s SP-API docs for Notifications, Reports, Feeds, rate limits, and error handling (linked below).
Partition everything by marketplace. That means queues per marketplace (or partition keys), per-marketplace checkpoints, and per-marketplace rate budgets. If one region gets noisy, it shouldn’t starve others.
Cache LWA access tokens with their expiry and refresh them early. Avoid requesting a new token on every call. Centralize token management so internal services don’t each hit the auth endpoints independently.
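A minimal sketch of that token cache; the LWA refresh-token grant itself is standard, and the credentials here would come from your secret store:

```python
import threading
import time
import requests

LWA_TOKEN_URL = "https://api.amazon.com/auth/o2/token"
EARLY_REFRESH_SECONDS = 120   # refresh before expiry so in-flight calls never race the clock

class TokenCache:
    def __init__(self, client_id: str, client_secret: str, refresh_token: str):
        self.creds = (client_id, client_secret, refresh_token)
        self.token = None
        self.expires_at = 0.0
        self.lock = threading.Lock()

    def get(self) -> str:
        with self.lock:
            if self.token and time.time() < self.expires_at - EARLY_REFRESH_SECONDS:
                return self.token
            client_id, client_secret, refresh_token = self.creds
            resp = requests.post(LWA_TOKEN_URL, data={
                "grant_type": "refresh_token",
                "refresh_token": refresh_token,
                "client_id": client_id,
                "client_secret": client_secret,
            }, timeout=30)
            resp.raise_for_status()
            body = resp.json()
            self.token = body["access_token"]
            self.expires_at = time.time() + body["expires_in"]
            return self.token
```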
Choose a size that finishes within your target end-to-end latency while keeping failure blast radius small. Many teams start with 2k–5k records per feed, then tune based on processing times and error rates. Keep the size stable to simplify monitoring.
Expose three metrics: requests per operation, throttles per operation, and report/feed counts per day. Tag each workload by business function so you can spot runaway jobs quickly. Set alerts on unusual growth day-over-day or week-over-week.
What good looks like at the end:
You don’t win peak week by calling more APIs. You win by calling the right ones, at the right time, in the right shape.
In 120 seconds, here’s the mindset shift: your integration isn’t a hose sucking on SP-API — it’s a smart valve. Events trigger action. Reports set the baseline. Feeds apply change in bulk. Retries are calm and patient. Caches absorb read traffic. And your peak plan makes sure the revenue paths stay wide open when the flood hits.
Do this and two things happen: your call volume drops and your experience gets faster. That’s the rare optimization that saves money and makes customers happier.
If you want to see how teams put this playbook into practice, explore our Case Studies.
200,000 tiny polls or 200 useful events? In 2026, your margin knows the difference.