Usage-Based Billing: The Technical Implementation Nobody Warns You About

Zyptr Admin
20 May 2024
9 min read

Why Everyone Wants Usage-Based Pricing

Usage-based pricing is having a moment. Snowflake, Twilio, AWS — the most successful infrastructure companies charge based on usage. So every SaaS startup we work with wants it too. And the business model makes sense: customers pay for what they use, reducing the barrier to adoption. But the technical implementation is significantly harder than seat-based pricing, and most teams underestimate this.

We've built usage-based billing systems for three products: an API platform (billed per API call), an AI document processing tool (billed per document page), and a monitoring service (billed per check and per alert). Here's what we learned.

Metering Is the Hard Part

The billing calculation is easy: quantity times price equals charge. The hard part is accurately counting the quantity. For the API platform, we need to count every API call, categorize it by endpoint tier (different endpoints have different costs), decide how to handle failed requests (do you charge for 500 errors?), and aggregate the result into billable units. At 50 million API calls per month, the metering pipeline averages roughly 20 events per second (50,000,000 calls / 2,592,000 seconds in a 30-day month ≈ 19.3), and every one of them needs to be counted exactly once.

Our metering architecture: API requests emit events to a Kafka topic (we used Amazon MSK). A consumer service reads events, aggregates them into minute-level buckets in Redis (using HINCRBY for atomic increments), and a scheduled job flushes minute buckets to PostgreSQL hourly. The hourly aggregates are the "source of truth" for billing. We chose this multi-level aggregation (real-time in Redis, hourly in PostgreSQL) because querying raw events at billing time would be too slow, and we need the real-time counts in Redis for customer-facing usage dashboards.
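
As a minimal sketch of the two-level aggregation described above (an illustration under stated assumptions, not the production pipeline): a plain in-memory dictionary stands in for the Redis hashes so the example runs on its own, and the names `MinuteCounters`, `minute_bucket`, and `flush_hour` are hypothetical. With Redis, the increment in `record` would be a single atomic HINCRBY, and the rows returned by `flush_hour` would be written to PostgreSQL.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minute_bucket(ts: float) -> str:
    """Return the UTC minute label for a Unix timestamp, e.g. '2024-05-20T13:05'."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M")


class MinuteCounters:
    """In-memory stand-in for the minute-level Redis counters (hypothetical)."""

    def __init__(self) -> None:
        # (customer_id, endpoint_tier, minute) -> number of calls
        self.buckets = defaultdict(int)

    def record(self, customer_id: str, tier: str, ts: float) -> None:
        # With Redis this would be one atomic HINCRBY on the minute's hash.
        self.buckets[(customer_id, tier, minute_bucket(ts))] += 1

    def flush_hour(self, hour: str) -> dict:
        """Roll up and remove every minute bucket in `hour` (e.g. '2024-05-20T13').

        Returns {(customer_id, tier, hour): count}, the rows an hourly job
        would persist as the billing source of truth.
        """
        rollup = defaultdict(int)
        for key in list(self.buckets):
            customer_id, tier, minute = key
            if minute.startswith(hour):
                rollup[(customer_id, tier, hour)] += self.buckets.pop(key)
        return dict(rollup)
```

Keeping the minute label inside the key is what lets the same counters serve both purposes: the dashboard reads recent minutes directly, while the flush job only touches buckets for the hour that has closed.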

The Exactly-Once Problem

In distributed systems, ensuring each event is counted exactly once is notoriously hard. Network retries can cause duplicate events. Consumer restarts can cause events to be reprocessed. We handle this with idempotency keys on events (each API call has a unique request ID) and deduplication in the consumer (we check against a Redis set of recently processed event IDs before incrementing counters). The deduplication window is 5 minutes — long enough to catch retries, short enough to keep the Redis set manageable.
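
The deduplication step can be sketched roughly as follows. This is an in-memory illustration, not the Redis-backed consumer itself: the class name `DedupWindow`, the `first_time` method, and the event field names are assumptions made for the example. In Redis, one common way to get the same effect is a per-request key written with SET ... NX EX 300, which only succeeds the first time the ID is seen within the window.

```python
import time


class DedupWindow:
    """Remembers event IDs for ttl_seconds (in-memory stand-in, hypothetical)."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.seen = {}  # event_id -> first-seen time; dicts keep insertion order

    def first_time(self, event_id: str) -> bool:
        """Return True (and remember the ID) if it was not seen within the window."""
        now = self.clock()
        # Evict expired IDs; insertion order equals first-seen order.
        while self.seen:
            oldest_id, seen_at = next(iter(self.seen.items()))
            if now - seen_at < self.ttl:
                break
            del self.seen[oldest_id]
        if event_id in self.seen:
            return False
        self.seen[event_id] = now
        return True


def count_events(events, dedup: DedupWindow) -> dict:
    """Count calls per customer, skipping request IDs already seen in the window."""
    counts = {}
    for event in events:
        if dedup.first_time(event["request_id"]):
            customer = event["customer_id"]
            counts[customer] = counts.get(customer, 0) + 1
    return counts
```

Note the limitation this makes visible: a duplicate that arrives after its ID has expired from the window is counted again, which is why a window alone cannot guarantee exactly-once counting.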

Despite these precautions, we've had two incidents where metering was inaccurate. One was a Kafka consumer rebalance that caused 30 seconds of events to be processed twice (the deduplication window handled most but not all duplicates). The other was a Redis failover that lost about 2 minutes of in-memory counts. Both required manual reconciliation. The lesson: build a reconciliation tool from day one that can reprocess raw events and compare against aggregated counts.
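
A reconciliation pass of the kind recommended above could look something like this sketch. It assumes the raw events are retained and that each carries a `request_id`, a `customer_id`, and an `hour` label; those field names and the functions `recount` and `reconcile` are hypothetical.

```python
from collections import Counter


def recount(raw_events) -> Counter:
    """Recount usage from raw events, counting each request ID exactly once."""
    seen = set()
    counts = Counter()
    for event in raw_events:
        if event["request_id"] in seen:
            continue  # duplicate delivery, e.g. after a consumer rebalance
        seen.add(event["request_id"])
        counts[(event["customer_id"], event["hour"])] += 1
    return counts


def reconcile(raw_events, aggregates: dict) -> dict:
    """Return {key: (aggregated, recounted)} for every key where the two differ."""
    truth = recount(raw_events)
    diffs = {}
    for key in truth.keys() | aggregates.keys():
        aggregated = aggregates.get(key, 0)
        recounted = truth.get(key, 0)
        if aggregated != recounted:
            diffs[key] = (aggregated, recounted)
    return diffs
```

An over-count (double processing) shows up as an aggregate larger than the recount, and an under-count (lost in-memory counters) as an aggregate that is smaller or missing entirely.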

Billing Periods and Invoicing

Usage-based billing introduces complexity that flat-rate subscriptions don't have. When do you calculate the bill? At the end of the billing period? But then the customer doesn't know their bill until it's due. In real-time? But then you need a running total that's always accurate. We show customers a real-time usage estimate (from the Redis counters) with a disclaimer that the final bill may differ slightly. The actual invoice is calculated from the PostgreSQL aggregates at the end of the billing period.

We also implement spending limits and alerts. Customers can set a monthly spending cap, and when their usage reaches 80% of the cap, they get an email. At 100%, the behavior depends on the product: the API platform returns HTTP 429 (Too Many Requests), while the monitoring service keeps running at a reduced check frequency. The spending limit check runs against the real-time Redis counters, not the PostgreSQL aggregates, so that enforcement is timely.
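
The threshold logic can be sketched as a small pure function. Everything here is illustrative: the names `CapAction` and `check_cap` are hypothetical, amounts are assumed to be integer cents, and the `alert_sent` flag (used to avoid emailing the customer on every request once past 80%) is an assumption rather than something described above.

```python
from enum import Enum
from typing import Optional


class CapAction(Enum):
    ALLOW = "allow"      # under the alert threshold, or alert already sent
    ALERT = "alert"      # reached 80% of the cap: send the warning email
    ENFORCE = "enforce"  # reached 100%: apply the product-specific behavior


def check_cap(spend_cents: int, cap_cents: Optional[int], alert_sent: bool) -> CapAction:
    """Decide what to do for the current month-to-date spend.

    spend_cents would come from the real-time counters; cap_cents is None
    when the customer has not set a cap.
    """
    if cap_cents is None:
        return CapAction.ALLOW
    if spend_cents >= cap_cents:
        return CapAction.ENFORCE
    # Integer comparison avoids floating-point error at the 80% boundary.
    if spend_cents * 100 >= cap_cents * 80 and not alert_sent:
        return CapAction.ALERT
    return CapAction.ALLOW
```

The caller then maps ENFORCE to whatever the product does at the cap, such as rejecting the request with a 429 or lowering the check frequency.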

Stripe Integration for Usage Billing

Stripe supports usage-based billing natively via their Metered Billing feature. You report usage to Stripe via the Usage Records API, and Stripe calculates the bill at the end of the billing period. This works well but has a nuance: usage records are immutable. If you report 1,000 API calls on day one and later discover it should have been 950 (duplicate event), you can't correct it. You have to report a negative adjustment on a subsequent day. We learned this after over-billing a customer by $45 and spending an hour figuring out how to fix it.

For products with complex pricing tiers (first 10K calls free, next 90K at $0.001, next 900K at $0.0005), we calculate the bill ourselves and use Stripe's Invoice Items API to create a single line item. This gives us full control over the billing logic and avoids Stripe's tier calculation limitations.
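
To make the tier arithmetic concrete, here is a minimal sketch of a graduated calculation using the example tiers above. The function name `graduated_charge` is hypothetical, and raising an error for usage beyond the last tier is an assumption, since the example does not say what happens past 1,000,000 calls. `Decimal` is used so that sub-cent unit prices do not pick up floating-point error.

```python
from decimal import Decimal

# (units in tier, price per unit in dollars), taken from the example above.
TIERS = [
    (10_000, Decimal("0")),        # first 10K calls free
    (90_000, Decimal("0.001")),    # next 90K calls
    (900_000, Decimal("0.0005")),  # next 900K calls
]


def graduated_charge(units: int, tiers=TIERS) -> Decimal:
    """Charge each unit at the price of the tier it falls into."""
    remaining = units
    total = Decimal("0")
    for size, price in tiers:
        if remaining <= 0:
            break
        in_tier = min(remaining, size)
        total += in_tier * price
        remaining -= in_tier
    if remaining > 0:
        # Assumption: usage past the last tier is not priced in the example.
        raise ValueError("usage exceeds the last defined tier")
    return total
```

For 250,000 calls this gives 10,000 free, 90,000 × $0.001 = $90, and 150,000 × $0.0005 = $75, for a total of $165. Stripe expresses amounts in the smallest currency unit, so the result would be converted to integer cents before creating the invoice item.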

usage-billing, metering, stripe, saas-pricing