Pay-Per-Use Metering in Python

Building a SaaS product without usage metering is like running a restaurant where everyone pays the same flat fee regardless of what they order. Your heaviest users subsidize your lightest ones, your pricing feels arbitrary, and you have no visibility into how your product actually gets consumed.

The World Without Metering

Before implementing proper metering, most Python applications track usage through crude approximations. Maybe you count API requests in a database column that gets updated on every call. Perhaps you run nightly batch jobs that tally up storage consumption. Or worse, you simply trust that your pricing tiers roughly match actual usage patterns.

This approach creates specific problems. A database write on every API call adds latency and creates contention under load. Batch processing means your usage data is always stale: a user could blow through their quota hours before you notice. And without granular event data, you cannot offer the flexible pricing models that enterprise customers increasingly expect.

The billing conversation with customers becomes adversarial. They dispute charges because your aggregated numbers do not match their internal tracking. You lack the raw event data to reconcile differences.

Event-Driven Metering Architecture

The shift to proper metering centres on treating every billable action as an immutable event. Instead of incrementing counters, you emit structured events that flow through a pipeline designed for high-throughput ingestion.

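from dataclasses import dataclass
from datetime import datetime

@dataclass
class UsageEvent:
    customer_id: str       # who gets billed
    event_type: str        # which billable action occurred, e.g. "api_call"
    quantity: float        # how many units the action consumed
    timestamp: datetime    # when the action happened (ideally timezone-aware UTC)
    idempotency_key: str   # stable across retries, used for deduplication
    properties: dict       # free-form context kept for later reconciliation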

    def to_dict(self):
        return {
            "customer_id": self.customer_id,
            "event_type": self.event_type,
            "quantity": self.quantity,
            "timestamp": self.timestamp.isoformat(),
            "idempotency_key": self.idempotency_key,
            "properties": self.properties
        }

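The idempotency key is critical. Network failures happen, retries occur, and without deduplication you will double-bill customers. Generate keys deterministically from the operation being metered, so that a retry of the same operation produces the same key while a distinct operation produces a different one. In the helper below, that means resource_id must identify a single billable occurrence (a request ID, for example) rather than a long-lived resource, or legitimate repeat usage would be discarded as a duplicate.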

import hashlib

def generate_idempotency_key(customer_id: str, operation: str, resource_id: str) -> str:
    content = f"{customer_id}:{operation}:{resource_id}"
    return hashlib.sha256(content.encode()).hexdigest()[:32]
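
Keys like these only help if whatever receives the events checks them. The following is a minimal sketch of that check, assuming events arrive as dictionaries in the shape produced by to_dict() and that the set of seen keys fits in memory. The function name deduplicate and the sample events are hypothetical; a real backend would typically keep the keys in a durable store.

```python
from typing import Iterable

def deduplicate(events: Iterable[dict], seen_keys: set) -> list:
    """Return only events whose idempotency_key is new, recording each new key."""
    fresh = []
    for event in events:
        key = event["idempotency_key"]
        if key in seen_keys:
            continue  # a retry of an event that was already accepted
        seen_keys.add(key)
        fresh.append(event)
    return fresh

# Hypothetical sample data: the second batch resends "abc" after a timeout.
seen = set()
first_batch = [{"idempotency_key": "abc", "quantity": 1.0}]
retried_batch = [
    {"idempotency_key": "abc", "quantity": 1.0},
    {"idempotency_key": "def", "quantity": 2.0},
]
accepted_first = deduplicate(first_batch, seen)
accepted_retry = deduplicate(retried_batch, seen)  # only "def" is accepted
```

With this check in place, a client can resend a batch it is unsure about without risking a double charge.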

Buffered Emission with Background Flushing

Emitting events synchronously on every API call reintroduces the latency problem. A buffered approach collects events in memory and flushes them periodically or when the buffer reaches capacity.

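import threading
from collections import deque

class MeteringClient:
    def __init__(self, flush_interval: float = 5.0, max_buffer_size: int = 100):
        self._buffer = deque()
        self._lock = threading.Lock()
        self._flush_interval = flush_interval
        self._max_buffer_size = max_buffer_size
        self._stop = threading.Event()
        self._start_background_flush()

    def _start_background_flush(self):
        # Daemon thread that flushes on a timer and never blocks interpreter exit
        def loop():
            # wait() returns False on timeout and True once close() sets the event
            while not self._stop.wait(self._flush_interval):
                self._flush()
        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def track(self, event: UsageEvent):
        with self._lock:
            self._buffer.append(event)
            full = len(self._buffer) >= self._max_buffer_size
        if full:
            self._flush()

    def _flush(self):
        # Drain under the lock, but send outside it so that other threads
        # calling track() are not blocked on network I/O
        with self._lock:
            events = list(self._buffer)
            self._buffer.clear()
        if events:
            self._send_batch(events)

    def close(self):
        # Stop the timer thread, then flush whatever is still buffered
        self._stop.set()
        self._thread.join()
        self._flush()

    def _send_batch(self, events: list):
        # Send to your metering backend (Stripe, Orb, or custom)
        payload = [e.to_dict() for e in events]
        # Implementation depends on your billing provider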
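
Because every event carries an idempotency key, a failed send can simply be retried. Here is a rough sketch of a retry wrapper with exponential backoff, assuming the underlying send function raises ConnectionError on transient failures. The name send_with_retry, its parameters, and the choice of exception are illustrative assumptions, not part of any billing provider's SDK.

```python
import time
from typing import Callable

def send_with_retry(
    send: Callable[[list], None],
    payload: list,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> bool:
    """Call send(payload), retrying on ConnectionError with exponential backoff.

    Returns True if a send succeeded and False once all attempts are used up.
    """
    for attempt in range(max_attempts):
        try:
            send(payload)
            return True
        except ConnectionError:
            if attempt == max_attempts - 1:
                break  # out of attempts, so do not sleep again
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    return False
```

When this returns False, the caller still holds the payload and has to decide what to do with it, for example putting the events back in the buffer or writing them to disk, so that usage is not silently lost.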

Services like Orb and Stripe Billing provide managed metering backends that handle aggregation, deduplication, and invoice generation. For simpler needs, Redis Streams offer a self-hosted option with built-in consumer groups.
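
To make "aggregation" concrete, the sketch below sums already-deduplicated events per customer and event type, assuming they are dictionaries in the shape produced by to_dict(). The function name aggregate_usage is hypothetical, and a managed backend does this at scale with time windows and persistence that are omitted here.

```python
from collections import defaultdict

def aggregate_usage(events) -> dict:
    """Sum quantity per (customer_id, event_type) pair."""
    totals = defaultdict(float)
    for event in events:
        totals[(event["customer_id"], event["event_type"])] += event["quantity"]
    return dict(totals)
```

The resulting totals are what a pricing model is applied to. The raw events stay available for reconciliation.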

The After State

With event-driven metering in place, billing works from a complete record of what happened. Customer disputes are quicker to resolve because you have the raw event log. Usage dashboards can show near-real-time consumption rather than yesterday’s approximations. New pricing models (per-seat, per-API-call, tiered volume discounts) become largely configuration changes rather than engineering projects.
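
As an illustration of pricing as configuration, here is a minimal sketch of a graduated (tiered) charge computed from an aggregated usage total. The tier sizes and unit prices are invented for the example, and the name tiered_charge is hypothetical. Decimal is used so that money amounts avoid binary floating-point rounding.

```python
from decimal import Decimal

# Hypothetical tier table: (units in this tier, price per unit).
# A size of None means the tier is unbounded.
TIERS = [
    (1000, Decimal("0.010")),
    (9000, Decimal("0.008")),
    (None, Decimal("0.005")),
]

def tiered_charge(quantity: int, tiers=TIERS) -> Decimal:
    """Charge each unit at the price of the tier it falls into."""
    remaining = quantity
    total = Decimal("0")
    for size, unit_price in tiers:
        if remaining <= 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * unit_price
        remaining -= used
    return total
```

Under these assumed tiers, 1,500 units cost 1,000 × 0.010 + 500 × 0.008 = 14. Changing the pricing means editing the tier table and leaving the metering code untouched.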

Your API endpoints shed the latency of synchronous database writes. Tracking an event is normally just an in-memory append, so the overhead on the request path is small compared with a network or database round trip. And when a customer asks exactly how their bill was calculated, you can trace it back to individual events with timestamps and context.

The infrastructure investment is modest. The payoff is a billing system that customers trust and a pricing model flexible enough to evolve with your product.