
Rate Limits

The v1 API enforces two layers of rate limiting:

  1. A tenant-wide cost-weighted budget, sliding over a 1-hour window.
  2. A per-API-key per-minute limit, defaulting to 60 requests/min (configurable per key).

A request is admitted only if both layers have headroom. When either is exceeded, you get 429 rate_limited.

Requests are not all equal. Cheap reads cost 1 unit; expensive operations cost more so a single PDF render or bulk job can’t drain the entire budget on its own.

Operation                                      Cost (units)
GET (list, get, introspect)                    1
Standard mutations (POST, PATCH, PUT, DELETE)  5
Endpoints ending in /exports                   20
Endpoints ending in /pdf                       50
Endpoints ending in /bulk                      100
Endpoints ending in /imports                   200
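A client can estimate a request's cost before sending it. The sketch below is one way to encode the table, not the server's implementation. It assumes that a suffix rule takes precedence over the HTTP method (the table does not state precedence) and that any non-GET method counts as a standard mutation. The `/api/v1/widgets` path is hypothetical.

```python
# Suffix rules from the cost table. Assumption: a suffix rule wins over the
# HTTP method, since the table does not say which applies first.
SUFFIX_COSTS = {"/exports": 20, "/pdf": 50, "/bulk": 100, "/imports": 200}


def estimate_cost(method: str, path: str) -> int:
    """Estimate the unit cost of a request from its method and path."""
    # Ignore any query string and trailing slash before matching suffixes.
    clean_path = path.split("?", 1)[0].rstrip("/")
    for suffix, cost in SUFFIX_COSTS.items():
        if clean_path.endswith(suffix):
            return cost
    # GET reads cost 1; everything else is treated as a standard mutation.
    return 1 if method.upper() == "GET" else 5


print(estimate_cost("GET", "/api/v1/usage"))          # 1
print(estimate_cost("POST", "/api/v1/widgets"))       # 5 (hypothetical path)
print(estimate_cost("POST", "/api/v1/widgets/bulk"))  # 100 (hypothetical path)
```

Treat the result as an estimate only; the per-operation cost published in the OpenAPI spec is the authoritative value.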

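The tenant budget is 10,000 units per sliding hour. With pure reads, that sustains an average of about 2.8 requests/sec across the tenant (10,000 units / 3,600 s); at 200 units each, 50 calls to /imports endpoints exhaust it. A single key at the default 60 requests/min tops out at 1 request/sec, so reaching the tenant-wide read rate takes several keys or a raised per-key limit.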
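To make the budget arithmetic concrete, this short calculation shows how many calls of each kind fit into one hour if the tenant sends nothing else. The numbers come from the cost table and the 10,000-unit budget; the function and constant names are illustrative.

```python
TENANT_BUDGET = 10_000   # units per sliding hour
WINDOW_SECONDS = 3_600

COSTS = {
    "GET": 1,
    "standard mutation": 5,
    "/exports": 20,
    "/pdf": 50,
    "/bulk": 100,
    "/imports": 200,
}


def max_calls_per_hour(cost: int) -> int:
    """Calls of a single kind that fit into one full tenant budget."""
    return TENANT_BUDGET // cost


def sustained_rate(cost: int) -> float:
    """Average requests/sec that keeps consumption at or under the budget."""
    return TENANT_BUDGET / WINDOW_SECONDS / cost


for name, cost in COSTS.items():
    print(f"{name:<18} {max_calls_per_hour(cost):>6} calls/hour "
          f"{sustained_rate(cost):8.3f} req/sec")
```

Real traffic is a mix, so the usable figure is the weighted sum of costs across all keys in the tenant.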

Every operation in the OpenAPI spec carries an x-rate-limit-cost extension that pins the exact cost — code generators and explorers can surface it next to each route.
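As an illustration of how a tool might surface the extension, the sketch below walks an already-parsed OpenAPI document and collects `x-rate-limit-cost` per operation. It assumes the extension sits directly on each operation object and holds an integer; the sample spec fragment and its `/api/v1/widgets` paths are hypothetical.

```python
from typing import Dict, Tuple

# Method keys that can appear under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def costs_from_spec(spec: dict) -> Dict[Tuple[str, str], int]:
    """Map (METHOD, path) to the cost declared in x-rate-limit-cost."""
    costs = {}
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items also hold non-operation keys such as "parameters".
            if method not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            cost = operation.get("x-rate-limit-cost")
            if cost is not None:
                costs[(method.upper(), path)] = int(cost)
    return costs


# Hypothetical spec fragment, for illustration only.
sample_spec = {
    "paths": {
        "/api/v1/widgets": {
            "parameters": [],
            "get": {"x-rate-limit-cost": 1},
            "post": {"x-rate-limit-cost": 5},
        },
        "/api/v1/widgets/bulk": {"post": {"x-rate-limit-cost": 100}},
    }
}

for (method, path), cost in sorted(costs_from_spec(sample_spec).items()):
    print(f"{method:<6} {path:<24} {cost}")
```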

Every API key gets a per-minute sliding limit — 60/min by default. This protects against a runaway script monopolizing the tenant budget. Limits can be raised per key for high-volume integrations — contact support if you need a bump.
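A client can keep itself under the per-key limit by tracking its own send times. The following is a minimal client-side sketch of a sliding-window log; the server's exact algorithm is not described here, so treat it as an approximation. The class name and the injectable `clock` parameter are illustrative.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Client-side guard that allows at most `limit` sends per `window` seconds."""

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._sent = deque()  # timestamps of recent sends, oldest first

    def wait_time(self) -> float:
        """Seconds to wait before the next send fits in the window (0 if now)."""
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        # The window is full until the oldest send ages out.
        return self.window - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())
```

Typical use is to call `time.sleep(throttle.wait_time())` before each request and `throttle.record()` right after sending it. If the key's limit has been raised, pass the new value as `limit`.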

Both limiters publish their state on every successful response. They use distinct header namespaces so you can tell the two signals apart without ambiguity:

Header                      Meaning
RateLimit-Tenant-Limit      Tenant-wide hourly budget (units).
RateLimit-Tenant-Remaining  Units left in the current hour.
RateLimit-Tenant-Reset      Seconds until the tenant window resets.
RateLimit-Key-Limit         Per-API-key per-minute request budget.
RateLimit-Key-Remaining     Requests left in the current minute on this key.
RateLimit-Key-Reset         Seconds until the per-key window resets.
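Reading these headers into a small structure makes them easier to act on. This sketch assumes the headers arrive as a plain string mapping and that all values are integers; the `Bucket` type and `parse_bucket` function are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Bucket:
    limit: int
    remaining: int
    reset_seconds: int


def parse_bucket(headers: Mapping[str, str], scope: str) -> Optional[Bucket]:
    """Parse the RateLimit-<Scope>-* headers; scope is "tenant" or "key".

    Returns None when any of the three headers is missing or not an integer.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    prefix = f"ratelimit-{scope.lower()}-"
    try:
        return Bucket(
            limit=int(lowered[prefix + "limit"]),
            remaining=int(lowered[prefix + "remaining"]),
            reset_seconds=int(lowered[prefix + "reset"]),
        )
    except (KeyError, ValueError):
        return None


example_headers = {
    "RateLimit-Tenant-Limit": "10000",
    "RateLimit-Tenant-Remaining": "5720",
    "RateLimit-Tenant-Reset": "1432",
}
print(parse_bucket(example_headers, "tenant"))
print(parse_bucket(example_headers, "key"))  # None: no key headers present
```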

On 429, we also send a Retry-After header (seconds) — honor it. The Retry-After value is the reset time of whichever bucket triggered the throttle, so it tells you exactly how long to sleep before the next attempt has any chance of succeeding.

{
  "type": "https://docs.buildworkpro.com/docs/api/concepts/errors#rate_limited",
  "title": "Rate limited",
  "status": 429,
  "detail": "Tenant hourly budget exceeded",
  "code": "rate_limited",
  "request_id": "req_a4d8c0e2"
}

The headers on a 429 (tenant budget exhausted):

RateLimit-Tenant-Limit: 10000
RateLimit-Tenant-Remaining: 0
RateLimit-Tenant-Reset: 1742
Retry-After: 1742
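One way to handle this response in client code is to turn it into a typed error that carries the wait time and the request ID. The sketch below works on a raw status, header mapping, and body string so it does not depend on a particular HTTP library. `RateLimitedError` and `rate_limited_from_response` are illustrative names, and the 30-second fallback for a missing or unparsable Retry-After is an assumption borrowed from the retry guidance on this page.

```python
import json
from typing import Mapping, Optional


class RateLimitedError(Exception):
    """Raised by client code when the API answers 429 rate_limited."""

    def __init__(self, detail: str, request_id: str, retry_after: float):
        super().__init__(f"{detail} (retry after {retry_after:g}s, request {request_id})")
        self.detail = detail
        self.request_id = request_id
        self.retry_after = retry_after


def rate_limited_from_response(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[RateLimitedError]:
    """Build a RateLimitedError from a 429 response, or return None otherwise."""
    if status != 429:
        return None
    try:
        problem = json.loads(body)
    except ValueError:
        problem = {}
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        retry_after = float(lowered["retry-after"])
    except (KeyError, ValueError):
        retry_after = 30.0  # assumed fallback when the header is unusable
    return RateLimitedError(
        detail=problem.get("detail", "Rate limited"),
        request_id=problem.get("request_id", ""),
        retry_after=retry_after,
    )


body = '{"status": 429, "detail": "Tenant hourly budget exceeded", ' \
       '"code": "rate_limited", "request_id": "req_a4d8c0e2"}'
error = rate_limited_from_response(429, {"Retry-After": "1742"}, body)
print(error)
```

Logging `request_id` alongside the error makes it easier to reference the exact request in a support ticket.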

To surface live consumption in a developer dashboard — or to back off proactively before hitting 429 — call GET /api/v1/usage. It returns the current tenant + per-key counters straight from the rate-limit store:

{
  "data": {
    "tenant": {
      "limit": 10000,
      "used": 4280,
      "remaining": 5720,
      "resetSeconds": 1432,
      "windowSeconds": 3600
    },
    "apiKey": {
      "limit": 60,
      "used": 12,
      "remaining": 48,
      "resetSeconds": 23,
      "windowSeconds": 60
    }
  }
}

apiKey is null when the caller is OAuth-authenticated (which doesn’t have a per-key sub-limit).
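A dashboard or proactive throttle mostly needs the fraction of each window that is still available. The sketch below computes that from an already-decoded usage payload and skips `apiKey` when it is null. Fetching the payload is left out because the base URL and authentication scheme are not covered on this page; `headroom` is an illustrative name.

```python
import json
from typing import Dict


def headroom(usage: dict) -> Dict[str, float]:
    """Return the fraction of each limiter's window that is still available."""
    result = {}
    for name in ("tenant", "apiKey"):
        counters = usage["data"].get(name)
        if counters is None:
            # apiKey is null for OAuth-authenticated callers.
            continue
        result[name] = counters["remaining"] / counters["limit"]
    return result


payload = json.loads("""
{
  "data": {
    "tenant": {"limit": 10000, "used": 4280, "remaining": 5720,
               "resetSeconds": 1432, "windowSeconds": 3600},
    "apiKey": null
  }
}
""")
print(headroom(payload))  # only the tenant entry, since apiKey is null
```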

Recommended client behavior:

  • Honor Retry-After. Sleep for at least that many seconds before retrying. If the header is absent, default to 30 seconds.
  • Use exponential backoff with full jitter for transient failures. A simple recipe: delay = random(0, min(cap, base * 2^attempt)), with base = 1s and cap = 60s.
  • Watch RateLimit-Tenant-Remaining and RateLimit-Key-Remaining during normal operation. If either is trending toward zero, slow down before you get throttled.
  • Cache aggressively for repeated reads. The cheapest call is one you don’t make.
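The first two recommendations can be combined into a single delay calculation. This is a minimal sketch under stated assumptions: `retry_delay` and `full_jitter_delay` are illustrative names, the caller decides which non-429 failures count as transient, and the exponent is capped at 30 only to keep the arithmetic bounded for very large attempt counts.

```python
import random
from typing import Mapping


def full_jitter_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """delay = random(0, min(cap, base * 2^attempt)), with attempt starting at 0."""
    ceiling = min(cap, base * 2 ** min(attempt, 30))
    return random.uniform(0.0, ceiling)


def retry_delay(status: int, headers: Mapping[str, str], attempt: int) -> float:
    """Seconds to sleep before retrying a failed request."""
    if status == 429:
        lowered = {name.lower(): value for name, value in headers.items()}
        try:
            # Retry-After is the reset time of the bucket that throttled us.
            return max(0.0, float(lowered["retry-after"]))
        except (KeyError, ValueError):
            return 30.0  # documented default when the header is absent
    # Any other failure the caller has chosen to retry: backoff with jitter.
    return full_jitter_delay(attempt)


print(retry_delay(429, {"Retry-After": "1742"}, attempt=0))  # 1742.0
print(retry_delay(429, {}, attempt=0))                       # 30.0
print(retry_delay(503, {}, attempt=3))                       # between 0 and 8
```

Full jitter spreads retries from many clients across the whole interval, which avoids synchronized retry bursts when a shared budget resets.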