Overview

Rate limits control how many API requests you can make per hour. They prevent abuse, ensure fair usage, and maintain platform stability for all users.

Rate Limit Tiers

Tier         Requests/Hour   Burst Limit   Videos/Month
Free         100             10            10
Starter      1,000           50            100
Pro          5,000           100           500
Enterprise   50,000          500           3,000+
The burst limit allows short bursts of requests that exceed the average rate, which is useful for batch operations.
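
To make the tier numbers concrete, the sketch below turns each hourly limit into an average spacing between requests. The type, constant, and function names are hypothetical, and because the exact enforcement of the burst limit is not described here, the sketch only records that value.

```typescript
// Tier limits copied from the table above; all names here are illustrative.
type Tier = 'Free' | 'Starter' | 'Pro' | 'Enterprise';

interface TierLimits {
  requestsPerHour: number;
  burstLimit: number;
}

const TIER_LIMITS: Record<Tier, TierLimits> = {
  Free: { requestsPerHour: 100, burstLimit: 10 },
  Starter: { requestsPerHour: 1_000, burstLimit: 50 },
  Pro: { requestsPerHour: 5_000, burstLimit: 100 },
  Enterprise: { requestsPerHour: 50_000, burstLimit: 500 },
};

// Average spacing between requests that stays within the hourly limit.
function averageIntervalMs(tier: Tier): number {
  return (60 * 60 * 1000) / TIER_LIMITS[tier].requestsPerHour;
}
```

On the Free tier this works out to one request every 36 seconds on average; on Pro, one every 720 ms.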

How Rate Limiting Works

Sliding Window

Bluma uses a sliding window algorithm:
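Time:      10:00   10:15   10:30   10:45   11:00
Requests:  [250] → [180] → [220] → [200] → [150]
Limit:     1000/hour

Each count covers the 15 minutes ending at the time shown.

At 11:00:
  Window = 10:00 to 11:00 (the 250 requests counted at 10:00 have aged out)
  Requests in last hour = 180 + 220 + 200 + 150 = 750
  Remaining = 1000 - 750 = 250 ✓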
This is more accurate than fixed windows and allows for smoother usage patterns.
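
The sliding-window idea can be sketched as a small log of request timestamps. This is an illustrative client-side model only, not Bluma's server implementation; the class name `SlidingWindowCounter` and its methods are hypothetical, and it assumes timestamps arrive in non-decreasing order.

```typescript
// Illustrative sliding-window log: stores one timestamp per request and
// counts only those inside the trailing window.
class SlidingWindowCounter {
  private timestamps: number[] = [];
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number = 60 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Drop timestamps that have aged out of the window ending at `now`.
  private prune(now: number): void {
    const cutoff = now - this.windowMs;
    while (this.timestamps.length > 0 && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
  }

  // Requests still available at time `now` (milliseconds).
  remaining(now: number): number {
    this.prune(now);
    return Math.max(0, this.limit - this.timestamps.length);
  }

  // Record a request if the window has room; returns false when limited.
  tryAcquire(now: number): boolean {
    if (this.remaining(now) === 0) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

Unlike a fixed window, capacity comes back one request at a time as each old timestamp leaves the window, rather than all at once at a boundary.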

Rate Limit Headers

Every API response includes rate limit information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 987
X-RateLimit-Reset: 1699127600
Header                  Description
X-RateLimit-Limit       Maximum requests allowed per hour
X-RateLimit-Remaining   Requests remaining in current window
X-RateLimit-Reset       Unix timestamp when limit resets

Reading Headers

const response = await fetch(url, options);

// headers.get() returns null when a header is absent, so fall back to '0'
const rateLimit = {
  limit: parseInt(response.headers.get('X-RateLimit-Limit') ?? '0', 10),
  remaining: parseInt(response.headers.get('X-RateLimit-Remaining') ?? '0', 10),
  reset: parseInt(response.headers.get('X-RateLimit-Reset') ?? '0', 10) // Unix timestamp
};

console.log(`${rateLimit.remaining}/${rateLimit.limit} requests remaining`);

// Check if approaching limit
if (rateLimit.remaining < 10) {
  console.warn('Approaching rate limit!');
  // Slow down or pause requests
}
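
The example reset value (1699127600) is a Unix timestamp in seconds, while JavaScript timers and `Date` work in milliseconds. A minimal sketch of the conversion, assuming the headers look like the example above; `parseIntHeader` and `msUntilReset` are hypothetical helper names.

```typescript
// Reads a header as a base-10 integer; returns null if it is missing or malformed.
function parseIntHeader(headers: Headers, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null) return null;
  const value = Number.parseInt(raw, 10);
  return Number.isNaN(value) ? null : value;
}

// Milliseconds from `nowMs` until the reset time, never negative.
function msUntilReset(resetUnixSeconds: number, nowMs: number = Date.now()): number {
  return Math.max(0, resetUnixSeconds * 1000 - nowMs);
}
```

Returning null for a missing header, instead of silently treating it as 0, lets the caller tell "no quota left" apart from "no rate limit information on this response".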

Rate Limit Exceeded (429)

When you exceed your rate limit, you’ll receive a 429 Too Many Requests response:
{
  "error": {
    "type": "rate_limit_exceeded",
    "title": "Rate Limit Exceeded",
    "status": 429,
    "detail": "You have exceeded the rate limit of 1,000 requests per hour.",
    "metadata": {
      "limit": 1000,
      "retry_after": 3600,
      "current_usage": 1000
    },
    "links": {
      "docs": "https://docs.getbluma.com/concepts/rate-limits",
      "upgrade": "https://getbluma.com/billing"
    }
  }
}
Additional Header:
Retry-After: 3600
The Retry-After header indicates how many seconds to wait before retrying.
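
A client can derive its wait time from either the header or the response body. The sketch below assumes the body has the shape shown above and that Retry-After carries a number of seconds, as in the example (HTTP also permits a date in this header, which this sketch does not handle). The names `RateLimitErrorBody` and `retryDelayMs` are hypothetical.

```typescript
// Shape of the 429 body shown above (only the fields used here).
interface RateLimitErrorBody {
  error: {
    type: string;
    status: number;
    metadata?: { limit?: number; retry_after?: number; current_usage?: number };
  };
}

// Picks a wait time in milliseconds: Retry-After header first,
// then metadata.retry_after from the body, then a fallback.
function retryDelayMs(
  retryAfterHeader: string | null,
  body: RateLimitErrorBody | null,
  fallbackSeconds: number = 60,
): number {
  const fromHeader = retryAfterHeader === null ? NaN : Number.parseInt(retryAfterHeader, 10);
  if (Number.isFinite(fromHeader) && fromHeader >= 0) return fromHeader * 1000;

  const fromBody = body?.error.metadata?.retry_after;
  if (typeof fromBody === 'number' && fromBody >= 0) return fromBody * 1000;

  return fallbackSeconds * 1000;
}
```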

Handling Rate Limits

1. Exponential Backoff

async function apiCallWithBackoff(url: string, options: RequestInit, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Retry-After is in seconds; fall back to 60 if the header is missing
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
      // Wait at least as long as the server asks, growing the delay on repeated failures
      const backoffTime = Math.max(retryAfter * 1000, Math.pow(2, attempt) * 1000);

      console.log(`Rate limited. Retrying in ${backoffTime}ms...`);
      await new Promise(resolve => setTimeout(resolve, backoffTime));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}
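
When many workers are rate limited at the same moment, identical backoff delays make them all retry in lockstep. A common refinement is to randomize the delay ("full jitter"). The sketch below is one way to compute such a delay; the function names, the 1-second base, and the 60-second cap are assumptions, not values from the Bluma API.

```typescript
// Exponential backoff with full jitter: a random delay in
// [0, min(capMs, baseMs * 2^attempt)).
function backoffWithJitterMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 60_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Combines jitter with a server-provided minimum wait, so the client
// never retries earlier than Retry-After allows.
function nextDelayMs(
  attempt: number,
  retryAfterMs: number,
  random: () => number = Math.random,
): number {
  return Math.max(retryAfterMs, backoffWithJitterMs(attempt, 1000, 60_000, random));
}
```

Passing `random` as a parameter keeps the function deterministic in tests.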

2. Request Queue

class RateLimitedQueue {
  private queue: Array<() => Promise<any>> = [];
  private processing = false;
  private requestsThisHour = 0;
  private hourStart = Date.now();
  private limit = 1000;

  async enqueue<T>(fn: () => Promise<T>): Promise<T> {
    return new Promise((resolve, reject) => {
      this.queue.push(async () => {
        try {
          const result = await fn();
          resolve(result);
        } catch (error) {
          reject(error);
        }
      });
      this.processQueue();
    });
  }

  private async processQueue() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;

    while (this.queue.length > 0) {
      // Reset counter if hour has passed
      if (Date.now() - this.hourStart > 3600000) {
        this.requestsThisHour = 0;
        this.hourStart = Date.now();
      }

      // Check if under limit
      if (this.requestsThisHour >= this.limit) {
        const waitTime = 3600000 - (Date.now() - this.hourStart);
        console.log(`Rate limit reached. Waiting ${waitTime}ms...`);
        await new Promise(resolve => setTimeout(resolve, waitTime));
        continue;
      }

      // Process next request
      const task = this.queue.shift();
      if (task) {
        await task();
        this.requestsThisHour++;
      }

      // Small delay between requests
      await new Promise(resolve => setTimeout(resolve, 100));
    }

    this.processing = false;
  }
}

// Usage
const queue = new RateLimitedQueue();

for (let i = 0; i < 100; i++) {
  queue.enqueue(() => fetch(url, options));
}

3. Monitoring Usage

// `metrics` stands in for your monitoring client; replace it with your own
async function monitorRateLimit(response: Response) {
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '0', 10);
  const limit = parseInt(response.headers.get('X-RateLimit-Limit') || '0', 10);

  // Without a valid limit the percentage would divide by zero, so skip
  if (limit <= 0) return;

  const percentUsed = ((limit - remaining) / limit) * 100;

  if (percentUsed >= 90) {
    console.error('⚠️ CRITICAL: 90%+ of rate limit used!');
    // Alert, slow down, or pause
  } else if (percentUsed >= 75) {
    console.warn('⚠️ WARNING: 75%+ of rate limit used');
  }

  // Log to monitoring service
  metrics.gauge('api.rate_limit.remaining', remaining);
  metrics.gauge('api.rate_limit.percent_used', percentUsed);
}
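
Beyond alerting, the same header values can drive the request rate directly. The sketch below spreads the remaining quota evenly over the time left until the reset; `paceDelayMs` is a hypothetical helper, and it assumes X-RateLimit-Reset is a Unix timestamp in seconds.

```typescript
// Delay before the next request so the remaining quota lasts until the reset.
function paceDelayMs(
  remaining: number,
  resetUnixSeconds: number,
  nowMs: number = Date.now(),
): number {
  const msLeft = Math.max(0, resetUnixSeconds * 1000 - nowMs);
  // Nothing left: wait for the reset instead of sending a doomed request.
  if (remaining <= 0) return msLeft;
  return Math.floor(msLeft / remaining);
}
```

For example, with 10 requests remaining and 100 seconds until the reset, the helper suggests waiting 10 seconds between requests.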

Per-Key vs Account-Wide

Rate limits are applied per API key, not per account. This allows you to:
  • Create separate keys for different applications
  • Isolate production from development traffic
  • Scale horizontally with multiple keys

Example: Multiple Keys

// Production key (high traffic)
const prodKey = 'bluma_live_prod_key';

// Background jobs key (batch operations)
const batchKey = 'bluma_live_batch_key';

// Development key (testing)
const devKey = 'bluma_test_dev_key';
Each key has its own independent rate limit.
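
One way to keep the keys apart in code is to select them by workload from configuration instead of hard-coding them. The sketch below is an assumption about how you might organize this; the `Workload` type, the environment variable names, and `apiKeyFor` are all hypothetical.

```typescript
type Workload = 'production' | 'batch' | 'development';

// Hypothetical environment variable names; use whatever your deployment defines.
const KEY_ENV_VARS: Record<Workload, string> = {
  production: 'BLUMA_PROD_KEY',
  batch: 'BLUMA_BATCH_KEY',
  development: 'BLUMA_DEV_KEY',
};

// Looks up the key for a workload and fails loudly if it is not configured,
// so a batch job can never silently fall back to the production key.
function apiKeyFor(workload: Workload, env: Record<string, string | undefined>): string {
  const name = KEY_ENV_VARS[workload];
  const key = env[name];
  if (!key) throw new Error(`Missing API key: set ${name}`);
  return key;
}
```

In a Node.js service this would typically be called as `apiKeyFor('batch', process.env)`.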

Upgrading Limits

Increase Your Tier

Higher tiers get higher rate limits:
Free → Starter:    100 → 1,000 req/hr (10x)
Starter → Pro:     1,000 → 5,000 req/hr (5x)
Pro → Enterprise:  5,000 → 50,000 req/hr (10x)
Upgrade at getbluma.com/billing

Custom Limits

Enterprise customers can request custom rate limits based on their specific needs. Contact sales@getbluma.com.

Best Practices

Check Headers

Monitor rate limit headers and adjust request rate dynamically

Implement Backoff

Use exponential backoff when receiving 429 responses

Cache Responses

Cache frequently accessed data (templates list, etc.) to reduce API calls
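
A minimal in-memory cache with a time-to-live illustrates the idea. This is a sketch, not part of any Bluma SDK; `TtlCache` is a hypothetical name, and a suitable TTL depends on how often the cached data changes.

```typescript
// Small in-memory cache whose entries expire after `ttlMs` milliseconds.
class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or undefined if it is absent or expired.
  get(key: string, nowMs: number = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= nowMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, nowMs: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
  }
}
```

Checking the cache before each call means that repeated reads of the same data within the TTL cost one API request instead of many.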

Batch Operations

Combine multiple operations when possible to reduce request count

Exceptions

Rate limits do not apply to:
  ✅ Webhook deliveries (server-initiated)
  ✅ OAuth token refresh (authentication)
  ✅ Health check endpoints

Rate limits do apply to:
  ❌ All /v1/* API endpoints
  ❌ OpenAPI spec endpoint (/v1/openapi.json)

Testing Rate Limits

Simulate Rate Limiting

Test your backoff logic using test keys with artificially low limits:
curl -X POST https://api.getbluma.com/api/v1/api-keys \
  -H "Authorization: Bearer YOUR_SESSION_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Rate Limit Test Key",
    "environment": "test",
    "rate_limit_per_hour": 10
  }'
Then make more than 10 requests within an hour to trigger rate limiting.
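
A small harness can send the requests and tally the outcomes. The sketch below takes the request as a callback so it can be exercised without network access; `countStatuses` is a hypothetical helper name.

```typescript
// Calls `send` sequentially `total` times and tallies the HTTP status codes.
// `send` performs one API call and resolves to its status code.
async function countStatuses(
  send: () => Promise<number>,
  total: number,
): Promise<{ ok: number; limited: number; other: number }> {
  const tally = { ok: 0, limited: 0, other: 0 };
  for (let i = 0; i < total; i++) {
    const status = await send();
    if (status === 429) tally.limited++;
    else if (status >= 200 && status < 300) tally.ok++;
    else tally.other++;
  }
  return tally;
}
```

With a real key you might pass `() => fetch(url, options).then(r => r.status)`, where `url` and `options` are your own request details. With a limit of 10 and 12 calls, you would expect roughly 10 successes followed by 2 responses with status 429, assuming no other traffic on that key.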

Frequently Asked Questions

Can I increase my rate limit?
Yes! Upgrade your tier or contact sales@getbluma.com for custom limits (Enterprise only).

Do rate limits reset at the top of each hour?
No, rate limits use a sliding window. Capacity frees up continuously as earlier requests age out of the window.

What counts toward the rate limit?
Every HTTP request to /v1/* endpoints counts, regardless of success or failure.

Do test keys have rate limits?
Yes, test keys have the same rate limits as production keys of your tier. This helps you test rate limit handling logic.

Are WebSocket connections rate limited?
Bluma currently doesn’t support WebSockets. All communication is via HTTP REST API.

Troubleshooting

Issue: Constant 429 Errors

Causes:
  • Making too many requests too quickly
  • Several applications or workers sharing a single API key (limits are per key)
  • Batch operations without rate limiting
Solutions:
  • Implement request queueing
  • Add delays between requests
  • Upgrade to a higher tier
  • Use exponential backoff

Issue: Unexpected Rate Limit

Causes:
  • Previous requests in the sliding window
  • Shared API key across multiple services
  • Clock skew in reset time calculation
Solutions:
  • Check X-RateLimit-Remaining header
  • Use separate API keys per service
  • Monitor usage in dashboard

Monitoring

Track rate limit metrics in your usage dashboard:
  • Current usage vs limit
  • Historical rate limit hits
  • Per-key usage breakdown
  • Average requests per hour

Next Steps