
API Rate Limiting Guide: Throttling & Abuse Prevention

Without rate limiting, your API is an open target. Attackers can brute-force authentication endpoints, scrape your data, exhaust server resources, and run up your cloud bill — all with simple scripts making thousands of requests per second.

In our analysis of APIs scanned with AI QA Monkey, 62% had no rate limiting on authentication endpoints, and 45% returned no rate limit headers, making it impossible for legitimate clients to self-throttle.

Why Rate Limiting Matters

  • Prevents brute-force attacks — limits login attempts per IP/account
  • Mitigates DDoS — absorbs traffic spikes without crashing
  • Prevents data scraping — makes bulk extraction impractical
  • Ensures fair usage — no single client monopolizes resources
  • Controls costs — prevents runaway cloud bills from abuse
  • Compliance requirement — PCI DSS and OWASP API Security Top 10 require rate limiting

Rate Limiting Algorithms

Fixed Window

Count requests in fixed time intervals (e.g., 100 requests per minute). Simple, but allows a burst of up to twice the limit at window boundaries (requests packed at the end of one window plus the start of the next).

Sliding Window

Smooths the fixed window by considering the overlap between current and previous windows. More accurate but slightly more complex.
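
As a sketch, the sliding window counter fits in a few lines of JavaScript. This is an in-memory, single-process illustration (the `createSlidingWindowLimiter` helper is hypothetical, not from a library); production systems keep these counters in a shared store such as Redis:

```javascript
// Minimal in-memory sliding window counter (illustrative sketch).
// The previous window's count is weighted by how much of it still
// overlaps the sliding window, smoothing boundary bursts.
function createSlidingWindowLimiter(limit, windowMs) {
  let prevCount = 0;
  let currCount = 0;
  let windowStart = null;

  return function allow(now = Date.now()) {
    if (windowStart === null) windowStart = now;
    // Advance to the fixed window containing `now`
    while (now - windowStart >= windowMs) {
      prevCount = currCount;
      currCount = 0;
      windowStart += windowMs;
    }
    const overlap = 1 - (now - windowStart) / windowMs;
    const estimated = prevCount * overlap + currCount;
    if (estimated >= limit) return false; // over the estimated rate
    currCount++;
    return true;
  };
}
```

Halfway into a new window, the previous window's requests count for only half their weight, so the limiter neither forgets history abruptly (fixed window) nor tracks every timestamp (sliding log).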

Token Bucket

Tokens are added to a bucket at a fixed rate. Each request consumes a token. Allows controlled bursts while maintaining average rate.
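
A minimal token bucket can be sketched as follows (an illustrative in-memory version; the `createTokenBucket` helper and its parameters are hypothetical, not from a specific library):

```javascript
// Minimal in-memory token bucket (illustrative sketch).
// Tokens refill continuously at `ratePerSec`, capped at `capacity`;
// each request consumes one token, so bursts up to `capacity` pass
// while the long-run average rate stays bounded.
function createTokenBucket(capacity, ratePerSec) {
  let tokens = capacity;
  let lastRefill = null;

  return function tryConsume(now = Date.now()) {
    if (lastRefill === null) lastRefill = now;
    // Refill proportionally to elapsed time
    tokens = Math.min(capacity, tokens + ((now - lastRefill) / 1000) * ratePerSec);
    lastRefill = now;
    if (tokens >= 1) {
      tokens -= 1;
      return true;  // request allowed
    }
    return false;   // bucket empty → reject (or queue)
  };
}
```
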

Leaky Bucket

Requests enter a queue (bucket) and are processed at a fixed rate. Excess requests overflow and are rejected. Ensures constant output rate.

Express.js Implementation

Basic Rate Limiting

// npm install express-rate-limit
const rateLimit = require('express-rate-limit');

// General API rate limit
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 100,                   // 100 requests per window
  standardHeaders: true,      // Return RateLimit-* headers
  legacyHeaders: false,       // Disable X-RateLimit-* headers
  message: {
    error: 'Too many requests, please try again later.',
    retryAfter: 900
  }
});

app.use('/api/', apiLimiter);

Stricter Limits for Authentication

// Strict rate limit for login endpoint
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 5,                     // 5 attempts per window
  skipSuccessfulRequests: true, // Don't count successful logins
  message: {
    error: 'Too many login attempts. Please try again in 15 minutes.'
  }
});

app.post('/api/login', loginLimiter, loginHandler);
app.post('/api/register', loginLimiter, registerHandler);
app.post('/api/forgot-password', loginLimiter, forgotPasswordHandler);

Redis Store for Production (Distributed)

// npm install rate-limit-redis ioredis
const { RedisStore } = require('rate-limit-redis');  // v4+ uses a named export
const Redis = require('ioredis');

const redisClient = new Redis({
  host: process.env.REDIS_HOST,
  port: 6379,
  enableOfflineQueue: false
});

const distributedLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.call(...args),
  }),
});

// This works across multiple server instances
app.use('/api/', distributedLimiter);

Nginx Rate Limiting

# Define rate limit zones in http {} block
http {
    # 10 requests per second per IP
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    # Stricter limit for auth endpoints
    limit_req_zone $binary_remote_addr zone=auth:10m rate=1r/s;

    # Per-API-key rate limiting
    map $http_x_api_key $api_key {
        default $binary_remote_addr;
        "~.+" $http_x_api_key;
    }
    limit_req_zone $api_key zone=apikey:10m rate=30r/s;
}

server {
    # General API endpoints — allow burst of 20
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }

    # Auth endpoints — strict, no burst
    location /api/auth/ {
        limit_req zone=auth burst=3 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }
}

Standard Rate Limit Headers

# Response headers your API should return:

# Successful request (within limits)
HTTP/1.1 200 OK
RateLimit-Limit: 100          # Max requests in window
RateLimit-Remaining: 87       # Requests remaining
RateLimit-Reset: 1708534800   # Unix timestamp when window resets

# Rate limited request
HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1708534800
Retry-After: 900              # Seconds to wait
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please retry after 900 seconds.",
  "retryAfter": 900
}

GraphQL Rate Limiting

GraphQL is harder to rate limit because a single request can fetch vastly different amounts of data. Use query complexity analysis:

// npm install graphql-validation-complexity
const { createComplexityLimitRule } = require('graphql-validation-complexity');

const server = new ApolloServer({
  schema,
  validationRules: [
    createComplexityLimitRule(1000, {  // Max complexity score
      onCost: (cost) => {
        console.log('Query complexity:', cost);
      },
      formatErrorMessage: (cost) =>
        `Query too complex: ${cost}. Maximum allowed: 1000.`
    })
  ]
});

// Also limit query depth
// npm install graphql-depth-limit
const depthLimit = require('graphql-depth-limit');

const server = new ApolloServer({
  schema,
  validationRules: [
    depthLimit(10),  // Max nesting depth
    createComplexityLimitRule(1000)
  ]
});

Cloud Provider Rate Limiting

AWS API Gateway

# AWS API Gateway throttling settings
# Account-level: 10,000 requests/second (default)
# Per-method: configurable

# Set via AWS CLI
aws apigateway update-stage \
  --rest-api-id abc123 \
  --stage-name prod \
  --patch-operations \
    op=replace,path=/throttling/rateLimit,value=1000 \
    op=replace,path=/throttling/burstLimit,value=500

# Per-API-key usage plan
aws apigateway create-usage-plan \
  --name "Standard" \
  --throttle burstLimit=100,rateLimit=50 \
  --quota limit=10000,period=DAY

Cloudflare Rate Limiting

# Cloudflare Dashboard > Security > WAF > Rate limiting rules

# Example rule:
# If: URI Path contains "/api/"
# And: Request rate exceeds 100 requests per 10 seconds
# Then: Block for 60 seconds
# With response: 429 Too Many Requests

Advanced Strategies

  • Tiered rate limits — different limits for free vs. paid API keys
  • Endpoint-specific limits — stricter on auth, looser on read-only endpoints
  • User-based + IP-based — combine both to prevent account sharing abuse
  • Exponential backoff — increase wait time with repeated violations
  • CAPTCHA escalation — serve CAPTCHA after rate limit instead of hard block
  • Allowlisting — exempt trusted partners and monitoring services
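
The exponential backoff strategy above can be sketched as a small violation tracker (an illustrative in-memory version; `createViolationTracker` and the block durations are hypothetical choices, not from any particular library):

```javascript
// Illustrative exponential backoff for repeat offenders: each new
// violation doubles the block duration, up to a ceiling.
function createViolationTracker(baseBlockMs = 60_000, maxBlockMs = 3_600_000) {
  const offenders = new Map(); // key → { strikes, blockedUntil }

  return {
    // Call when a client exceeds the rate limit; returns the block length
    recordViolation(key, now = Date.now()) {
      const entry = offenders.get(key) ?? { strikes: 0, blockedUntil: 0 };
      entry.strikes += 1;
      const blockMs = Math.min(maxBlockMs, baseBlockMs * 2 ** (entry.strikes - 1));
      entry.blockedUntil = now + blockMs;
      offenders.set(key, entry);
      return blockMs;
    },
    // Call on every request, before the normal rate limiter
    isBlocked(key, now = Date.now()) {
      const entry = offenders.get(key);
      return !!entry && now < entry.blockedUntil;
    },
  };
}
```

In production the strike counts would live in Redis with a TTL so well-behaved clients are eventually forgiven.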

Laravel rate limiting: throttle middleware

Laravel provides built-in rate limiting via the throttle middleware and the RateLimiter facade. Laravel 8+ supports named rate limiters with flexible configuration.

// app/Providers/RouteServiceProvider.php (or bootstrap/app.php in Laravel 11+)
use Illuminate\Cache\RateLimiting\Limit;
use Illuminate\Support\Facades\RateLimiter;
use Illuminate\Http\Request;

RateLimiter::for('api', function (Request $request) {
    return Limit::perMinute(60)->by($request->user()?->id ?: $request->ip());
});

RateLimiter::for('login', function (Request $request) {
    return [
        Limit::perMinute(5)->by($request->ip()),
        Limit::perMinute(10)->by($request->input('email')),
    ];
});

// Apply in routes/api.php:
Route::middleware(['throttle:api'])->group(function () {
    Route::get('/user', [UserController::class, 'show']);
});

Route::post('/login', [AuthController::class, 'login'])
    ->middleware('throttle:login');

// For 429 responses, Laravel automatically returns:
// Retry-After header and X-RateLimit-* headers

Distributed rate limiting with Redis

Single-server rate limiting breaks in horizontally scaled environments. When multiple app servers handle requests, each server maintains its own counter — a client can bypass the limit by distributing requests across servers. Redis-backed rate limiting solves this by maintaining shared counters.

// Express.js with Redis store (ioredis + rate-limit-redis)
const rateLimit = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis');  // v4+ uses a named export
const Redis = require('ioredis');

const client = new Redis({ host: process.env.REDIS_HOST });

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true,
  store: new RedisStore({
    sendCommand: (...args) => client.call(...args),
    prefix: 'rl:',  // Namespace in Redis
  }),
  keyGenerator: (req) => req.user?.id || req.ip,  // Per-user when authenticated
});

// Laravel: the throttle middleware counts via the cache store, so limits
// are shared across servers once the cache uses Redis — set in .env:
// CACHE_DRIVER=redis   (CACHE_STORE=redis in Laravel 11+)
// REDIS_HOST=127.0.0.1

Rate limit response format: what to return on 429

A well-formed 429 response helps legitimate clients retry correctly and makes debugging significantly easier. It must include enough information for clients to self-manage their request rate.

// Correct 429 response structure
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1736000000
RateLimit-Policy: 100;w=900

{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded the request limit. Please wait before retrying.",
  "retry_after": 60,
  "limit": 100,
  "window_seconds": 900
}

// Key rules:
// - Use 429, not 403 (forbidden implies permanent denial, not temporary)
// - Include Retry-After as seconds (not HTTP date) for simplicity
// - Include RateLimit-* headers per IETF draft-ietf-httpapi-ratelimit-headers
// - Log the violation server-side with IP, user ID, endpoint, and timestamp

Detecting and blocking rate limit evasion

Sophisticated attackers evade IP-based rate limits by distributing requests across multiple IPs using residential proxies or botnets. Layered detection is required to handle these patterns.

  • Device fingerprinting: Combine IP with TLS fingerprint (JA3/JA4), User-Agent, and Accept-Language for a more durable client identity. Tools: ja3 via Nginx or Cloudflare Bot Management.
  • Behavioral anomaly detection: Sustained rates like 10 requests/second are far beyond normal human browsing. Set an absolute ceiling (e.g., 20 req/s regardless of auth state) that triggers an immediate temporary block, logged for investigation.
  • Account-level rate limits: In addition to IP limits, apply per-account limits. A compromised account used for credential stuffing generates unusual patterns even across different IPs.
  • CAPTCHA for borderline cases: Instead of hard-blocking at the rate limit, serve a CAPTCHA challenge for requests that hit 80% of the limit. Legitimate users pass; automated scripts fail.
  • WAF integration: Configure WAF rules (Cloudflare, AWS WAF, Nginx ModSecurity) to complement application-level rate limiting. WAF-level limiting scales better under high-volume attacks.

Scan Your API Security

Free scan — detect missing rate limiting, exposed endpoints, CORS issues, and API misconfigurations.

Scan API Security Now

Frequently Asked Questions

What is API rate limiting?

API rate limiting restricts the number of requests a client can make within a given time window. It prevents abuse, protects resources, and ensures fair usage.

How do I implement rate limiting in Express.js?

Use the express-rate-limit package. For production with multiple servers, add a Redis store for distributed rate limiting.

What rate limit headers should my API return?

Return RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and Retry-After (on 429 responses) following the IETF standard.

How do I rate limit a GraphQL API?

Use query complexity analysis to assign costs to fields and limit total complexity per request. Also limit query depth and breadth to prevent resource exhaustion.
