What is Rate Limiting?
Your login endpoint accepts a username and password, checks the database, and returns success or failure. It works. But what happens when someone sends 10,000 requests per minute with different passwords? Without rate limiting, your server answers every single one. The attacker just needs time.
Missing rate limiting isn't just about login pages. Any endpoint that does something expensive or sensitive is a target: password resets, OTP verification, user registration, search queries, AI inference calls. If there's no throttle, attackers will find it and abuse it.
CWE-307 (Improper Restriction of Excessive Authentication Attempts) covers the authentication case, but the broader problem, CWE-770 (Allocation of Resources Without Limits or Throttling), applies everywhere. The OWASP Top 10 lists this under A07: Identification and Authentication Failures. It's the kind of vulnerability that doesn't look like a bug in your code; it's the absence of a control that should be there.
How does Rate Limiting work?
The issue is straightforward: your endpoint processes every request it receives, no matter how many come from the same source. There's no counter, no cooldown, no backoff. An attacker with a script can try every common password against your login in minutes.
Here's a login endpoint with no rate limiting:
// app/api/auth/login/route.ts
import bcrypt from 'bcryptjs';               // or whichever hashing library you use
import { db } from '@/lib/db';               // your database client (path illustrative)
import { createSession } from '@/lib/auth';  // your session helper (path illustrative)

export async function POST(req: Request) {
  const { email, password } = await req.json();
  const user = await db.user.findUnique({ where: { email } });
  if (!user || !(await bcrypt.compare(password, user.hash))) {
    // An attacker can call this 10,000 times per minute
    // and try every password in their dictionary.
    return Response.json(
      { error: 'Invalid credentials' },
      { status: 401 }
    );
  }
  const token = await createSession(user);
  return Response.json({ token });
}

And here's the same endpoint with a sliding-window rate limit in front of the credential check (this example uses Upstash, but any shared store works):

// app/api/auth/login/route.ts
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import bcrypt from 'bcryptjs';               // or whichever hashing library you use
import { db } from '@/lib/db';               // your database client (path illustrative)
import { createSession } from '@/lib/auth';  // your session helper (path illustrative)

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(5, '60 s'), // 5 attempts per minute
});

export async function POST(req: Request) {
  // x-forwarded-for is client-controlled unless a trusted proxy sets it,
  // so prefer the verified client IP your platform provides.
  const ip = req.headers.get('x-forwarded-for') ?? '127.0.0.1';
  const { success } = await ratelimit.limit(ip);
  if (!success) {
    return Response.json(
      { error: 'Too many attempts. Try again later.' },
      { status: 429 }
    );
  }
  const { email, password } = await req.json();
  const user = await db.user.findUnique({ where: { email } });
  if (!user || !(await bcrypt.compare(password, user.hash))) {
    return Response.json(
      { error: 'Invalid credentials' },
      { status: 401 }
    );
  }
  const token = await createSession(user);
  return Response.json({ token });
}

Why do AI tools generate Rate Limiting vulnerabilities?
Rate limiting is infrastructure, not application logic. When you ask an AI to build a login endpoint, it builds the login — authentication, session creation, error handling. Throttling is a separate concern that the model almost never adds on its own.
- It's not part of the feature. "Build a login API" doesn't mention rate limiting. The model delivers exactly what you asked for — a working login. The absence of throttling isn't a bug in the generated code, it's a missing layer.
- Rate limiting needs external state. You need Redis, a database counter, or an in-memory store. AI models generate self-contained route handlers. Adding a Redis dependency changes the architecture.
- It never fails in testing. You test login once, maybe twice. It works. Rate limiting only matters at scale — something no one tests during a build session.
This is one of the most universal gaps in AI-generated apps. Every login endpoint, every OTP flow, every password reset — generated without throttling. The code is correct. The protection is just missing.
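The Redis-backed limiter shown earlier is the production answer, but the core mechanic fits in a few lines. Here's a minimal in-memory sliding-window limiter as a sketch only: it's single-process, loses all state on restart, and the function and variable names are illustrative:

```typescript
// key -> timestamps of recent requests from that key
const windows = new Map<string, number[]>();

function allowRequest(
  key: string,          // e.g. client IP or account id
  limit = 5,            // max requests per window
  windowMs = 60_000,    // window length in milliseconds
  now = Date.now(),
): boolean {
  const cutoff = now - windowMs;
  // Drop timestamps that have slid out of the window.
  const recent = (windows.get(key) ?? []).filter((t) => t > cutoff);
  if (recent.length >= limit) {
    windows.set(key, recent);
    return false; // over the limit; caller responds with 429
  }
  recent.push(now);
  windows.set(key, recent);
  return true;
}
```

This works for a single Node process; the moment you run two instances behind a load balancer, each instance counts separately, which is exactly why the real fix needs shared state like Redis.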
Common Rate Limiting patterns
Login brute force
No limit on /api/auth/login — attackers try thousands of passwords per account using credential-stuffing lists.
OTP/2FA bypass
A 6-digit OTP has 1 million combinations. Without rate limiting, an attacker can try all of them in under an hour.
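The arithmetic makes the risk concrete. A rough sketch (the request rate and OTP lifetime below are illustrative assumptions, not measurements):

```typescript
// Odds of guessing a 6-digit OTP within its lifetime, given a cap
// on the number of attempts the attacker gets.
function otpGuessProbability(attempts: number, digits = 6): number {
  return Math.min(1, attempts / 10 ** digits);
}

// Unthrottled at ~300 requests/second over a 5-minute OTP lifetime:
const unthrottled = otpGuessProbability(300 * 60 * 5); // 90,000 guesses -> 9%
// Capped at 5 attempts per code:
const throttled = otpGuessProbability(5); // 1 in 200,000
```

Note the key for OTP limits should be the account, not the IP: guessing is per-account, and attackers rotate IPs easily.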
User enumeration
Hitting /api/auth/forgot-password with every email in a list. Different response times or messages reveal which accounts exist.
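The fix pairs rate limiting with a constant response. A sketch in which `db` and `sendResetEmail` are hypothetical stubs standing in for your real database and mailer:

```typescript
// Hypothetical stubs for illustration only.
const db = {
  async findUserByEmail(email: string) {
    return email === 'known@example.com' ? { email } : null;
  },
};
async function sendResetEmail(user: { email: string }) {
  // send the real email here
}

async function forgotPassword(email: string): Promise<string> {
  const user = await db.findUserByEmail(email);
  if (user) await sendResetEmail(user); // real work only for real accounts
  // Identical message either way, so the response body can't be used
  // to probe which addresses are registered.
  return 'If that address has an account, a reset link is on its way.';
}
```

In practice you'd also queue the email send asynchronously so response timing stays uniform; otherwise the extra work done for real accounts can still leak through timing differences.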
Expensive operation abuse
AI inference endpoints, PDF generation, or search queries with no throttle. Attackers can run up your cloud bill or cause denial of service.
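For expensive endpoints, a token bucket lets you charge different operations different amounts. A single-process sketch (production versions keep the bucket in Redis; the names, capacity, and costs are illustrative):

```typescript
interface Bucket { tokens: number; last: number }
const buckets = new Map<string, Bucket>();

function takeTokens(
  key: string,
  cost: number,          // e.g. 1 for a search, 10 for an AI inference call
  capacity = 60,         // burst size
  refillPerSec = 1,      // sustained rate
  now = Date.now(),
): boolean {
  const b = buckets.get(key) ?? { tokens: capacity, last: now };
  // Refill in proportion to elapsed time, never past capacity.
  b.tokens = Math.min(capacity, b.tokens + ((now - b.last) / 1000) * refillPerSec);
  b.last = now;
  const ok = b.tokens >= cost;
  if (ok) b.tokens -= cost;
  buckets.set(key, b);
  return ok;
}
```

The cost parameter is the point: one shared budget per user covers cheap and expensive calls alike, so an attacker can't drain your inference bill at search-query rates.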
How Flowpatrol detects Rate Limiting
Flowpatrol checks for rate limiting the same way an attacker would — by sending a burst of requests and watching what happens:
1. Identifies auth endpoints by crawling your app and finding login, registration, password reset, and OTP verification routes.
2. Sends rapid-fire requests with invalid credentials: 20+ attempts in quick succession against the same endpoint.
3. Checks for 429 responses or other throttling signals such as Retry-After headers, increasing delays, or CAPTCHA challenges.
4. Reports unprotected endpoints with the request count that succeeded, the endpoint path, and recommended rate limits for that type of operation.
If your login endpoint happily processes 50 failed attempts in 10 seconds, Flowpatrol flags it. That's the difference between a working app and a production-ready one.
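You can run a rough version of this check yourself. A sketch of a burst tester: the URL and payload are placeholders, and `fetchFn` is passed in (use the global `fetch` for a real run) so the loop can be exercised without a live server:

```typescript
async function burstTest(
  url: string,
  attempts: number,
  fetchFn: (url: string, init: object) => Promise<{ status: number }>,
): Promise<number> {
  let throttled = 0;
  for (let i = 0; i < attempts; i++) {
    const res = await fetchFn(url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify({ email: 'probe@example.com', password: 'wrong' }),
    });
    if (res.status === 429) throttled++;
  }
  return throttled; // zero 429s after a rapid burst is a red flag
}
```

Only run this against endpoints you own, and expect some false comfort from it: a limiter keyed purely on IP will pass this test yet still fall to a distributed attack.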
Check your app for missing rate limits.
Flowpatrol tests your auth endpoints with real brute-force patterns. Paste your URL and find out in five minutes.
Try it free