Your API scales. That's the pitch. What you didn't notice is that it scales for the attacker too: one anonymous user with a bash loop can now run up a five-figure AWS bill on your account. The endpoint works exactly as designed. The design is what's missing.
Every API call consumes something: CPU, memory, database time, bandwidth, an LLM token bill, a carrier SMS fee. Unrestricted Resource Consumption is what happens when the API has no idea how much any request costs and no mechanism to stop a caller from making too many.
What your AI actually built
You asked for a /api/search endpoint that takes a query and a limit. The model delivered. No upper bound on the limit, no pagination ceiling, no cap on concurrent calls, no rate limit per caller.
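For illustration, here is a minimal sketch of the bound the model left out. The names `clamp_limit`, `MAX_LIMIT`, and `DEFAULT_LIMIT`, and the values 100 and 20, are assumptions chosen for this example; they are not taken from any real codebase, and the right ceiling depends on what a result row costs you.

```python
# Hypothetical ceilings; tune them to what one result row actually costs.
MAX_LIMIT = 100
DEFAULT_LIMIT = 20


def clamp_limit(raw, default=DEFAULT_LIMIT, ceiling=MAX_LIMIT):
    """Turn a caller-supplied limit into a safe page size.

    Missing, non-numeric, or non-positive values fall back to the default;
    anything above the ceiling is silently reduced to the ceiling.
    """
    try:
        value = int(raw)
    except (TypeError, ValueError, OverflowError):
        return default
    if value < 1:
        return default
    return min(value, ceiling)


if __name__ == "__main__":
    print(clamp_limit("1000000"))  # clamped to the ceiling
    print(clamp_limit(None))       # falls back to the default
```

Clamping quietly is one design choice. Rejecting an oversized limit with a 400 response is equally defensible and makes the ceiling visible to legitimate clients.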
Somewhere else in the app there is a /api/send-otp route that calls Twilio for every request, an /api/chat route that forwards to OpenAI with no token cap, and an /api/export route that materializes the entire database into a CSV on the fly.
Each of these is a feature the model wrote correctly. What it did not add, because you did not ask, is any notion of how much each one costs to invoke, or of who is allowed to invoke it and how often.
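One way to supply both missing notions at once is a per-caller token bucket in which each endpoint spends a different number of tokens. The sketch below is an assumption-heavy illustration: the `CostLimiter` class, the `ENDPOINT_COST` weights, and the capacity and refill numbers are all invented for this example, and the state lives in one process's memory, whereas a real deployment would usually need a shared store.

```python
import time
from dataclasses import dataclass

# Hypothetical relative costs per call; the numbers are illustrative only.
ENDPOINT_COST = {
    "/api/search": 1,
    "/api/chat": 10,
    "/api/send-otp": 20,
    "/api/export": 50,
}


@dataclass
class Bucket:
    tokens: float
    updated: float


class CostLimiter:
    """Per-caller token bucket where expensive endpoints spend more tokens."""

    def __init__(self, capacity=100.0, refill_per_sec=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.buckets = {}  # caller key (API key, user id, or IP) -> Bucket

    def allow(self, caller, endpoint):
        now = self.clock()
        cost = ENDPOINT_COST.get(endpoint, 1)
        bucket = self.buckets.get(caller)
        if bucket is None:
            # New callers start with a full bucket.
            bucket = Bucket(self.capacity, now)
            self.buckets[caller] = bucket
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - bucket.updated
        bucket.tokens = min(self.capacity, bucket.tokens + elapsed * self.refill_per_sec)
        bucket.updated = now
        if bucket.tokens < cost:
            return False  # the route handler would answer 429 Too Many Requests
        bucket.tokens -= cost
        return True
```

With these assumed numbers, a caller can trigger two exports back to back and is then refused until the bucket refills, while ordinary searches barely dent the budget. Anonymous callers would need some key as well, such as the client IP, which is weaker than an authenticated identity.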
How it gets exploited
An attacker finds the API surface of a SaaS product from its public docs. No login required for a few endpoints, login required for the rest.