69 Security Vulnerabilities Found in Top AI Coding Tools - What This Means for Your App
Security startup Tenzai just published the most comprehensive security audit of AI coding tools to date. They tested Cursor, Claude Code, Replit, OpenAI Codex, and Devin - and every single one shipped code with exploitable vulnerabilities.
On January 13, 2026, Tenzai released their research paper "Bad Vibes: Comparing the Secure Coding Capabilities of Popular Coding Agents." The findings should concern anyone shipping vibe-coded applications to real users.
This isn't theoretical. These are real, exploitable security flaws in code that AI tools generate for common use cases. Let's break down what they found.
The Test Setup
Tenzai gave each AI coding tool identical prompts to build three applications: a link preview feature, a user profile system, and a file upload handler. These aren't edge cases - they're everyday features in most web applications.
The tools tested were:
- Cursor - The popular VS Code-based AI coding assistant
- Claude Code - Anthropic's autonomous coding agent
- Replit Agent - Replit's AI pair programmer
- OpenAI Codex - OpenAI's autonomous coding agent
- Devin - Cognition's AI software engineer
The Results: Nobody Passed
Here's the breakdown by tool:
Claude Code - 16 vulnerabilities (4 critical)
Anthropic's autonomous agent had the most security flaws. The critical issues included authentication bypasses and data exposure vulnerabilities.
Cursor - 13 vulnerabilities
Despite its enterprise adoption and $1B+ revenue, Cursor's generated code contained exploitable flaws in all three test applications.
Replit Agent - 13 vulnerabilities
The tool powering Replit's $9B valuation round produced the same vulnerability count as Cursor.
OpenAI Codex - 14 vulnerabilities
OpenAI's coding agent showed similar security gaps.
Devin - 13 vulnerabilities
Cognition's "AI software engineer" matched the others in security failures.
What Types of Flaws?
The study found specific patterns in what AI tools get wrong:
Server-Side Request Forgery (SSRF) - 100% failure rate
Every single tool produced code vulnerable to SSRF when building the link preview feature. This attack lets malicious users make your server request internal resources or external services on their behalf - potentially accessing databases, cloud metadata services, or other systems that should be inaccessible.
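The vulnerable pattern is easy to write by accident. Here's a hypothetical link-preview handler in the spirit of what the study describes (not code from the paper) - note that nothing constrains where the server will send the request:

```python
import urllib.request

def link_preview(url: str) -> str:
    """Naive server-side fetch for a link preview - no validation at all."""
    # Whatever the user supplies is fetched from the server's network position,
    # so a request for http://169.254.169.254/ (cloud metadata) or
    # http://localhost:6379/ (an internal service) goes straight through.
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read(4096).decode("utf-8", errors="replace")
```

Because urllib's default opener also handles file: and data: URLs, this sketch will happily read local files or attacker-embedded payloads, not just web pages - exactly the class of flaw the study flagged.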
Authorization Logic - Consistently broken
AI tools consistently failed to implement proper authorization checks. The pattern: they check if a user is logged in, but not if that user should have access to the specific resource they're requesting. This is the "BOLA" vulnerability (Broken Object Level Authorization) that tops the OWASP API Security Top 10.
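The failure mode is concrete: the handler confirms a session exists, then serves whichever object ID the client asked for. A minimal sketch contrasting the two (hypothetical names, an in-memory dict standing in for a database):

```python
from dataclasses import dataclass

@dataclass
class Document:
    id: int
    owner_id: int
    body: str

DOCS = {
    1: Document(1, owner_id=7, body="alice's notes"),
    2: Document(2, owner_id=9, body="bob's notes"),
}

def get_document_insecure(user_id: int, doc_id: int) -> str:
    # The BOLA pattern: the caller is authenticated, so any doc_id is served.
    return DOCS[doc_id].body

def get_document(user_id: int, doc_id: int) -> str:
    doc = DOCS.get(doc_id)
    # Object-level check: is THIS user allowed to see THIS resource?
    if doc is None or doc.owner_id != user_id:
        raise PermissionError("not your document")
    return doc.body
```

The fix is one comparison, but it has to be written for every endpoint that takes a resource identifier - which is exactly where the study found the tools skipping it.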
Input Validation - Insufficient
While AI tools did catch some injection vulnerabilities (SQL injection and XSS were largely avoided), they failed at business logic validation. The code often trusts that inputs will be in expected formats without validating edge cases.
The Good News (Sort Of)
There were some areas where AI tools performed better than expected:
- SQL Injection - Most tools used parameterized queries correctly
- Cross-Site Scripting (XSS) - Basic output encoding was usually present
- Password Storage - Hashing was generally implemented (though not always with optimal algorithms)
This suggests AI tools have learned from the most common, well-documented vulnerability patterns. The problem is everything else.
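For reference, the parameterized-query pattern the tools mostly got right looks like this (a minimal sqlite3 sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user(name: str) -> list:
    # The ? placeholder makes the driver treat `name` as data, never SQL,
    # so classic payloads like "' OR '1'='1" match nothing.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

This pattern is so heavily documented that models reproduce it reliably - which is likely why injection fared better than authorization in the study.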
Why This Matters
The Tenzai study confirms what we've seen in our own audits: AI coding tools are great at producing code that works, but consistently fail at producing code that's secure.
The Pattern: AI tools excel at functional requirements but struggle with non-functional requirements like security, performance, and maintainability. They write code that does what you asked - but not code that safely handles all the ways bad actors might abuse it.
This isn't surprising when you consider how these tools are trained. They learn from existing code, much of which has the same security gaps. And prompts typically describe what the code should do, not all the ways it might be attacked.
The Compounding Problem
Here's what makes this worse: many vibe coders don't have the security expertise to catch these issues in code review. The whole point of vibe coding is that you can build without deep technical knowledge.
So you have a perfect storm:
- AI tools that consistently produce insecure code
- Users who can't recognize the vulnerabilities
- Pressure to ship fast without security review
- Real user data at risk once deployed
We've documented the most common issues in our 5 Security Holes I Find in Every Vibe-Coded App post. The Tenzai study validates that these aren't edge cases - they're the norm.
What To Do About It
If you've built (or are building) an application with AI coding tools, here's the reality check:
1. Assume Your Code Has Security Issues
In this study, every tool produced exploitable flaws, so assume your AI-generated code contains at least some security vulnerabilities too. Don't assume it's safe just because it works.
2. Focus on Authorization and Access Control
AI tools consistently fail at authorization logic. For every API endpoint and database query, ask: "Am I checking that this specific user should access this specific resource?" The AI probably isn't.
3. Don't Trust User-Controlled URLs
Any feature that fetches external resources (link previews, image proxies, webhooks) is an SSRF risk. Implement an allowlist of permitted domains and block requests to internal IP ranges.
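One way to sketch that defense with Python's standard library (ALLOWED_HOSTS and the function name are illustrative, not from the study):

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical allowlist: only domains you deliberately chose to fetch from.
ALLOWED_HOSTS = {"example.com", "cdn.example.com"}

def is_safe_preview_url(url: str) -> bool:
    """Return True only for http(s) URLs on the allowlist that resolve
    to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # rejects file:, data:, gopher:, etc.
    host = parts.hostname
    if host is None or host not in ALLOWED_HOSTS:
        return False
    # Resolve and reject private/loopback/link-local addresses so a DNS
    # record pointing at 127.0.0.1 or 169.254.169.254 is still blocked.
    try:
        infos = socket.getaddrinfo(host, parts.port or 443)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```

Note this sketch checks at validation time; a hardened version would also pin the resolved IP for the actual request to avoid DNS-rebinding between check and fetch.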
4. Get Human Review Before Launch
The Tenzai study makes it clear: AI tools aren't a replacement for security expertise. If you're handling user data, you need someone who understands security to review what the AI produced.
Check Your App Before You Ship
We've compiled the 10 most critical security checks for vibe-coded apps into a free checklist. These are the exact issues the Tenzai study found - and the fixes are often straightforward if you know to look.
Get the Free Security Checklist
The Bigger Picture
The Tenzai study arrives at an inflection point for AI coding. Cursor just passed $1B in annual revenue. Replit is raising at a $9B valuation. Claude Code is being called an "infinite vibe coding machine."
These tools are going mainstream, fast. And every month, more applications built with them are going into production with real user data.
The tools aren't going away - nor should they. AI-assisted coding genuinely accelerates development. But the industry needs to grapple with the security gap these tools create.
For now, the responsibility falls on individual developers and founders to catch what AI misses. That's not ideal, but it's reality.
What We're Watching
Several developments could improve this situation:
- Security-focused AI tools - Startups like Cencori (from Nigeria) are building automated security scanners specifically for vibe-coded applications
- Better prompting - Research into security-aware prompts that guide AI to consider attack vectors
- Integrated security checks - Tools that review AI-generated code before it's committed
- Education - As vibe coding matures, security literacy needs to follow
Until these mature, the burden remains on you to verify that your AI-generated code is safe to ship.
Bottom Line
The Tenzai study provides concrete data for what many of us in security have been saying: AI coding tools are powerful, but they systematically produce insecure code.
This doesn't mean you shouldn't use them. It means you should use them with eyes open, and always get security review before exposing your application to real users.
The 69 vulnerabilities Tenzai found weren't in edge cases or adversarial prompts. They were in basic web application features that millions of developers are building right now.
Check your code before your users find out the hard way.
Need Help Securing Your Vibe-Coded App?
Book a free 30-minute consultation. We'll review your application, identify the highest-risk vulnerabilities, and give you a clear picture of what needs to be fixed before launch.
Book Free Consultation