69 Security Vulnerabilities Found in Top AI Coding Tools - What This Means for Your App
Security startup Tenzai just published the most comprehensive security audit of AI coding tools to date. They tested Cursor, Claude Code, Replit, OpenAI Codex, and Devin - and every single one shipped code with exploitable vulnerabilities.
On January 13, 2026, Tenzai released their research paper "Bad Vibes: Comparing the Secure Coding Capabilities of Popular Coding Agents." The findings should concern anyone shipping vibe-coded applications to real users.
This isn't theoretical. These are real, exploitable security flaws in code that AI tools generate for common use cases. Let's break down what they found.
The Test Setup
Tenzai gave each AI coding tool identical prompts to build three applications: a link preview feature, a user profile system, and a file upload handler. These aren't edge cases - they're everyday features in most web applications.
The tools tested were:
- Cursor - The popular VS Code-based AI coding assistant
- Claude Code - Anthropic's autonomous coding agent
- Replit Agent - Replit's AI pair programmer
- OpenAI Codex - OpenAI's autonomous coding agent
- Devin - Cognition's AI software engineer
The Results: Nobody Passed
Here's the breakdown by tool:
Claude Code - 16 vulnerabilities (4 critical)
Anthropic's autonomous agent had the most security flaws. The critical issues included authentication bypasses and data exposure vulnerabilities.
Cursor - 13 vulnerabilities
Despite its enterprise adoption and $1B+ revenue, Cursor's generated code contained exploitable flaws in all three test applications.
Replit Agent - 13 vulnerabilities
The tool powering Replit's $9B valuation round produced the same vulnerability count as Cursor.
OpenAI Codex - 14 vulnerabilities
OpenAI's coding agent showed similar security gaps.
Devin - 13 vulnerabilities
Cognition's "AI software engineer" matched the others in security failures.
What Types of Flaws?
The study found specific patterns in what AI tools get wrong:
Server-Side Request Forgery (SSRF) - 100% failure rate
Every single tool produced code vulnerable to SSRF when building the link preview feature. This attack lets malicious users make your server request internal resources or external services on their behalf - potentially accessing databases, cloud metadata services, or other systems that should be inaccessible.
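The vulnerable pattern is easy to write by accident. Here's a hypothetical link-preview handler in the spirit of what the study describes (not code from the paper) - note that nothing constrains where the server will send the request:

```python
import urllib.request

def link_preview(url: str) -> str:
    """Naive server-side fetch for a link preview - no validation at all."""
    # Whatever the user supplies is fetched from the server's network position,
    # so a request for http://169.254.169.254/ (cloud metadata) or
    # http://localhost:6379/ (an internal service) goes straight through.
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read(4096).decode("utf-8", errors="replace")
```

Because urllib's default opener also handles file: and data: URLs, this sketch will happily read local files or attacker-embedded payloads, not just web pages - exactly the class of flaw the study flagged.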
Authorization Logic - Consistently broken
AI tools consistently failed to implement proper authorization checks. The pattern: they check if a user is logged in, but not if that user should have access to the specific resource they're requesting. This is the "BOLA" vulnerability (Broken Object Level Authorization) that tops the OWASP API Security Top 10.
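The failure mode is concrete: the handler confirms a session exists, then serves whichever object ID the client asked for. A minimal sketch contrasting the two (hypothetical names, an in-memory dict standing in for a database):

```python
from dataclasses import dataclass

@dataclass
class Document:
    id: int
    owner_id: int
    body: str

DOCS = {
    1: Document(1, owner_id=7, body="alice's notes"),
    2: Document(2, owner_id=9, body="bob's notes"),
}

def get_document_insecure(user_id: int, doc_id: int) -> str:
    # The BOLA pattern: the caller is authenticated, so any doc_id is served.
    return DOCS[doc_id].body

def get_document(user_id: int, doc_id: int) -> str:
    doc = DOCS.get(doc_id)
    # Object-level check: is THIS user allowed to see THIS resource?
    if doc is None or doc.owner_id != user_id:
        raise PermissionError("not your document")
    return doc.body
```

The fix is one comparison, but it has to be written for every endpoint that takes a resource identifier - which is exactly where the study found the tools skipping it.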
Input Validation - Insufficient
While AI tools did catch some injection vulnerabilities (SQL injection and XSS were largely avoided), they failed at business logic validation. The code often trusts that inputs will be in expected formats without validating edge cases.
The Good News (Sort Of)
There were some areas where AI tools performed better than expected:
- SQL Injection - Most tools used parameterized queries correctly
- Cross-Site Scripting (XSS) - Basic output encoding was usually present
- Password Storage - Hashing was generally implemented (though not always with optimal algorithms)
This suggests AI tools have learned from the most common, well-documented vulnerability patterns. The problem is everything else.
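For reference, the parameterized-query pattern the tools mostly got right looks like this (a minimal sqlite3 sketch):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user(name: str) -> list:
    # The ? placeholder makes the driver treat `name` as data, never SQL,
    # so classic payloads like "' OR '1'='1" match nothing.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

This pattern is so heavily documented that models reproduce it reliably - which is likely why injection fared better than authorization in the study.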
Why This Matters
The Tenzai study confirms what we've seen in our own audits: AI coding tools are great at producing code that works, but consistently fail at producing code that's secure.
The Pattern: AI tools excel at functional requirements but struggle with non-functional requirements like security, performance, and maintainability. They write code that does what you asked - but not code that safely handles all the ways bad actors might abuse it.
This isn't surprising when you consider how these tools are trained. They learn from existing code, much of which has the same security gaps. And prompts typically describe what the code should do, not all the ways it might be attacked.
The Compounding Problem
Here's what makes this worse: many vibe coders don't have the security expertise to catch these issues in code review. The whole point of vibe coding is that you can build without deep technical knowledge.
So you have a perfect storm:
- AI tools that consistently produce insecure code
- Users who can't recognize the vulnerabilities
- Pressure to ship fast without security review
- Real user data at risk once deployed
We've documented the most common issues in our 5 Security Holes I Find in Every Vibe-Coded App post. The Tenzai study validates that these aren't edge cases - they're the norm.
What To Do About It
If you've built (or are building) an application with AI coding tools, here's the reality check:
1. Assume Your Code Has Security Issues
In this study, every tool produced exploitable flaws, so assume your AI-generated code contains at least some security vulnerabilities too. Don't assume it's safe just because it works.
2. Focus on Authorization and Access Control
AI tools consistently fail at authorization logic. For every API endpoint and database query, ask: "Am I checking that this specific user should access this specific resource?" The AI probably isn't.
3. Don't Trust User-Controlled URLs
Any feature that fetches external resources (link previews, image proxies, webhooks) is an SSRF risk. Implement an allowlist of permitted domains and block requests to internal IP ranges.
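One way to sketch that defense with Python's standard library (ALLOWED_HOSTS and the function name are illustrative, not from the study):

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Hypothetical allowlist: only domains you deliberately chose to fetch from.
ALLOWED_HOSTS = {"example.com", "cdn.example.com"}

def is_safe_preview_url(url: str) -> bool:
    """Return True only for http(s) URLs on the allowlist that resolve
    to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # rejects file:, data:, gopher:, etc.
    host = parts.hostname
    if host is None or host not in ALLOWED_HOSTS:
        return False
    # Resolve and reject private/loopback/link-local addresses so a DNS
    # record pointing at 127.0.0.1 or 169.254.169.254 is still blocked.
    try:
        infos = socket.getaddrinfo(host, parts.port or 443)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```

Note this sketch checks at validation time; a hardened version would also pin the resolved IP for the actual request to avoid DNS-rebinding between check and fetch.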
4. Get Human Review Before Launch
The Tenzai study makes it clear: AI tools aren't a replacement for security expertise. If you're handling user data, you need someone who understands security to review what the AI produced.
Check Your App Before You Ship
We've compiled the 10 most critical security checks for vibe-coded apps into a free checklist. These are the exact issues the Tenzai study found - and the fixes are often straightforward if you know to look.
Get the Free Security Checklist
The Bigger Picture
The Tenzai study arrives at an inflection point for AI coding. Cursor just passed $1B in annual revenue. Replit is raising at a $9B valuation. Claude Code is being called an "infinite vibe coding machine."
These tools are going mainstream, fast. And every month, more applications built with them are going into production with real user data.
The tools aren't going away - nor should they. AI-assisted coding genuinely accelerates development. But the industry needs to grapple with the security gap these tools create.
For now, the responsibility falls on individual developers and founders to catch what AI misses. That's not ideal, but it's reality.
What We're Watching
Several developments could improve this situation:
- Security-focused AI tools - Startups like Cencori (from Nigeria) are building automated security scanners specifically for vibe-coded applications
- Better prompting - Research into security-aware prompts that guide AI to consider attack vectors
- Integrated security checks - Tools that review AI-generated code before it's committed
- Education - As vibe coding matures, security literacy needs to follow
Until these mature, the burden remains on you to verify that your AI-generated code is safe to ship.
Bottom Line
The Tenzai study provides concrete data for what many of us in security have been saying: AI coding tools are powerful, but they systematically produce insecure code.
This doesn't mean you shouldn't use them. It means you should use them with eyes open, and always get security review before exposing your application to real users.
The 69 vulnerabilities Tenzai found weren't in edge cases or adversarial prompts. They were in basic web application features that millions of developers are building right now.
Check your code before your users find out the hard way.
Need Help Securing Your Vibe-Coded App?
Book a free 30-minute consultation. We'll review your application, identify the highest-risk vulnerabilities, and give you a clear picture of what needs to be fixed before launch.
Book Free Consultation