The state of vibe-coded security: 549 repos measured

The headline numbers

112 of the 549 repos (20.4%) had at least one secrets finding: a hardcoded credential, key, or password committed into public code. That's the most consequential number in the study, because a committed secret in a public repo isn't a code smell, it's an open door. 26.8% of repos had at least one critical or high finding of any kind, and 14.9% had at least one critical.

The most common individual findings were dangerouslySetInnerHTML usage (42.6% of repos), a .gitignore that doesn't cover .env files (35.7%), no .gitignore at all (15.3%), wildcard CORS (11.3%), curl-piped-to-shell installs (9.8%), and hardcoded passwords (9.3%). None of these are exotic: they are the mistakes that the most direct working version of the code tends to make, and that version is what AI tools generate by default.
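To make the ".gitignore that doesn't cover .env files" finding concrete, here is a minimal sketch of what such a check could look like. It is not the secure·vibes rule: the function name `gitignore_covers_env` is hypothetical, and the matching is a deliberate simplification of real gitignore semantics (it looks only at the last path component of each pattern and ignores negations and directory-only patterns).

```python
from fnmatch import fnmatch


def gitignore_covers_env(gitignore_text: str) -> bool:
    """Return True if some ignore pattern would match a file named '.env'.

    Simplified heuristic, not full gitignore semantics.
    """
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and negations (which un-ignore files).
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        # A trailing slash means "directories only", so it cannot cover a file.
        if line.endswith("/"):
            continue
        # Reduce the pattern to its final path component, e.g. '**/.env' -> '.env'.
        pattern = line.split("/")[-1]
        if pattern and fnmatch(".env", pattern):
            return True
    return False
```

A file containing `.env` or `.env*` passes this check; one that lists only `node_modules/` and `dist/` does not, and neither does one that lists only `.env.*`, since that pattern requires a suffix after the second dot.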

Why the median A is misleading — read the mean and the tail

The median repo scored 97, an A, and if we wanted a flattering headline we'd stop there. We don't, because the median is inflated: 85 of the 549 repos have 15 or fewer scannable files — workshop demos, single-page toys, docs-heavy "how to vibecode" repos — and a tiny repo trivially scores an A because there's almost nothing to flag. They genuinely are self-described vibe-coded output, so they were counted, but they drag the median up.

The honest read is the mean and the grade distribution: mean score 89.0, with 353 repos at A, 69 at B, 92 at C, 21 at D, and 14 at F. That's 196 repos — more than a third — with real findings, and the lowest score in the corpus was 15. For repos with substance, the picture is the per-category hit rates below, not the median.
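The arithmetic behind that paragraph can be reproduced from the grade counts alone. The sketch below uses only the five counts quoted above; the variable names are ours, for illustration.

```python
# Grade counts as reported in the study.
grades = {"A": 353, "B": 69, "C": 92, "D": 21, "F": 14}

total = sum(grades.values())      # all repos in the corpus
below_a = total - grades["A"]     # repos graded B through F
share_below_a = below_a / total   # fraction of the corpus below an A

# Print each grade's share of the corpus.
for grade, count in grades.items():
    print(f"{grade}: {count:3d} repos ({count / total:.1%})")

print(f"Below A: {below_a} of {total} ({share_below_a:.1%})")
```

The counts sum to 549, and the 196 repos below an A are 35.7% of the corpus, which is where "more than a third" comes from.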

Cutting the corpus to repos with 15 or more scannable files (n = 467) makes the point directly: in that substantive subset, 23.3% had an exposed-secret finding — roughly 1 in 4 — 30.2% had a critical or high finding, 48.8% had an injection finding, and 43.0% used dangerouslySetInnerHTML. Every hit rate rises once the toy repos are out.

Where the findings concentrate, and what we didn't measure

By category, 54.5% of repos had at least one data-exposure finding, 47.5% injection, 20.6% dependencies and supply chain, 20.4% secrets, 14.0% auth, and 8.7% transport. Findings per repo averaged 5.4 (median 2) — and because the engine caps reporting at 10 findings per rule and 300 per repo, the counts for the messiest repos are floors, not totals.
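Why capped counts are floors can be shown with a short sketch. The two cap values come from the text; everything else is an assumption for illustration, including the function name `reported_findings`, the shape of the input, and the order in which the caps are applied (per rule first, then per repo).

```python
PER_RULE_CAP = 10   # from the text: at most 10 findings reported per rule
PER_REPO_CAP = 300  # from the text: at most 300 findings reported per repo


def reported_findings(raw_counts_by_rule: dict[str, int]) -> int:
    """Return the number of findings reported after both caps are applied.

    Assumes the per-rule cap is applied before the per-repo cap.
    """
    capped_per_rule = sum(min(n, PER_RULE_CAP) for n in raw_counts_by_rule.values())
    return min(capped_per_rule, PER_REPO_CAP)


# Hypothetical repo: one rule fires 25 times, another 3 times.
small = {"rule_a": 25, "rule_b": 3}

# Hypothetical messy repo: 40 rules, each firing 50 times (2,000 raw hits).
messy = {f"rule_{i}": 50 for i in range(40)}
```

Under these assumptions `reported_findings(small)` is 13 rather than 28, and `reported_findings(messy)` is 300 rather than 2,000. The reported number can never exceed the true one, so it is a lower bound.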

Two limits worth stating plainly. The study ran the heuristic rules engine only (the Claude review that's part of every real secure·vibes scan was off), so anything that needs code-reading judgment went uncounted, and scores would likely shift down with it on. And "vibe-coded" means self-described: we took repos at their word, not verified provenance.

About this study: 549 public GitHub repos self-described as AI- or vibe-coded, data collected July 2026, scanned by the secure·vibes rules engine. The methodology is described in full on the how-we-benchmarked page.

Share of the 549 repos with at least one finding, by secure·vibes category (rules engine only, July 2026)

Category	Repos with ≥1 finding	Most common finding inside it
Data exposure	54.5%	.gitignore missing .env coverage (35.7% of repos)
Injection & unsafe code	47.5%	dangerouslySetInnerHTML (42.6% of repos)
Dependencies & supply chain	20.6%	curl | sh installs (9.8% of repos)
Secrets & credentials	20.4%	hardcoded passwords (9.3% of repos)
Auth & access control	14.0%	wildcard CORS (11.3% of repos)
Transport & TLS	8.7%	disabled verification / plain-http calls

frequently asked

Does this prove vibe-coded apps are insecure?

It proves something narrower and more useful: about one in five self-described vibe-coded repos ships a secrets finding, and about one in four has a critical or high finding — measured by deterministic rules, not opinion. Most repos scored well; the tail is real and predictable.

Why lead with 20.4% when the median grade is an A?

Because the median is inflated by tiny demo repos — 85 of the 549 have 15 or fewer files and trivially score A. A committed secret is binary and damaging regardless of repo size, so the hit rate is the honest headline. We show the full grade distribution either way.

Was AI used in the scoring?

No. The study ran the secure·vibes rules engine only — the Claude review in normal scans was deliberately off, so every number here is reproducible pattern detection. Real scans add an AI pass on top, which typically finds more, not less.

Can I see how my repo compares?

Yes — paste your public GitHub repo link into secure·vibes and you get the same six-category scan, scored 0–100 with a letter grade, in under a minute. The free tier covers one scan, no card.

Published June 11, 2026 · Last updated July 25, 2026
