A public dashboard that ranks open-source repos by how friendly they are for AI coding agents — per model.
Next.js 16 + SQLite (better-sqlite3), styled with Tailwind CSS 4. Spans GitHub, GitLab, and Bitbucket out of the box. Current release: 0.5.0.
AI coding agents — Claude Code, Cursor, Devin, GPT-5 Codex, Gemini CLI, Aider, OpenHands, Pi — succeed dramatically more often on some repos than others. The difference is rarely the agent; it's the repo. A codebase with fast tests, a clear AGENTS.md, a Makefile, and CI is a massively different environment than one without.
Goal: a public leaderboard where anyone can look up a repo and see:
- How agent-friendly is it overall?
- How friendly is it for my agent? (Claude Code weights `AGENTS.md` heavily; Devin cares about CI + reproducible envs; Cursor prefers strong types + a good README.)
- Why does it rank there, and what would it take to improve for my agent? — top-3 gaps ranked by score-gain.
Two audiences:
- Maintainers — a ranked checklist of what to fix to make their repo friendlier to agents.
- Agent users — when picking between forks, packages, or alternatives, agent-friendliness is a real dependency-choice signal.
| Project | What it does | What we do differently |
|---|---|---|
| Factory.ai Agent Readiness | Single-tenant scanner: you point it at your repo and get a score with auto-fix PRs. 8 pillars, 5 maturity levels. | Public + cross-forge + per-model. Factory rates your own repo in isolation — we rank across repos, on GitHub/GitLab/Bitbucket, and the ranking changes based on which agent you care about. |
| `kodustech/agent-readiness` | OSS alternative to Factory — static checks, local scan. | We're a public ranking service, not a local scanner. Scoring logic is similar in spirit; the product is the leaderboard and the per-model lens. |
| `jpequegn/agent-readiness-score` | Explicitly "inspired by Factory.ai" — OSS framework to measure codebase readiness. | Same delta as above — single-tenant vs. public + per-model. |
| `viktor-silakov/ai-ready` | 39 checks, 7 pillars, 10+ languages. Scanner. | Same delta. |
| `ambient-code/agentready` | Assesses git repos against evidence-based attributes. | Same delta. |
| Cloudflare Agent Readiness | Rates websites for agent consumption. | Wrong object — we rate code repos. |
| Fern Agent Score | Public leaderboard rating documentation sites for AI-readiness. | Closest in shape — public + leaderboard — but scores docs, not code. |
| Clarvia | Scoring platform for MCP servers (Agent Experience Optimization). | Adjacent — rates tools, not repos. |
| SWE-Bench (Verified / Pro), GitTaskBench, FeatureBench, HAL, PR Arena | Rank agents on a fixed set of repos. | We want the transpose: rank repos per agent. Our measurement story (once the benchmark harness lands) looks a lot like these, with the axes flipped. |
| GitHub Trending, ossinsight | Popularity / activity rankings. | Stars ≠ agent-friendliness. |
Our differentiators, in one line: cross-forge, public, per-model, and explainable — every score decomposes to signals, and every repo page shows what to improve next for the selected model.
Not pretending the idea is free of risk:
- Per-model scoring is the hardest part and the easiest to fake. Per-model rationales are now sourced from each agent's published docs (see `MODELS[].sources` in `lib/scoring/weights.ts`), but the weight values themselves are still pre-benchmark (a sketch of what such weights might look like follows this list). Real "Claude ranks this higher than GPT-5" requires actually running each agent on each repo. That's `tasks/1.0.0/03-benchmark-harness.md`.
- Factory.ai is already in this space. Differentiation has to stay sharp.
- Public-shaming risk. Ranking a repo #47,823 without consent invites angry maintainers. Mitigation is planned via `tasks/0.7.0/01-opt-out-claim-flow.md`.
- Score gaming. Once public, people will add boilerplate `AGENTS.md` files to pass the rubric without being genuinely useful. Dynamic (actually-run-an-agent) checks are the counter — see the benchmark harness.
- Freshness. Scores decay with every push. Webhook-driven rescoring is on the roadmap.
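
To make the per-model mechanism concrete, here is a minimal sketch of how per-model weights could be expressed. The type names, signal ids, numbers, and source URL are illustrative assumptions, not the actual contents of `lib/scoring/weights.ts`.

```ts
// Hypothetical shape only — the real definitions live in lib/scoring/weights.ts
// and may differ; signal ids and values here are illustrative.
type SignalId = "agents_md" | "ci_config" | "fast_tests" | "readme_quality";

interface ModelProfile {
  id: string;                         // e.g. "claude-code"
  weights: Record<SignalId, number>;  // relative emphasis per signal
  sources: string[];                  // published docs backing the rationale
}

const CLAUDE_CODE: ModelProfile = {
  id: "claude-code",
  weights: { agents_md: 3, ci_config: 1, fast_tests: 2, readme_quality: 1 },
  sources: ["https://example.com/claude-code-docs"], // placeholder URL
};

// A per-model score is then a weighted average of the shared signal scores.
function modelScore(signalScores: Record<SignalId, number>, profile: ModelProfile): number {
  const entries = Object.entries(profile.weights) as [SignalId, number][];
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  return entries.reduce((sum, [id, w]) => sum + (signalScores[id] ?? 0) * w, 0) / total;
}
```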
See /methodology in the running app for a candid walkthrough of what's measured today and what isn't.
Short answer: low risk. The app:
- Only reads files after a shallow clone; never executes anything from the cloned tree (no `npm install`, no post-clone hooks).
- Uses `--depth 1 --single-branch` and never clones submodules.
- Runs all SQL via prepared statements.
- Renders through React (auto-escaping); the only `dangerouslySetInnerHTML` use is server-built JSON-LD with `<` characters escaped (sketched after this list).
- Has no auth and no writable API endpoints — read-only dashboard.
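
For illustration, here is a minimal sketch of the standard escaping technique referenced above; the helper name and usage are assumptions, not the app's actual code.

```ts
// Minimal sketch: escape "<" inside server-built JSON-LD so the serialized
// string can never close the surrounding <script> tag early.
// Helper name is hypothetical.
function jsonLdForScript(data: object): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}

// Usage inside a server component (React auto-escapes everything else):
// <script
//   type="application/ld+json"
//   dangerouslySetInnerHTML={{ __html: jsonLdForScript({ "@type": "SoftwareSourceCode" }) }}
// />
```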
Operational concerns for a public launch (not code-level security):
- Disk quotas for the clone workspace.
- Rate limiting the public API.
- Sandboxing the cloner in a container (future-proofing against hypothetical git CVEs).
Auth and per-maintainer controls land with the opt-out / claim flow in v0.7.0.
bun install
bun run prepare-hooks # once — installs lefthook pre-commit (Biome + tsc + test + file-length)
bun run seed # score the curated set across GH / GL / BB + cache popular package aliases
bun run dev # http://localhost:3000

Score a single repo:
bun run score https://github.com/vercel/next.js
bun run score https://gitlab.com/gitlab-org/cli
bun run score https://bitbucket.org/snakeyaml/snakeyaml
bun run score /path/to/local/checkout

Optional: `GITHUB_TOKEN` / `GITLAB_TOKEN` in env to raise API rate limits.
Run the unit tests with `bun run test` (uses `node --test` + `tsx`; requires Node ≥20.9.0).
lib/version.ts and package.json carry the current release number (currently 0.5.0). Bumps happen only when we actually cut a release — never when merging intermediate work. The version pill in the header surfaces the number directly; /changelog lists what each release shipped.
| Choice | Why | When we'd revisit |
|---|---|---|
| Next.js 16 (App Router) | Future features (filters, charts, auth, diff views) are React's territory. File-based routing + API routes replace hand-rolled HTTP cleanly. | Unlikely. The core scorer is stack-agnostic, only app/ depends on this. |
| Node runtime (with `tsx` for CLI scripts) | Matches Vercel's serverless runtime — no Bun-only imports in prod. Bun still works locally as a fast package manager. | Unlikely — only if the deployment target changes. |
| Tailwind CSS 4 | Zero-config via `@theme` tokens, no `tailwind.config.*` needed. Tight bundle output. | Would only leave for something with a stronger design-system story. |
| `better-sqlite3` | Single file, inspectable, zero ops overhead. Node-native so Vercel's serverless runtime can load it directly. | Postgres when concurrent writers / access control arrive (`tasks/1.0.0/01-postgres-migration.md`). |
| Server-first; client islands where needed | Cheap, fast, SEO-friendly. Client components only where interactivity demands — mobile nav, search, selects, copy, back-to-top. | When client islands grow past presentational interactivity → reach for client state mgmt. |
| Shallow git clones (`--depth 1 --single-branch`) | Bandwidth + speed. Current signals don't need history. | If commit-history ever enters scoring → host APIs or `--filter=blob:none` partial clones. |
| Exact-pinned deps | Deterministic scoring across environments. | Never. |
| One file per signal | Each signal is a small, independent concern — keeps git log and code review focused. | When we bundle signals into dynamic checks (then the unit becomes the bundle). |
- Static signals need to read file contents (`AGENTS.md` length, `pyproject.toml` `[tool.X]` sections, `package.json` scripts count) — not just existence (see the sketch after this list).
- One clone is faster than N API calls for content-heavy scoring, and respects rate limits.
- Any real version of this dashboard needs dynamic signals (run tests, run an agent). Those absolutely need code on disk.
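
As an illustration of the one-file-per-signal layout, here is a minimal sketch of what a single content-reading signal might look like. The function name, thresholds, and return shape are assumptions, not the project's actual signal API.

```ts
import { readFile } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical signal: reward a substantive AGENTS.md, not just its existence.
// Thresholds and return shape are illustrative; the real signals live in lib/scoring/.
export async function agentsMdSignal(cloneDir: string): Promise<{ score: number; detail: string }> {
  try {
    const text = await readFile(join(cloneDir, "AGENTS.md"), "utf8");
    const words = text.trim().split(/\s+/).length;
    if (words >= 300) return { score: 1, detail: `AGENTS.md present (${words} words)` };
    if (words >= 50) return { score: 0.5, detail: `AGENTS.md is thin (${words} words)` };
    return { score: 0.25, detail: "AGENTS.md exists but is near-empty" };
  } catch {
    return { score: 0, detail: "No AGENTS.md found" };
  }
}
```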
app/ Next.js App Router — pages + API + SEO
layout.tsx root layout, root metadata (OG + Twitter cards)
page.tsx leaderboard
repo/[id]/ repo detail (generateMetadata + per-repo OG image)
methodology/ how scoring works today
roadmap/ upcoming versions (from lib/roadmap.ts)
changelog/ what's shipped (from lib/changelog.ts)
about/ independent project, no vendor affiliation (footer-linked, E-E-A-T)
action/ PR-diff GitHub Action explainer + install snippet
skill/ agent-skill explainer + install command
package/ registry → repo lookup (form + per-package state pages)
api/ /repos, /repo/[id], /score, /badge/<host>/<owner>/<name>, /package/<registry>/<name>
robots.ts /robots.txt — allows "/", blocks "/api/"
sitemap.ts /sitemap.xml — static routes + every repo
llms.txt/ markdown manifest for LLM crawlers
globals.css Tailwind import + @theme tokens
components/ React components (Tailwind-styled)
lib/
scoring/ signals, weights, scorer — pure, no I/O outside the cloned tree
clients/ git clone, host API, npm/PyPI/Cargo registries
constants/ thresholds, host labels, sort keys
utils/ format + score-tier helpers, SVG badge renderer, package-request URL builder
db.ts better-sqlite3 schema + queries (all SQL lives here)
package-lookup.ts shared registry → repo lookup (used by /api/package + /package page)
version.ts app + sibling URLs, install snippets (ACTION_USES, SKILL_INSTALL_CMD), SIBLING_VERSION pin
changelog.ts / roadmap.ts / skill-content.ts
scripts/ CLI entries run via `tsx` (Node) — score, seed, init-db
tests/ `node --test` unit tests — scorer, signals, URL parser, formatters
tasks/ Per-version task breakdown (agent-readable)
public/ Static assets — demo/ screenshots used by the README + OG image
.claude/ settings.json, hooks/ (Stop guard), skills/
data/ rank.db (committed — shipped as a build artifact; rescoring runs locally)
AGENTS.md Agent instructions (source of truth)
CONTRIBUTING.md Human-contributor guide — PR workflow, review bar
CLAUDE.md Pointer → AGENTS.md
LICENSE MIT
`hsnice16/agent-friendly-action` runs the same scorer inside your CI and posts a per-PR score-delta comment — "this PR drops your Claude Code score by 4.1 points because it removed CI config." Opt-in via an `AGENTS_BADGE_TOKEN` secret; it no-ops silently when the secret is unset. Each repo detail page on the dashboard ships a copy-paste workflow snippet under "Catch score regressions on every PR".
`hsnice16/agent-friendly-skill` is a portable agent skill installable in one command — `npx skills add hsnice16/agent-friendly-skill#v0` — that scores the user's current repo locally and recommends a model. It profiles the same 8 agents this dashboard does (Claude Code, Cursor, Devin, GPT-5 Codex, Gemini CLI, Aider, OpenHands, Pi), installs into any vercel-labs/skills-compatible host (Cline, Copilot, Continue, Roo Code, …), and produces identical output regardless of which host invokes it — the recommendation is score-driven, not host-driven. Same self-contained property as the action: vendored scorer, no service dependency, works offline. The dashboard's `/skill` page hosts the install command, the score → model mapping, and optional `SessionStart` hook snippets for Claude Code and Codex.
Read-only JSON endpoints for external integrators (skills, hooks, browser overlays, third-party tools):
- `GET /api/score?host=<host>&repo=<owner>/<name>` — look up an indexed repo by host + owner/name. Returns `{ repo, signals, modelScores }` on 200; `{ error: "not_indexed" }` with status 404 when the repo isn't in our DB. The natural lookup endpoint for any tool that has a repo URL but not our internal id.
- `GET /api/repos` — full leaderboard (id, owner, name, host, stars, overall_score, per-model scores).
- `GET /api/repo/<id>` — per-repo detail (signals, model scores, top improvements). Requires the internal id; use `/api/score` first if you only have host + owner/name.
- `GET /api/badge/<host>/<owner>/<name>.svg` — embeddable SVG badge. `?model=<id>` for per-model variants.
- `GET /api/package/<registry>/<name>` — resolve npm / PyPI / Cargo package → source-repo score (or `unresolved` when the registry doesn't expose a repo URL).
Neither the action nor the agent skill calls these at runtime — both vendor the scorer and run locally. The endpoints exist so any third party can build on top of the dashboard without a network round-trip becoming a critical-path dependency.
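
For example, a third-party tool that only knows a repo URL could resolve scores with a single request. This is a sketch under the response shape documented above; the base URL is a placeholder.

```ts
// Sketch of an external integration: look up an indexed repo by host + owner/name.
// BASE_URL is a placeholder; field names follow the documented response shape.
const BASE_URL = "https://<your-deployment>";

async function lookupScore(host: string, owner: string, name: string) {
  const url = `${BASE_URL}/api/score?host=${host}&repo=${owner}/${name}`;
  const res = await fetch(url);
  if (res.status === 404) return null; // { error: "not_indexed" }
  const { repo, signals, modelScores } = await res.json();
  return { repo, signals, modelScores };
}

// lookupScore("github", "vercel", "next.js").then(console.log);
```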
See /roadmap in the running app or the per-version tasks/ folders for the full picture.
Versions are sequenced cheap-first so the highest-impact small additions don't get gated on heavy infra:
- 0.6.0 — auto-refresh + smarter matching: webhook-driven rescoring (keep scores fresh on every push) + alternatives via README embeddings (cross-language matches the v0.3.0 SQL heuristic misses).
- 0.7.0 — maintainer ownership + at-scale discovery: OAuth opt-out / claim flow for maintainers + at-scale package overlay (per-registry leaderboards + userscript that renders the badge inline on npmjs.com / PyPI / crates.io).
- 1.0.0 — production cut: Postgres migration for concurrent writers + auto-discovered crawl (target 10k repos) + benchmark harness that derives per-model weights from measured agent success. From here on, breaking API changes require a MAJOR bump.
The score isn't defensible. The evaluation harness + the cross-forge dataset + the maintainer network are. Open-source the harness, publish the weights, keep the data + dashboard + badge network as the product.
MIT — see LICENSE.
See CONTRIBUTING.md for setup, branch/commit style, the PR description template, and the changelog discipline.
If this project is useful to you, consider sponsoring its development: github.com/sponsors/hsnice16.

