hementhu/codemap

codemap

Interactive visual map of any Python codebase — so you can understand the structure in 20 minutes instead of 2 days.

Story view — codemap pointed at its own source, showing the inferred purpose, file categories, and a ranked reading order

Why this exists

Developers using AI coding tools (Claude Code, Cursor, Copilot) generate large chunks of code they never fully read. When something breaks, the only fallback is line-by-line debugging. codemap restores comprehension without slowing generation down: point it at any Python codebase and get a clean view of how the files connect, what each function takes, returns, and calls, and, for every claim the tool makes, the exact source lines backing it.

No LLM in the engine. No hallucination, no API costs, no sending your code to a third party. Just ast and a graph.

What you get

  • Story view: a text-first landing page that says what the codebase is (purpose inferred from imports), how it's organized (entry / core / domain / utility / test), and where to start reading (ranked, with a reason per file derived from the actual code).
  • Architecture map: files as nodes, imports as arrows (arrowhead at the importer). Click any file to drill in; click any arrow to see the actual import statements.
  • File detail: classes and functions with intra-file call arrows labelled with their return type (→ FileInfo, → Iterator[Path]), plus "ghost" nodes showing which other files consume this one and which libraries it pulls in.
  • Per-symbol verification surface: for every function, class, and file, the tool shows the signature with types, decorators, instance attributes, the actual source lines (numbered, with a file:N-M line-range ref), a fact-derived "actual intent" sentence, and the docstring labelled as "documented intent (verify against code)."
  • Click an arrow: get the exact line where the call happens, the callee's signature, and jump-to chips for either endpoint.
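To make the "actual import statements" behind an architecture-map arrow concrete, here is a minimal sketch of collecting imports with their line numbers using only the standard-library ast module. The function name import_statements is hypothetical and is not taken from codemap's source; the real parser may record more detail.

```python
import ast


def import_statements(source: str) -> list[tuple[int, str]]:
    """Return (line_number, module) pairs for every import in the source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a, b" yields one entry per imported name.
            for alias in node.names:
                found.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom):
            # Relative imports carry a level; module is None for "from . import x".
            module = "." * node.level + (node.module or "")
            found.append((node.lineno, module))
    return sorted(found)


sample = "import os\nfrom pathlib import Path\nfrom . import graph\n"
print(import_statements(sample))
```

Each pair is enough to draw an edge and to show the reader the line that justifies it.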

Quickstart

git clone <repo-url> codemap
cd codemap
python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux
pip install -r requirements.txt
python server.py /path/to/your/python/codebase

Open http://127.0.0.1:8000.

Use --exclude <dirname> to skip directories (e.g. --exclude targets --exclude .selftest). The walker already skips venv, .venv, __pycache__, .git, node_modules, build, dist, etc.
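As an illustration of how directory skipping of this kind can work, the sketch below prunes directories during an os.walk traversal. It is an assumption-laden example rather than codemap's walker.py: the function name iter_python_files and the DEFAULT_SKIP set are hypothetical, and the set only lists the names mentioned above.

```python
import os
from pathlib import Path
from typing import Iterator

# Assumed default skip list, limited to the directory names the README mentions.
DEFAULT_SKIP = {"venv", ".venv", "__pycache__", ".git", "node_modules", "build", "dist"}


def iter_python_files(root: str, exclude: tuple[str, ...] = ()) -> Iterator[Path]:
    """Yield .py files under root, never descending into skipped directories."""
    skip = DEFAULT_SKIP | set(exclude)
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to enter pruned directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield Path(dirpath) / name
```

Calling iter_python_files("/path/to/repo", exclude=("targets",)) would then mirror passing --exclude targets on the command line.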

Three things to try

1. Story view on a repo you don't know. Read the headline, the cast of characters, and the reading order (the screenshot at the top of this README). Does it match what you'd tell a new hire to read first?

2. Click a file → see its internals. The file detail view shows classes and functions with intra-file call arrows labelled by return type, plus "ghost" nodes for external consumers and dependencies. Click any function and the right panel reveals its signature, a fact-derived actual intent, and the real source lines (numbered, matching the file:N-M ref above them). The docstring is shown separately as documented intent; when the two diverge, that's a signal worth investigating.

File detail — codemap looking at its own parser.py, with call arrows, return-type labels, and ghost nodes for external consumers/deps

3. Click any arrow. For a call edge: the right panel shows the line where the call happens, the callee's signature, and jump-to chips for either endpoint. For an import edge in the overview: the actual import statements with line numbers.

What it correctly drops

  • Builtins (len, str, isinstance, …) — not shown as call targets.
  • Dynamic dispatch (obj.foo() where obj's type isn't statically obvious) — not guessed at.

This is intentional. The 80% precision bar comes from refusing to fake edges.
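The two drop rules can be sketched as a filter over ast.Call nodes. This is a simplified illustration, not codemap's graph.py: the function name resolvable_calls is hypothetical, and the sketch keeps only plain-name calls and self.X calls, ignoring calls through imported modules.

```python
import ast
import builtins

BUILTIN_NAMES = set(dir(builtins))


def resolvable_calls(source: str) -> list[tuple[int, str]]:
    """Keep calls to non-builtin plain names and to self.<method>; drop the rest."""
    kept = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            # Builtins such as len or isinstance are not shown as call targets.
            if func.id not in BUILTIN_NAMES:
                kept.append((node.lineno, func.id))
        elif (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "self"
        ):
            kept.append((node.lineno, f"self.{func.attr}"))
        # Anything else (obj.foo(), factory()(), ...) is dropped, not guessed at.
    return sorted(kept)


sample = """
class Loader:
    def run(self, obj):
        n = len(self.items)
        self.prepare()
        obj.foo()
        return parse(n)
"""
print(resolvable_calls(sample))
```

In the sample, len(...) and obj.foo() produce no edge, while self.prepare() and parse(n) are kept with their line numbers.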

What's not in this version

  • Other languages — Python only.
  • Cloud / hosted version — run it locally.
  • "Open in editor" deep links from file:line refs.
  • AI provenance (which lines came from Claude / Cursor).
  • Behavior-to-code translation.

How it works (one paragraph)

walker.py enumerates .py files. parser.py runs ast.parse and extracts imports, top-level functions, classes (with methods), call sites (with line numbers), parameter signatures, decorators, instance attributes (self.X = ...), module constants, docstrings, and the numbered source slice for every symbol. graph.py builds a node/edge graph and resolves call targets deterministically (local → self.X → imports → external libs); unresolvable calls are dropped rather than guessed at. server.py (FastAPI) serves the graph as JSON to a static frontend that renders three views with Cytoscape: Story, Architecture Map, and File Detail.
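For a feel of what the parsing step yields, here is a minimal sketch that pulls a function's line range, parameters, return annotation, docstring, and source slice out of the AST. The function name extract_functions and the dictionary keys are hypothetical; codemap's parser.py may structure this differently. It assumes Python 3.9 or newer for ast.unparse.

```python
import ast


def extract_functions(source: str) -> list[dict]:
    """Collect name, line span, parameters, return annotation, and source per function."""
    lines = source.splitlines()
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append({
                "name": node.name,
                # 1-based inclusive span, suitable for a file:N-M reference.
                "span": (node.lineno, node.end_lineno),
                "params": [arg.arg for arg in node.args.args],
                "returns": ast.unparse(node.returns) if node.returns else None,
                "docstring": ast.get_docstring(node),
                "source": lines[node.lineno - 1 : node.end_lineno],
            })
    return sorted(symbols, key=lambda s: s["span"])


sample = '''def load(path: str) -> list[str]:
    """Read lines from path."""
    with open(path) as fh:
        return fh.read().splitlines()
'''
info = extract_functions(sample)[0]
print(info["name"], info["span"], info["params"], info["returns"])
```

Keeping the docstring separate from the facts read off the signature and body is what allows the two to be compared side by side.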

No LLM. No network calls at runtime. Everything in the UI is derived from the AST.

Feedback

Run it on a Python repo you don't already know. Tell me whether it helped you understand it in less time than reading files manually would have — and where it fell short.

Status

v0.1 — comprehension layer over a Python AST parser. The CLI flags and /graph.json shape are not yet stable; expect changes between minor versions.
