Building cviz: A Spatial Map of Your Codebase

Nobody has a favourite editor for agentic coding yet. The tools are all cosmetically different and functionally similar. I wanted to know if there was a genuinely different way to look at a codebase — not a file tree, not a search box, but something spatial.

The result is cviz: a native Rust + wgpu application that turns any git repository into an interactive 2.5D map.

The core idea

Two files are close together if they change together. That’s the entire layout principle. You compute a co-change matrix from the git log — every pair of files that appear in the same commit gets a higher affinity score — and then run a force-directed simulation where high-affinity pairs attract and everything else repels.
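As a rough sketch of the scoring step, assuming each commit has already been reduced to the list of paths it touched (the real collector walks the graph with git2), the Jaccard variant named in the pipeline section could be computed like this. The function name and input shape are illustrative, not cviz's actual API.

```rust
use std::collections::{HashMap, HashSet};

/// Jaccard co-change score for every pair of files that share at least one commit:
/// (commits touching both) / (commits touching either).
/// `commits` holds, per commit, the file paths it touched (hypothetical input shape).
fn co_change_scores(commits: &[Vec<&str>]) -> HashMap<(String, String), f64> {
    // Number of commits touching each file.
    let mut touched: HashMap<&str, usize> = HashMap::new();
    // Number of commits touching both files of a pair (pair stored in sorted order).
    let mut together: HashMap<(&str, &str), usize> = HashMap::new();

    for commit in commits {
        // Deduplicate and sort so each pair is counted once per commit.
        let unique: HashSet<&str> = commit.iter().copied().collect();
        let mut files: Vec<&str> = unique.into_iter().collect();
        files.sort_unstable();
        for (i, &a) in files.iter().enumerate() {
            *touched.entry(a).or_insert(0) += 1;
            for &b in &files[i + 1..] {
                *together.entry((a, b)).or_insert(0) += 1;
            }
        }
    }

    together
        .into_iter()
        .map(|((a, b), both)| {
            let either = touched[&a] + touched[&b] - both;
            ((a.to_string(), b.to_string()), both as f64 / either as f64)
        })
        .collect()
}
```

Pairs that never share a commit are simply absent from the map, so in the simulation they would feel only the repulsive force.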

On top of that, TF-IDF similarity over file paths provides a secondary signal: identifiers in filenames and directories are treated as tokens, split on camelCase and snake_case boundaries. Files with similar names drift toward each other even if the co-change history is sparse.
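The tokenizer is the only unusual part of that step. A minimal sketch, assuming tokens are lowercased and that any non-alphanumeric character is a separator (the function name is hypothetical, and cviz's actual splitting rules may differ):

```rust
/// Split a file path into lowercase tokens on separators and camelCase boundaries.
fn tokenize_path(path: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;

    for ch in path.chars() {
        if !ch.is_alphanumeric() {
            // '/', '_', '.', '-' and similar all end the current token.
            if !current.is_empty() {
                tokens.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // A lowercase-or-digit to uppercase transition marks a camelCase boundary.
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            tokens.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower = ch.is_lowercase() || ch.is_numeric();
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}
```

With these rules, src/auth/userProfile_view.py becomes the tokens src, auth, user, profile, view, py. Runs of capitals are not split, so HTTPServer stays a single token; whether that matters depends on the naming conventions in the repo.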

The result is a layout where clusters emerge organically. The auth module ends up in one corner, the migrations nearby, the test files forming a halo around the code they cover. You didn’t tell it any of this. It’s just what falls out of the history.

The rendering pipeline

Five async stages run on a tokio background thread:

GitCollector — walks the commit graph with git2, builds a Jaccard co-change matrix
Embedder — TF-IDF with a camelCase/snake_case tokenizer
LayoutEngine — Barnes-Hut O(n log n) force-directed simulation, 500 iterations
Renderer — wgpu instanced circles, glow shaders, edge lines, convex hull directory backgrounds, an embedded 8×8 bitmap font for labels
SocketListener — Unix domain socket for real-time event ingestion

The Barnes-Hut tree is the reason this is usable on repos with hundreds of files. A naive force simulation evaluates every pair of nodes, which is O(n²) per iteration. Barnes-Hut approximates each sufficiently distant cluster as a single point mass at its centre of mass, bringing that down to O(n log n). For a 350-node repo, that’s the difference between a simulation that takes seconds and one that takes minutes.
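To make the approximation concrete, here is a compact quadtree sketch of the repulsion pass. It is not cviz's LayoutEngine: the 1/d force law, the unit mass per node, the `theta` opening parameter and all names are assumptions chosen for illustration.

```rust
/// A quadtree cell: a square region summarised by its body count and centre of mass.
struct Quad {
    cx: f64,
    cy: f64,
    half: f64, // square centre and half-width
    mass: f64,
    mx: f64,
    my: f64, // number of bodies and their centre of mass
    children: Option<Box<[Quad; 4]>>,
}

impl Quad {
    fn new(cx: f64, cy: f64, half: f64) -> Self {
        Quad { cx, cy, half, mass: 0.0, mx: 0.0, my: 0.0, children: None }
    }

    /// The child quadrant containing (x, y); only valid once the cell is subdivided.
    fn child_for(&mut self, x: f64, y: f64) -> &mut Quad {
        let i = (x >= self.cx) as usize + 2 * (y >= self.cy) as usize;
        &mut self.children.as_mut().unwrap()[i]
    }

    fn insert(&mut self, x: f64, y: f64) {
        // Occupied cell: push bodies down. Tiny cells stop splitting so that
        // coincident points cannot recurse forever.
        if self.mass > 0.0 && self.half > 1e-9 {
            if self.children.is_none() {
                let (cx, cy, h) = (self.cx, self.cy, self.half / 2.0);
                self.children = Some(Box::new([
                    Quad::new(cx - h, cy - h, h),
                    Quad::new(cx + h, cy - h, h),
                    Quad::new(cx - h, cy + h, h),
                    Quad::new(cx + h, cy + h, h),
                ]));
                // Move the single body this leaf held into its quadrant.
                let (ox, oy) = (self.mx, self.my);
                self.child_for(ox, oy).insert(ox, oy);
            }
            self.child_for(x, y).insert(x, y);
        }
        // Update this cell's running centre of mass.
        self.mx = (self.mx * self.mass + x) / (self.mass + 1.0);
        self.my = (self.my * self.mass + y) / (self.mass + 1.0);
        self.mass += 1.0;
    }

    /// Repulsive force on a body at (x, y), with magnitude strength * mass / distance.
    fn repulsion(&self, x: f64, y: f64, theta: f64, strength: f64) -> (f64, f64) {
        if self.mass == 0.0 {
            return (0.0, 0.0);
        }
        let (dx, dy) = (x - self.mx, y - self.my);
        let dist2 = dx * dx + dy * dy;
        // Opening criterion: cell width / distance < theta.
        let width = 2.0 * self.half;
        let far_enough = width * width < theta * theta * dist2;
        match &self.children {
            // Too close to summarise: recurse into the four quadrants.
            Some(kids) if !far_enough => kids.iter().fold((0.0, 0.0), |(fx, fy), k| {
                let (gx, gy) = k.repulsion(x, y, theta, strength);
                (fx + gx, fy + gy)
            }),
            // Leaf, or distant cell treated as one point mass.
            _ => {
                if dist2 < 1e-12 {
                    return (0.0, 0.0); // the body itself
                }
                let f = strength * self.mass / dist2;
                (f * dx, f * dy)
            }
        }
    }
}
```

Setting `theta` to 0 disables the approximation and reproduces the exact pairwise sum, which is a convenient way to check the tree against a brute-force loop. Values around 0.5 are a common starting point for trading accuracy against speed.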

Convex hulls and labels

Each directory gets a convex hull drawn around all its files. The hull is computed with a gift-wrapping algorithm on the final node positions. This gives you the directory structure without a file tree — you can see at a glance which cluster of nodes is src/auth/ and which is tests/.
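For reference, gift wrapping (also called the Jarvis march) fits in a few dozen lines. This is a sketch on plain coordinate pairs with a hypothetical function name, assuming at least two distinct points per directory; it is not lifted from cviz.

```rust
/// Convex hull by gift wrapping (Jarvis march), returned counter-clockwise.
/// Runs in O(n * h) for n points and h hull vertices.
fn convex_hull(points: &[(f64, f64)]) -> Vec<(f64, f64)> {
    if points.len() < 3 {
        return points.to_vec();
    }
    // Cross product of (b - a) and (c - a): positive when c lies left of a->b.
    let cross = |a: (f64, f64), b: (f64, f64), c: (f64, f64)| {
        (b.0 - a.0) * (c.1 - a.1) - (b.1 - a.1) * (c.0 - a.0)
    };
    let dist2 = |a: (f64, f64), b: (f64, f64)| (a.0 - b.0).powi(2) + (a.1 - b.1).powi(2);

    // The leftmost (then lowest) point is guaranteed to be on the hull.
    let start = (0..points.len())
        .min_by(|&i, &j| points[i].partial_cmp(&points[j]).unwrap())
        .unwrap();

    let mut hull = Vec::new();
    let mut current = start;
    loop {
        hull.push(points[current]);
        let mut next = (current + 1) % points.len();
        for i in 0..points.len() {
            let c = cross(points[current], points[next], points[i]);
            // Point i lies to the right of current->next, so wrap further round to it.
            // For collinear candidates, keep the farther one.
            let farther = dist2(points[current], points[i]) > dist2(points[current], points[next]);
            if c < 0.0 || (c == 0.0 && farther) {
                next = i;
            }
        }
        current = next;
        // Stop when the wrap closes; the length guard covers degenerate input.
        if current == start || hull.len() > points.len() {
            break;
        }
    }
    hull
}
```

Gift wrapping is quadratic in the worst case, but a directory rarely holds more than a few dozen files, so the simpler algorithm is a reasonable fit here.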

Labels are rendered with an 8×8 bitmap font embedded directly in the binary. I didn’t want a font-loading dependency for what is essentially debug text. The font atlas is baked into a wgpu texture at startup. Labels appear at a zoom-dependent threshold so they don’t clutter the view when zoomed out.
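An 8×8 bitmap glyph is just eight bytes, one per row, which is why embedding the font costs almost nothing. The sketch below expands one glyph into an 8×8 block of alpha values, the kind of data that could be copied into a texture atlas. The bit order (most significant bit is the leftmost pixel) and the pixel pattern for `GLYPH_A` are assumptions for illustration, not cviz's actual font data.

```rust
/// A hypothetical glyph for 'A': one byte per row, most significant bit = leftmost pixel.
const GLYPH_A: [u8; 8] = [0x18, 0x3C, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x00];

/// Expand a 1-bit-per-pixel glyph into 64 alpha bytes (row-major, 0 or 255).
fn glyph_to_alpha(glyph: &[u8; 8]) -> [u8; 64] {
    let mut pixels = [0u8; 64];
    for (row, bits) in glyph.iter().enumerate() {
        for col in 0..8 {
            // Test the bit for this column, starting from the leftmost pixel.
            if *bits & (0x80u8 >> col) != 0 {
                pixels[row * 8 + col] = 255;
            }
        }
    }
    pixels
}
```

A full atlas would repeat this for every glyph and place the blocks side by side in one single-channel texture.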

The agent hook

This is the part I’m most interested in. When Claude Code reads or edits a file, a PostToolUse hook sends a JSON event to a Unix socket at /tmp/cviz-{hash}.sock, where {hash} is derived from the working directory path. The SocketListener picks it up and forwards it to the Renderer, which gives that file’s node a cyan tint and draws a ring around it.

The practical effect: you watch the agent navigate your codebase in real time, spatially. It reads src/auth/views.py, then tests/auth/test_views.py, then src/auth/serializers.py. You see the path it’s taking. You notice it never looked at src/auth/permissions.py before touching something related. That’s signal.

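The hook is a short shell script:

#!/bin/bash
# Assumes CLAUDE_TOOL_NAME and CLAUDE_TOOL_INPUT_PATH are set in the hook's environment.
# `md5` is the macOS tool; on Linux, substitute `md5sum | cut -d' ' -f1`.
printf '{"tool":"%s","path":"%s"}\n' "$CLAUDE_TOOL_NAME" "$CLAUDE_TOOL_INPUT_PATH" \
  | nc -U "/tmp/cviz-$(echo "$PWD" | md5).sock" 2>/dev/null || true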
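On the receiving side, a blocking sketch using only the standard library might look like the following. The real SocketListener runs on tokio and presumably uses a proper JSON parser; the naive field extraction and both function names here are stand-ins to keep the example dependency-free.

```rust
use std::io::{BufRead, BufReader};
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Pull the string value of `key` out of a flat JSON object.
/// Deliberately naive: no escape handling, no whitespace tolerance.
/// Production code would use a JSON parser instead.
fn json_field<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let needle = format!("\"{}\":\"", key);
    let start = line.find(&needle)? + needle.len();
    let end = line[start..].find('"')? + start;
    Some(&line[start..end])
}

/// Accept connections on `socket_path` and call `on_event(tool, path)` per event line.
/// Runs until an I/O error occurs.
fn listen(socket_path: &Path, mut on_event: impl FnMut(&str, &str)) -> std::io::Result<()> {
    // Clear a stale socket file left by a previous run; ignore "not found".
    let _ = std::fs::remove_file(socket_path);
    let listener = UnixListener::bind(socket_path)?;
    for stream in listener.incoming() {
        // Each hook invocation opens a connection, writes one line, and closes.
        for line in BufReader::new(stream?).lines() {
            let line = line?;
            if let (Some(tool), Some(path)) =
                (json_field(&line, "tool"), json_field(&line, "path"))
            {
                on_event(tool, path);
            }
        }
    }
    Ok(())
}
```

Malformed lines are dropped silently, which matches the hook's own fire-and-forget behaviour: if cviz is not running, the `|| true` in the script swallows the failure.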

What I tested it on

Three repos with very different characteristics:

Repo	Files	Co-change pairs	Notes
ising-rs	352	1,511	Dense core, clear directory clusters
physics-llm-research	30	2	Embedding-driven clustering dominates
portfolio	237	21	Sparse co-change, directory structure visible

The ising-rs result was the most satisfying. The Rust source files cluster tightly in the centre — they change together constantly. The Python notebooks form their own island. The CUDA kernels sit between them, sharing history with both sides.

What’s next

The obvious improvement is real embeddings. TF-IDF on filenames is cheap and surprisingly good, but Ollama running a code model locally would give you semantic clustering based on actual content, not just naming conventions.

I also want a trail animation — when the agent moves between files, draw the path it took as a fading arc. That makes the navigation pattern readable at a glance rather than requiring you to watch in real time.

The longer-term question is whether this is useful beyond being interesting. My intuition is that spatial memory is genuinely helpful for large repos where you’ve lost the mental model of what lives where. The layout makes that model external and persistent. Whether that translates into faster debugging or better code review is an experiment I haven’t run yet.

Source on GitHub →