Rust Tool Sem Tracks Code Changes by Entities, Not Lines
Sem treats code as entities, not lines, and that could make Rust diffs, merges, and AI patches far less noisy. The real question is whether semantic version control can outgrow a niche and reshape Git workflows.

Why line diffs are starting to crack
Git line diffs were built for a world where the main reader was another human scanning a patch. Rohan Sharma argues that model breaks down when the audience is also an AI system, or when the change itself is buried under reformatting noise, broad refactors, and large files. In that setting, a line-by-line view can hide the real story: which function changed, which type moved, and which downstream code is actually at risk.

That is the opening bet behind sem, an entity-level version control CLI written in Rust. The idea is simple enough to explain and ambitious enough to matter: stop treating source as raw text, and start tracking named blocks such as functions, classes, methods, and types. For a community that already lives inside Git, the appeal is not that source control disappears, but that it becomes more semantic without asking developers to abandon familiar workflows.

How sem turns code into a graph
Sem uses tree-sitter to parse source into structured entities, then builds a dependency graph with petgraph so it can reason about how pieces of code relate to each other. That graph is the real engine of the tool: instead of asking only what lines moved, sem can ask which entity changed, which entity depends on it, and how far the effect spreads. Rayon handles parallel parsing, which matters once the tool is pointed at multi-language repositories and real-world scale.
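To make the graph idea concrete, here is a minimal sketch of the "which entity depends on it, and how far does the effect spread" question. It is not sem's code: it uses only the Rust standard library instead of petgraph, and `EntityGraph`, `add_dependency`, and `impact` are hypothetical names chosen for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical entity graph (not sem's real data model).
/// It stores reverse edges only: for each entity, who depends on it.
struct EntityGraph {
    dependents: HashMap<String, Vec<String>>,
}

impl EntityGraph {
    fn new() -> Self {
        EntityGraph { dependents: HashMap::new() }
    }

    /// Record that `user` depends on (calls or references) `used`.
    fn add_dependency(&mut self, user: &str, used: &str) {
        self.dependents
            .entry(used.to_string())
            .or_default()
            .push(user.to_string());
    }

    /// Breadth-first walk over reverse edges. Returns every entity that
    /// could be affected by a change to `changed`, paired with its
    /// distance in hops from the changed entity.
    fn impact(&self, changed: &str) -> Vec<(String, usize)> {
        let mut seen: HashSet<String> = HashSet::new();
        seen.insert(changed.to_string());
        let mut queue: VecDeque<(String, usize)> = VecDeque::new();
        queue.push_back((changed.to_string(), 0));
        let mut affected = Vec::new();

        while let Some((name, depth)) = queue.pop_front() {
            if let Some(users) = self.dependents.get(&name) {
                for user in users {
                    // `insert` returns false if we have already visited `user`.
                    if seen.insert(user.clone()) {
                        affected.push((user.clone(), depth + 1));
                        queue.push_back((user.clone(), depth + 1));
                    }
                }
            }
        }
        affected
    }
}
```

In this sketch, registering that `parse_config` and `main` both depend on `read_file` and then calling `impact("read_file")` returns both entities at distance 1. A real tool would derive the edges from parsed syntax trees, not from hand-written calls.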
Sharma’s design also leans on structural hashing and entity matching, two ideas that are central to making the system usable. If a function is reformatted, moved, or lightly edited, the tool needs to know whether it is still the same function in semantic terms. That is the difference between a patch that reads like churn and a patch that reads like intent.
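A rough way to picture structural hashing: hash what the code says, not how it is laid out. The sketch below is an assumption-heavy stand-in, not sem's algorithm. It uses a crude hand-rolled tokenizer where a real tool would hash a tree-sitter syntax tree, and `tokens` and `structural_hash` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Split source into crude tokens: runs of identifier characters, plus
/// single punctuation characters. Whitespace only separates tokens.
/// Caveat: comments and string literals get no special treatment here.
fn tokens(src: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut word = String::new();
    for ch in src.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
        } else {
            if !word.is_empty() {
                out.push(std::mem::take(&mut word));
            }
            if !ch.is_whitespace() {
                out.push(ch.to_string());
            }
        }
    }
    if !word.is_empty() {
        out.push(word);
    }
    out
}

/// Hash the token sequence, so two bodies that differ only in
/// whitespace and line breaks produce the same value.
fn structural_hash(src: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    tokens(src).hash(&mut hasher);
    hasher.finish()
}
```

Under this scheme a function that is re-indented or split across lines keeps its hash, while changing `w * h` to `w + h` changes it. Matching entities across commits could then compare names and hashes together, though how sem actually weighs those signals is not described in the source.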
The CLI surface stays recognizable enough to fit into normal Git habits:
- `sem diff` for semantic changes
- `sem entities` to inspect the code units being tracked
- `sem impact` to trace what a change could affect
- `sem blame` and `sem log` for history with structure intact
- `sem context` to pull the surrounding semantic picture
That list matters because it shows sem is not trying to be a toy parser. It is trying to be a working source-control layer for developers who want Git’s history with more meaning attached to every step.

What this changes in daily Rust work
The biggest immediate win is in code review. A Rust refactor can touch signatures, move impl blocks, and shift code across files without changing behavior, yet a line diff still makes reviewers mentally reconstruct the old structure from scattered edits. Sem’s entity model narrows the focus to the named units that actually changed, which makes a review more readable when the patch is large or mechanically generated.
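What "narrowing the focus to named units" might look like can be sketched as a comparison of two snapshots keyed by entity name. This is an illustration under stated assumptions, not sem's output format: `Entity`, `Change`, and `entity_diff` are hypothetical, and bodies are assumed to be normalized already so that formatting differences do not register.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot of one named code unit.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    file: String,
    body: String, // assumed to be formatting-normalized already
}

/// Hypothetical change categories for a reviewer-facing summary.
#[derive(Debug, PartialEq)]
enum Change {
    Added,
    Removed,
    Moved,    // same body, different file
    Modified, // body differs (the file may differ too)
}

/// Compare two snapshots keyed by entity name and report only the
/// entities whose status changed. Unchanged entities are omitted.
fn entity_diff(
    old: &BTreeMap<String, Entity>,
    new: &BTreeMap<String, Entity>,
) -> Vec<(String, Change)> {
    let mut changes = Vec::new();
    for (name, before) in old {
        match new.get(name) {
            None => changes.push((name.clone(), Change::Removed)),
            Some(after) if after.body != before.body => {
                changes.push((name.clone(), Change::Modified))
            }
            Some(after) if after.file != before.file => {
                changes.push((name.clone(), Change::Moved))
            }
            Some(_) => {} // identical: nothing for the reviewer to read
        }
    }
    for name in new.keys() {
        if !old.contains_key(name) {
            changes.push((name.clone(), Change::Added));
        }
    }
    changes
}
```

A method relocated to another file with an identical body shows up here as a single `Moved` entry, where a line diff would show one deleted block and one added block.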
This is where AI-assisted coding makes the argument sharper. If a model produces a broad patch, the question is no longer just whether the patch compiles, but whether the patch preserved the right entities and the right dependencies. A semantic diff gives humans and tools the same map, which could make generated code easier to audit instead of simply easier to apply.
For Rust developers, that also changes how refactors feel. Type-heavy codebases often need changes that are logically small but textually sprawling, especially when a trait, impl, or helper function shifts shape. A semantic view has a better chance of showing the actual design move instead of the incidental churn around it.

Weave pushes the same idea into merge conflicts
Sem is only half the story. Sharma also describes weave, a Git merge driver that uses the same entity model to resolve conflicts. The practical promise is easy to grasp: if two developers edit different methods inside the same class, the merge should not erupt into a conflict just because the lines happen to live near each other.
That is a real workflow improvement, and it gets more valuable as codebases and review surfaces grow. Git line merges are excellent at spotting overlapping text; they are much less intelligent about whether two edits collide semantically. Weave tries to reduce the spurious conflict tax that teams pay every time structure and text disagree.
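The two-developers-two-methods case can be sketched as a three-way merge keyed by entity name instead of by line. The code below is a simplified illustration, not weave's implementation: `Entities` and `merge_entities` are hypothetical, each entity body is treated as an opaque string, and any entity changed differently on both sides is simply reported as a conflict.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical representation: entity name -> entity body.
type Entities = BTreeMap<String, String>;

/// Three-way merge per entity. Returns the merged set, or the names of
/// entities that both sides changed in different ways.
fn merge_entities(
    base: &Entities,
    ours: &Entities,
    theirs: &Entities,
) -> Result<Entities, Vec<String>> {
    // Every entity name that appears in any of the three versions.
    let names: BTreeSet<&String> = base
        .keys()
        .chain(ours.keys())
        .chain(theirs.keys())
        .collect();

    let mut merged = Entities::new();
    let mut conflicts = Vec::new();

    for name in names {
        let (b, o, t) = (base.get(name), ours.get(name), theirs.get(name));
        let chosen = if o == t {
            o // both sides agree, including the case where both deleted it
        } else if o == b {
            t // only their side touched this entity
        } else if t == b {
            o // only our side touched this entity
        } else {
            conflicts.push(name.clone()); // both changed it, differently
            continue;
        };
        // `None` means the winning side deleted the entity.
        if let Some(body) = chosen {
            merged.insert(name.clone(), body.clone());
        }
    }

    if conflicts.is_empty() {
        Ok(merged)
    } else {
        Err(conflicts)
    }
}
```

If one branch edits `Cart::add` and the other edits `Cart::total`, this merge succeeds and keeps both edits, however close the two methods sit in the file. A conflict is raised only when both branches change the same entity in different ways.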
This is where the bigger tooling question comes into focus. If the merge layer can understand entities, then the diff layer can become more than a patch viewer, and the conflict layer can become more than a text-fighting arena. That is not a small feature request. It is a different model for how source control interprets code.

Why the graph experiment matters more than the buzz
The strongest evidence in Sharma’s broader work is not a slogan but a graph traversal experiment. He reports planting 141 bugs across 52 pull requests in 5 open-source repositories, then testing graph-based approaches on them. The BFS call-graph method found 95 percent of the planted bugs, including all 29 high-criticality bugs.
That result matters because it compares structure-aware analysis against the kind of text-first review flow Git users know best. In a separate framing of the same work, the graph traversal is also positioned against GPT-5.2, which found 56 percent. The message is not that AI is useless; it is that a zero-cost graph traversal can outperform a powerful model when the underlying question is about code relationships, not prose.
Sharma says the related graph work spans 21 languages, which widens the significance beyond a single Rust tool. Sem is written in Rust, but the design is aimed at multilingual repositories and mixed-language systems, the kind that dominate real production environments. That is exactly why Rayon, tree-sitter, and petgraph matter together: they turn a research idea into something that can plausibly run across large codebases without collapsing under its own weight.

The real test for semantic version control
Sem is interesting because it does not ask Git to disappear. It asks Git to become more aware of the units developers already think in when they read code: functions, methods, types, and dependencies. If that model keeps winning in merge conflict handling, refactor review, and machine-generated patch inspection, the line diff may start to look like a legacy interface rather than the default truth.
For Rust teams, the shift would not be cosmetic. It would change what a patch means, what a merge conflict means, and what a review is actually looking at. That is the kind of tooling change that can move from niche experiment to everyday habit when the ecosystem decides that text alone is no longer enough.

