Knowledge Entropy in Software Engineering

Knowledge entropy causes software teams to lose context about design decisions as codebases evolve and engineers leave. Learn why documentation fails and how continuous knowledge capture reverses decay.

May 20, 2026

The Doc Holiday Team

If you want to understand how software teams actually work, don't look at their architecture diagrams. Look at their Slack search history.

Somewhere in your company's archives, there is a two-year-old thread where a senior engineer—who left 18 months ago—explains exactly why a specific database query was written in a way that looks completely irrational. A junior engineer will spend three hours reverse-engineering that query today, fail to understand it, and rewrite it. Next week, the system will crash under load, and someone will eventually rediscover that old Slack thread.

This is knowledge entropy. It is the organizational equivalent of the second law of thermodynamics: systems naturally drift toward disorder. In software, that means undocumented decisions, lost context about why code was written a certain way, and tribal knowledge that walks out the door when engineers leave.

[Figure: a junior engineer at a desk, with a ghostly senior engineer in a Slack bubble above. Caption: The original author left, but their design decisions stayed cryptic.]

We all know this happens. But we tend to treat it as a moral failing, a lack of discipline among engineers who just won't write things down. That framing is wrong, and it leads to solutions that don't work.

The Physics of Forgetting

Knowledge entropy in software is structural, not personal.

David Parnas described this problem in 1994, calling it software aging: programs, like people, get old. The first type of aging happens when software isn't updated to meet changing needs. The second, more insidious type happens as a direct result of the changes that are made. When developers modify code without fully understanding the original design concept, the structure degrades. After enough of those changes, nobody understands the product—not the original designers, and certainly not the people who made the changes. The documentation, meanwhile, becomes increasingly inaccurate, making future changes even harder.

This is the core mechanism of knowledge entropy. The code evolves. The context doesn't travel with it.

There are several structural reasons why this happens, and none of them are about individual discipline. Codebases evolve faster than documentation can keep up. Engineers optimize for shipping features, not recording rationale—and this is entirely rational behavior given how most teams measure productivity. The documentation that does exist lives in different systems than the code itself: wikis, Confluence, Notion, Slack threads. High-growth teams onboard people faster than they can transfer context. Refactors and technical debt paydowns erase the original problem statements that justified the code.

Research on software change entropy confirms that files touched by a higher number of developers tend to exhibit higher disorganization over time, and that changes introducing new features produce significantly more structural disorder than bug fixes. The more people work on a system, the faster the context fragments.
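To make "the context fragments" a little more concrete, here is a minimal sketch that uses Shannon entropy to measure how evenly a file's commits are spread across authors. It is an illustration only, not the metric used in the research mentioned above; the function name and the sample histories are hypothetical.

```python
import math
from collections import Counter


def author_entropy(commit_authors):
    """Shannon entropy (in bits) of how a file's commits are spread across authors."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over authors, where p is each author's share of commits.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical commit histories for two files (one author name per commit).
single_owner = ["ana"] * 8
shared = ["ana", "ben", "cho", "dev"] * 2

print(author_entropy(single_owner))  # 0.0 (one author, no spread)
print(author_entropy(shared))        # 2.0 (four authors contributing equally)
```

A score near zero means one person holds most of the history for that file. A higher score means the history is spread across more people, which is where the rationale is most likely to be scattered too.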

[Figure: code evolution rising sharply while documentation flatlines, with the gap between them widening. Caption: The codebase doesn't wait for documentation to catch up.]

The Symptoms of Decay

You can see knowledge entropy happening at different scales across an organization.

Junior engineers spend hours reverse-engineering decisions that were obvious to the person who wrote the code. Studies show that developers spend between 58% and 70% of their time trying to comprehend code, and only about 5% of their time actually editing it. That ratio gets worse as a codebase ages and its original authors leave.

Support teams file duplicate bugs because no one knows a workaround was documented two years ago in a Slack thread. Product managers can't answer "why did we build it this way?" without archaeologically reconstructing old Jira tickets. Security and compliance audits turn into week-long evidence-gathering exercises because no one can locate the original design documents.

The problem compounds. Research by Martin Robillard at McGill found that when contributors leave a software project, the knowledge they hold may become entirely inaccessible to the remaining team, impacting both code quality and team productivity. In interviews with 27 professional developers and managers across three companies, Robillard documented how knowledge loss creates a cascade: developers have to reverse-engineer what was previously known, slowing delivery and introducing new errors. One developer described it directly: "We have to reverse engineer and sometimes we have to look for knowledge. We have to find something, which was probably written somewhere before, and the biggest impact is how fast we can deliver the solution."

Knowledge entropy also isn't linear. It accelerates. The more fragmented your documentation, the harder it is to know where to add new documentation, so people stop trying. The more engineers leave, the more undocumented context disappears, which makes the next wave of turnover even more damaging. A 2009 study measuring knowledge loss through software archaeology found that orphaned code—code authored by developers no longer on the project—accumulates steadily across all types of software projects, from volunteer-driven open source to company-supported products.
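As a rough way to see orphaned code in your own repository, the sketch below computes the share of lines whose last author is no longer on the team. It assumes you already have a per-author line count (for example, summarized from blame output) and a list of current team members. The function name and the sample numbers are hypothetical, and this is not the method used in the 2009 study.

```python
def orphaned_fraction(lines_by_author, active_authors):
    """Share of lines whose last author is no longer on the project."""
    total = sum(lines_by_author.values())
    if total == 0:
        return 0.0
    orphaned = sum(
        count
        for author, count in lines_by_author.items()
        if author not in active_authors
    )
    return orphaned / total


# Hypothetical blame summary: author -> number of lines they last touched.
blame = {"ana": 1200, "ben": 300, "cho": 500}
active = {"ben", "cho"}  # "ana" has left the team

print(f"{orphaned_fraction(blame, active):.0%}")  # 60%
```

Tracked over time, a number like this gives a crude signal of how much of the codebase no longer has an author to ask.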

Organizational Scale	Observable Symptom	Root Cause
Individual engineer	Hours spent reverse-engineering old code	No documented rationale for design decisions
Support team	Duplicate bugs filed; known workarounds lost	Knowledge siloed in Slack threads and departed employees
Product team	Can't explain why features were built a certain way	Context erased by refactors and buried in old tickets
Security / compliance	Week-long evidence-gathering for audits	Original design documents not linked to current code

Why the Usual Fixes Fail

The traditional response to knowledge entropy is to tell people to write better docs.

This ignores the fact that documentation is a lagging indicator. By the time you realize you need it, the person who could have written it is gone.

Retrospective documentation projects are expensive and often produce documents no one reads because they're not tied to active workflows. A systematic mapping study on documentation in continuous software development, covering 63 studies published between 2001 and 2019, found that the most persistent challenges include documentation being considered "waste," productivity being measured only by working software, and documentation falling out of sync with the software itself. These aren't new problems. They're structural features of how most engineering teams operate.

Wiki-based knowledge management systems become digital landfills. No one knows what's current, what's deprecated, or what's aspirational. The docs-as-code movement—which promised to solve this by treating documentation like code—has run into the same structural problem: the philosophy gets adopted as a set of tools, but the processes and integration are ignored. Even when processes are defined, the tools make it easy to circumvent them, often unintentionally.

The deeper issue is that documentation has never been treated as a first-class engineering artifact. Durst and Wilhelm observed that management appears "too busy with the day-to-day running of the business" to prioritize knowledge management as a strategic concern. That's not negligence. It's a rational response to incentive structures that reward shipping features, not capturing context.

And the problem is getting worse. Research from DX found that AI code generation tools are accelerating the rate of code rot by multiplying the surface area of change. More code, more decisions, more drift per sprint. Teams are building faster than ever, but also introducing complexity faster than they can manage it.

The Byproduct Approach

There is a different way to think about this.

Continuous knowledge capture means documentation is generated as a byproduct of the work itself, not as a separate act of discipline. It means tying documentation directly to the engineering workflow (commits, PRs, releases, tickets) and generating structured output from the work engineers are already doing. A CI/CD pipeline for documentation delivery applies the same idea to knowledge: automation keeps documentation in sync with code changes, so keeping it current is no longer a manual reconciliation task.
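As a small illustration of documentation as a byproduct, the sketch below turns commit subject lines into a plain-text changelog. It assumes the team writes subjects in a "type(scope): summary" style, which is an assumption and not something every team does. The function name, the section titles, and the sample commits are hypothetical, and this is not a description of how any particular product works.

```python
import re
from collections import defaultdict

# Assumed convention: "type(scope): summary", e.g. "fix(api): handle empty payload".
COMMIT_PATTERN = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<summary>.+)$"
)

# Only these commit types are surfaced in the changelog (an illustrative choice).
SECTION_TITLES = {"feat": "Features", "fix": "Bug Fixes", "docs": "Documentation"}


def build_changelog(version, commit_subjects):
    """Group commit subjects by type and render a plain-text changelog."""
    sections = defaultdict(list)
    for subject in commit_subjects:
        match = COMMIT_PATTERN.match(subject)
        if not match:
            continue  # skip subjects that don't follow the assumed convention
        title = SECTION_TITLES.get(match["type"])
        if title is None:
            continue  # skip types we chose not to surface, such as "chore"
        scope = match["scope"]
        entry = f"{scope}: {match['summary']}" if scope else match["summary"]
        sections[title].append(entry)

    lines = [f"Release {version}"]
    for title in SECTION_TITLES.values():
        if sections[title]:
            lines.append("")
            lines.append(title)
            lines.extend(f"- {entry}" for entry in sections[title])
    return "\n".join(lines)


# Hypothetical commit subjects for one release.
commits = [
    "feat(search): add fuzzy matching",
    "fix: retry on connection reset",
    "chore: bump dependencies",
    "Merge branch 'main'",
]

print(build_changelog("1.4.0", commits))
```

In a real pipeline the subjects could come from something like `git log --pretty=format:%s` between two release tags. The output is only as good as the commit messages, which is one reason a human reviewer still matters.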

Google's research on developer onboarding found that the top three hindrances to ramping up are learning a new technology, poor or missing documentation, and finding expertise. Two of those three are directly addressable by continuous knowledge capture. When documentation is generated from engineering activity rather than written retrospectively, new engineers have access to accurate context from day one, not a wiki full of aspirational architecture diagrams that haven't been updated since the last reorg.

Knowledge entropy is reversed when documentation becomes a continuous output, not a periodic project. Teams that generate release notes, API references, and changelogs directly from their engineering workflows maintain the connection between code and context. The documentation isn't aspirational. It is a reflection of what actually shipped.

This is a workflow problem. The discipline framing has never worked because it asks engineers to do extra work with no feedback loop and no enforcement mechanism. The byproduct framing works because it doesn't ask engineers to do anything extra—it captures the context that already exists in the commits, PRs, and releases they're producing anyway.

Doc Holiday generates documentation directly from engineering activity—release notes, changelogs, API references—so the connection between code and context never breaks. It works best when a skilled technical writer is validating and managing the output, not trying to recreate it from scratch. The system does the generative work; the human ensures it's accurate, complete, and useful. That's how you fight entropy at scale.