Context debt: the hidden risk in government AI


Date: 17 June 2026
Author: Kristen Pol
Kristen is Project Lead for AI Context & Program Manager for the Drupal AI Initiative.

Context debt

Government agencies are deploying AI faster than they’re managing the information those systems depend on. That gap has a name.

It’s not only model risk. It’s not only data risk. It’s context debt.

Content debt is the familiar problem of published content that has lost its structure, ownership and discipline. Context debt is its AI counterpart, sitting in the operational layer between that content and the AI system reading it: the prompts, system instructions, retrieval rules, scoping logic, brand and tone guidance, and example libraries that shape what the AI actually says. When that operational layer is scattered, stale, duplicated or inconsistent across prompts, tools and teams, AI output suffers regardless of how good the underlying content is.

For government agencies, context debt can quietly undermine trust, accuracy, compliance and service quality, often before anyone notices.

AI is only as good as the context it receives

Large language models don’t operate in a vacuum. Every useful AI interaction depends on context.

Context can include:

The user's question and previous interactions
Relevant source content and policy documents
Service rules, workflows and eligibility criteria
Brand, tone, accessibility, privacy and security requirements
Structured metadata, permissions and access rules

When context is accurate, relevant, current and well-scoped, AI systems are more likely to produce useful results. When context is poor, AI systems can sound confident while giving the wrong answer.

Consider a straightforward scenario. A citizen asks a government chatbot whether they qualify for a support payment. The website's eligibility page is current. But the chatbot was configured months earlier to read from a cached snapshot of that page, and no one set up a refresh cycle. The bot confidently returns an answer drawn from the older rules. The citizen doesn’t apply. They miss out on support they were entitled to.

The content was fine. The context was broken.

This is especially risky in government settings, where people may be relying on digital services to understand eligibility, access support, complete forms, comply with rules or make important decisions. Note: Rules as Code is another AI guardrail for areas like eligibility and compliance. It uses deterministic logic rather than language model inference. In the example above, the AI could query a Rules as Code engine such as the OpenFisca API for the eligibility decision instead of inferring it from content, which further protects the agency against AI hallucinations.
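As a minimal sketch of the Rules as Code idea, the eligibility decision below is computed by deterministic logic, and an assistant would only phrase the result rather than infer it. The payment, its thresholds and every name here are hypothetical and invented for illustration. This is plain Python, not the OpenFisca API, and not any real payment's criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    age: int
    annual_income: float
    is_resident: bool


# Hypothetical thresholds for an imaginary support payment (assumptions).
MIN_AGE = 18
INCOME_LIMIT = 30_000.0


def is_eligible(applicant: Applicant) -> tuple[bool, list[str]]:
    """Return the decision and the reasons for any refusal, with no model inference."""
    reasons = []
    if applicant.age < MIN_AGE:
        reasons.append(f"age is below {MIN_AGE}")
    if applicant.annual_income > INCOME_LIMIT:
        reasons.append(f"annual income is above {INCOME_LIMIT:.0f}")
    if not applicant.is_resident:
        reasons.append("applicant is not a resident")
    # Eligible only when no rule produced a refusal reason.
    return (not reasons, reasons)


decision, reasons = is_eligible(
    Applicant(age=42, annual_income=25_000.0, is_resident=True)
)
print(decision, reasons)
```

Because the same inputs always produce the same decision and the same list of reasons, the answer can be tested and explained in a way that a generated sentence cannot.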

What context debt looks like

Context debt rarely appears all at once. It builds up gradually.

One team copies brand guidance into a prompt. Another uploads policy documents into a separate tool. A vendor configures a chatbot with its own instructions. Internal teams experiment with agents, search and content generation using different sources and rules. Meanwhile, the official service information keeps changing.

Each individual decision may make sense in isolation. But over time, the organisation ends up with multiple versions of "the truth" spread across prompts, documents, systems, tools and teams.

These symptoms sit specifically in the operational layer. Content quality problems sit alongside them and amplify them, but context debt is the layer where even good content can be misused.

Common symptoms include:

Different assistants giving different answers to the same question because they were configured with different prompts or scoped to different source sets
AI tools serving outdated copies of service information because the operational pipeline that feeds them has no refresh cycle
Prompt instructions scattered across code, documents, spreadsheets or private tools, with no version history or owner
No clear approval workflow or audit trail for the instructions an AI receives
Nothing to ensure context is scoped consistently or reused across AI experiences
The same source content being interpreted differently across tools because each was configured with different retrieval and ranking rules

This is context debt. Like technical debt or content debt, it lets the organisation keep moving for a while, but the hidden cost grows. Eventually it becomes harder to maintain quality, explain decisions, improve systems or scale safely.
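Some of these symptoms can be surfaced with very little machinery. The sketch below flags an asset that has no owner, has not been refreshed recently, or no longer matches its source. It is illustrative only: the `ContextAsset` fields, the 30-day review window and the sample text are assumptions, not features of any particular tool.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


def fingerprint(text: str) -> str:
    """Stable hash of a piece of text, used to detect drift between copies."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class ContextAsset:
    name: str
    owner: Optional[str]      # None means nobody is accountable for it
    snapshot: str             # the copy the AI tool actually reads
    refreshed_at: datetime    # when that copy was last taken


MAX_AGE = timedelta(days=30)  # hypothetical review window


def find_debt(asset: ContextAsset, current_source: str, now: datetime) -> list[str]:
    """List the context-debt symptoms this asset shows."""
    problems = []
    if asset.owner is None:
        problems.append("no owner")
    if now - asset.refreshed_at > MAX_AGE:
        problems.append("not refreshed within the review window")
    if fingerprint(asset.snapshot) != fingerprint(current_source):
        problems.append("snapshot differs from the current source")
    return problems


now = datetime(2026, 6, 17, tzinfo=timezone.utc)
asset = ContextAsset(
    name="support-payment-eligibility",
    owner=None,
    snapshot="Income limit: 28,000",
    refreshed_at=now - timedelta(days=120),
)
for problem in find_debt(asset, "Income limit: 30,000", now):
    print(f"{asset.name}: {problem}")
```

A check like this only reports the debt. Deciding who owns the asset and how often it should be refreshed remains a governance decision.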

Why this matters more in government

All organisations need trustworthy AI. But government agencies face a higher standard.

Public sector AI must account for public trust, transparency and accountability. It must serve people with diverse needs and ensure accessible, equitable outcomes. It must comply with privacy, security and policy requirements. It must meet records, audit and governance obligations. And it must navigate complex eligibility rules, service pathways and user circumstances.

A poor AI response in a commercial setting may create frustration. A poor AI response in a government setting may prevent someone from accessing a service, misrepresent a policy, create inequitable outcomes or damage public confidence.

This is why government AI can’t rely on clever prompting alone. It needs governed context and, in some cases, other approaches such as Rules as Code.

More retrieval capacity does not solve the problem

As AI models improve, it’s tempting to assume that larger context windows or better retrieval will solve the problem. If a model can accept more information, or retrieve it more intelligently, why not simply give it more?

But more context isn’t the same as better context.

There’s also a technical reason to be sceptical of the "more is better" instinct, and it sits beneath the governance argument rather than alongside it. Recent research into how language models actually use the context they receive shows that more context isn’t free, and performance doesn’t scale with window size.

Models operate within a finite "attention budget"; their ability to recall, reason and follow instructions degrades as the window fills, regardless of how large that window is advertised to be. Information placed in the middle of a long context is recalled less reliably than information at the start or end, a finding the literature now calls "lost in the middle". Multi-turn interactions make this worse, because once a model anchors to an early wrong assumption it tends not to recover.

The effective context window is consistently smaller than the nominal one. Independent evaluations across frontier models from Anthropic, OpenAI and Google have found similar declines, and Google's own assessment of Gemini 3 Pro reported retrieval accuracy falling from 77% at 128,000 tokens to 26.3% at one million.

The practical implication for government is direct. An assistant drawing simultaneously on policy documents, eligibility rules, brand guidance, accessibility standards and a citizen's prior turns can become less reliable as more material is loaded into the context. Curated, scoped, governed context is not only a governance preference; it’s also what the underlying technology rewards.

The same caution applies to retrieval.

Retrieval-augmented generation (RAG) is a valuable part of many AI architectures. It allows systems to find relevant content before generating an answer. Larger context windows make it easier to include more documents, policies and instructions.

Both are useful. But they solve an engineering problem, not a governance problem. The questions that matter for government AI aren’t about retrieval capacity.

Agencies need to address:

Is this the right context for this task, audience and service?
Is it current, approved and owned by someone?
Is it permission-aware and free from conflicting guidance?
Can we later audit what context was used?

No retrieval system answers those questions. And larger context windows can make the problem worse, not better, by giving AI systems more confident access to more unmanaged information.

For government use cases, retrieval and capacity are useful tools. But they need to sit within a broader context governance strategy, not replace one.
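One way to picture retrieval sitting inside a governance strategy is a gate that every retrieved candidate must pass before it reaches the model. The sketch below is a simplified assumption of how such a gate might look: the `ContextItem` fields, status values, role names and item identifiers are all hypothetical, and conflict detection between items is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    item_id: str
    service: str              # the service this guidance was written for
    status: str               # "draft", "approved" or "retired"
    owner: str                # empty string means nobody is accountable
    allowed_roles: frozenset  # roles permitted to receive this item


def admit(item: ContextItem, service: str, role: str) -> bool:
    """Apply the governance questions to one retrieved candidate."""
    right_for_task = item.service == service
    approved_and_owned = item.status == "approved" and bool(item.owner)
    permitted = role in item.allowed_roles
    return right_for_task and approved_and_owned and permitted


def govern(candidates, service: str, role: str):
    """Split candidates into admitted IDs and a reject list kept for audit."""
    admitted, rejected = [], []
    for item in candidates:
        target = admitted if admit(item, service, role) else rejected
        target.append(item.item_id)
    return admitted, rejected


candidates = [
    ContextItem("rent-assist-rules-v7", "rent-assistance", "approved",
                "policy-team", frozenset({"public"})),
    ContextItem("rent-assist-rules-v8", "rent-assistance", "draft",
                "policy-team", frozenset({"public"})),
    ContextItem("caseworker-notes", "rent-assistance", "approved",
                "service-team", frozenset({"staff"})),
    ContextItem("parking-permit-faq", "parking-permits", "approved",
                "", frozenset({"public"})),
]
admitted, rejected = govern(candidates, service="rent-assistance", role="public")
print("admitted:", admitted)
print("rejected:", rejected)
```

The retrieval step can be as sophisticated as the agency likes. The gate is what stops a draft, a staff-only note or an unowned item from shaping a public answer.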

Different problem, familiar disciplines

Context isn’t content, but it needs many of the same disciplines that good content management already provides. Government websites already manage complex content ecosystems with workflows, permissions, revision histories, publishing controls, structured types, metadata, taxonomy, accessibility standards and governance processes. In the same way a CMS enforces that only approved content reaches a published page, context governance ensures only approved instructions, prompts and retrieval rules shape an AI-generated response.

If an AI assistant uses a piece of guidance to answer a citizen's question, that guidance shouldn’t sit in a prompt that only one developer or vendor can see. It needs an owner, a lifecycle, a review process, a reuse model, clear scoping and an audit trail. These are the same requirements an agency already applies to its published content. What’s new is applying them to the operational layer that sits between the content and the model.

This is where content management and AI governance overlap, but they aren’t the same job. Managing content well is necessary but not sufficient. The context layer is where unmanaged prompts, scattered retrieval rules and undocumented system instructions can undermine even a well-governed content estate.

From prompt libraries to context infrastructure

Many organisations begin with prompt libraries. That’s a natural early step. But as AI use matures, agencies need to move from prompt libraries to context infrastructure.

Maturity level	What it looks like
Ad-hoc prompting	Teams paste instructions into individual prompts or tools
Prompt libraries	Instructions become reusable; governance is still limited
Managed context	Context has ownership, workflow, scope, permissions and reuse
Context infrastructure	Context is programmatically selected, governed, audited and reused across agents. It’s not just managed by humans but embedded in how systems operate

Retrieval techniques such as RAG and semantic search can appear at any of these stages. They’re technical capabilities rather than maturity steps. Governed retrieval over governed context is far more useful than sophisticated retrieval over an unmanaged source set. Unmanaged retrieval over an unmanaged source set tends to amplify the symptoms described earlier rather than mitigate them.

The distinction between the last two levels matters. Managed context means your team can edit, approve and track context assets. Context infrastructure means those assets flow reliably to the right AI systems, at the right time, without requiring manual coordination for every deployment.

This is especially important as organisations move toward agentic AI. A single chatbot may need a set of instructions. A network of agents needs a governed context system. A content generation agent, a service navigation assistant, a policy explainer and an internal support bot shouldn’t all receive the same information in the same way. They need context that is relevant to their purpose, constrained by permissions, aligned with policy and appropriate for the user interaction.
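A rough sketch of what "programmatically selected" could mean in practice: each context asset declares the agent purposes, jurisdiction and language it applies to, and each agent asks a shared registry for only what matches. The registry structure, purpose names and jurisdiction labels below are hypothetical and are not drawn from any specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedContext:
    context_id: str
    purposes: frozenset   # which kinds of agent may receive this asset
    jurisdiction: str     # "*" means it applies everywhere
    language: str


REGISTRY = [
    ScopedContext(
        "plain-language-tone",
        frozenset({"content-generation", "policy-explainer", "service-navigation"}),
        "*", "en",
    ),
    ScopedContext(
        "eligibility-summary-state-a",
        frozenset({"service-navigation", "policy-explainer"}),
        "state-a", "en",
    ),
    ScopedContext(
        "eligibility-summary-state-b",
        frozenset({"service-navigation", "policy-explainer"}),
        "state-b", "en",
    ),
    ScopedContext("escalation-procedure", frozenset({"internal-support"}), "*", "en"),
]


def select_context(purpose: str, jurisdiction: str, language: str, registry=REGISTRY):
    """Return the IDs of assets scoped to this agent, place and language."""
    return [
        c.context_id
        for c in registry
        if purpose in c.purposes
        and c.jurisdiction in ("*", jurisdiction)
        and c.language == language
    ]


print(select_context("service-navigation", "state-a", "en"))
print(select_context("internal-support", "state-a", "en"))
```

In this sketch the service navigation assistant receives the tone guidance and the state-a eligibility summary, while the internal support bot receives only the escalation procedure. Neither agent has to be configured by hand with its own copy.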

Questions agencies should ask before scaling AI

The right questions at this stage aren’t about which model to choose or how quickly to deploy. They’re about context.

1. What context does each AI experience need? Not every AI tool needs access to everything. Context should be selected based on the task, audience, service and risk level.

2. Who owns the context? If context affects AI outputs, someone needs to be accountable for keeping it accurate, current and appropriate.

3. How is context reviewed and approved? AI instructions and source materials shouldn’t bypass normal governance simply because they sit behind the scenes.

4. How is context scoped? Government content is rarely universal. Context may need to vary by service, jurisdiction, language, content type, user role or channel.

5. How do we know what context was used? Auditability matters. Agencies need ways to understand what information influenced an AI-generated response.

6. How do we reduce duplication? If every team creates its own prompts and uploads its own documents, inconsistency is inevitable. Shared context infrastructure reduces that risk.

7. How will context improve over time? Context management should support learning, testing, feedback and iteration, not just initial setup.

If an agency can’t answer most of these questions, context debt is already building.
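Question 5 is often the hardest to answer after the fact, so it helps to record the answer at the moment a response is generated. The following is a minimal sketch under stated assumptions: the record layout, the identifiers and the use of integer version numbers are hypothetical. The digest only makes later changes to the record noticeable; it is not a security control on its own.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    """Hash the canonical JSON form of a record body."""
    canonical = json.dumps(body, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(response_id: str, context_versions: dict, timestamp: datetime) -> dict:
    """Record which context assets, at which versions, shaped one response."""
    body = {
        "response_id": response_id,
        # Sort so the record does not depend on the order assets were loaded.
        "context": sorted(context_versions.items()),
        "recorded_at": timestamp.isoformat(),
    }
    body["digest"] = _digest(body)
    return body


def verify(record: dict) -> bool:
    """Check that the record still matches the digest stored with it."""
    body = {key: value for key, value in record.items() if key != "digest"}
    return _digest(body) == record["digest"]


record = audit_record(
    "resp-0001",
    {"plain-language-tone": 3, "eligibility-summary": 7},
    datetime(2026, 6, 17, 9, 30, tzinfo=timezone.utc),
)
print(record["context"])
print(verify(record))
```

Storing a version number for each asset assumes the context itself is versioned, which ties this question back to ownership and review.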

Context Control Center and Drupal

These questions define the problem space behind Salsa's work on Context Control Center for Drupal.

Context Control Center is designed to help organisations manage AI context inside Drupal, rather than scattering it across prompts, code, documents and disconnected tools. For agencies already using Drupal to manage complex content ecosystems, this approach is a natural extension of governance patterns they already understand and rely on.

The goal isn’t simply to give AI "more content." The goal is to help Drupal-powered organisations provide AI systems with the right context, at the right time, for the right task.

Capabilities include:

Managing context as structured Drupal content with ownership and lifecycle
Scoping context by service, content type, language, audience or workflow
Connecting governed context to agents and AI workflows
Improving consistency across AI outputs and agentic experiences
Building toward context infrastructure rather than prompt sprawl

Rather than treating AI as a separate layer outside the content ecosystem, this approach brings AI closer to the systems, workflows and governance models agencies already rely on.

Context debt is a governance challenge

The next phase of government AI won’t be won by adopting the newest model or launching the fastest proof of concept.

It will depend on whether agencies can build trustworthy systems around AI. Context debt is easy to ignore during experimentation because the risks are often hidden. But as AI becomes embedded in public services, unmanaged context becomes a serious operational and governance issue.

Government agencies need AI systems that are accurate, transparent, accessible, secure and aligned with policy.

That starts with context.

Not more context. Not random context. Not context copied into prompts and forgotten.

Managed context.

Because in government AI, trustworthy content is necessary but not enough. The model reads what the operational layer gives it, and that layer needs governance too.

Trustworthy AI starts with trustworthy context.