Voice-fidelity preservation
A pre-submission review screen that shows the citizen, in plain language, exactly what their agent changed, including any change to the substance of what they said, before anything is sent.
When an AI agent rewrites a citizen's complaint, submission or application, it smooths, formalizes and homogenizes.
Research demonstrates this is not neutral: LLMs privilege majority statistical regularities at the expense of minoritized forms through a process described as "dimensional collapse," an attractive force toward dominant modes within the latent space. A 2026 CHI paper, "When AI Writes, Whose Voice Remains?", quantified cultural marker erasure across World English varieties.
LLM rewriting reduces lexical diversity, strips dialectal markers, and imposes a formal register that may change the substance of what was said. Minoritized linguistic features (such as AAVE syntax) are often flagged as requiring "correction."
Government needs confidence that a submission reaching an agency still represents what the citizen actually said, not a version the agent reshaped on the way. Meeting that requires the agent to carry the citizen's original input alongside its structured version and to flag when a rewrite has changed the substantive position, so neither the citizen nor the agency loses sight of the original voice.
Voice-fidelity preservation requires the citizen to review agent changes, itself a literacy-dependent task. A diff display assumes reading competence and an audio summary assumes hearing, so review must be offered across modalities (text, audio, plain language, interpreter) rather than a single channel.
Your letter, with the assistant’s changes
4 of 4 changes applied
Dear City Housing Office, [register change] My grandmother and I have been waiting six months for the housing transfer. [voice / cultural marker change] This delay is unacceptable. She [register change] is no longer able to manage the stairs, and the office [substance change] has repeatedly failed to respond substantively.
Each bracketed label marks where the assistant's wording replaced your words. Nothing was changed silently.
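A review screen like this one depends on knowing exactly which spans differ between what the citizen wrote and what the agent produced. The following is a minimal sketch of that step, assuming a simple word-level comparison with Python's standard `difflib`. The function name `changed_spans` and the sample original sentence are invented for illustration. A diff of this kind only finds where the wording differs; deciding whether a difference is a register, voice, or substance change is a separate judgment that it does not attempt.

```python
from difflib import SequenceMatcher


def changed_spans(original: str, rewritten: str) -> list[tuple[str, str]]:
    """Return (original_words, rewritten_words) for every span that differs.

    An empty string on either side means words were inserted or deleted.
    """
    a, b = original.split(), rewritten.split()
    # autojunk=False so that frequent words are never skipped by the heuristic.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    spans = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            spans.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return spans


if __name__ == "__main__":
    # The original wording here is hypothetical; only the rewritten
    # phrase is taken from the example letter.
    original = "the office never answers us"
    rewritten = "the office has repeatedly failed to respond substantively"
    for before, after in changed_spans(original, rewritten):
        print(f"you wrote: {before!r} / assistant wrote: {after!r}")
```

Because every non-equal span is returned, a screen built on this list cannot omit a change, which is what "nothing was changed silently" requires at a minimum.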
The research documenting the problem is recent (2025-2026), and the CHI 2026 paper quantifying cultural-marker erasure provides its empirical foundation. No known system implements voice-fidelity preservation as a design requirement, and the interaction design for surfacing and flagging substantive rewrites to the citizen has no established precedent.
LLM sycophancy research. The documented tendency of LLMs toward sycophancy (tailoring responses to what they predict the user wants to hear rather than what is accurate) compounds the voice-fidelity problem. An agent that rewrites a complaint to sound more "professional" may also moderate its force, remove emotional content that conveys urgency, or reframe a demand as a request. Research on GPT's ability to predict user behavior found "critical failures to reflect user patterns, with significantly different distributions from real data in 53% of tasks."
Gender bias in AI-generated professional documents. Research has found that AI-generated reference letters exhibit gender biases in language professionalism, excellency and agency: male candidates are described with more "professional" language. This demonstrates that AI "improvement" of text is not ideologically neutral; it encodes existing hierarchies.
The relevance to government services is direct, though few working examples exist yet. The voice-fidelity problem appears when a citizen's complaint about housing conditions is rewritten into bureaucratic language that strips the lived urgency, or when an appeal is "improved" into a form that changes its legal character. The citizen may not understand what was changed, and the receiving agency may not know the submission was agent-mediated.
A voice-fidelity protocol would carry both the structured version and the citizen's original input, let the receiving system surface the original when the structured version is ambiguous, show citizens a plain-language diff before submission, and require agents to flag when a rewrite changes the substantive position.
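The four requirements above can be sketched as a single submission record. This is an illustrative data structure under stated assumptions, not an existing standard or implementation: the names `Submission`, `Change`, `ChangeKind`, and `view_for_agency` are hypothetical, the change kinds are borrowed from the labels in the example letter, and how an agent decides that a rewrite is substantive or that the structured version is ambiguous is left outside the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class ChangeKind(Enum):
    REGISTER = "register"
    VOICE = "voice / cultural marker"
    SUBSTANCE = "substance"


@dataclass(frozen=True)
class Change:
    original_span: str    # what the citizen wrote
    rewritten_span: str   # what the agent replaced it with
    kind: ChangeKind


@dataclass
class Submission:
    original_text: str     # the citizen's own words, carried and never discarded
    structured_text: str   # the agent's rewritten version
    changes: list[Change] = field(default_factory=list)
    agent_mediated: bool = True  # lets the agency know an agent was involved

    def substantive_changes(self) -> list[Change]:
        """Changes the agent must flag because they alter the citizen's position."""
        return [c for c in self.changes if c.kind is ChangeKind.SUBSTANCE]

    def view_for_agency(self, structured_is_ambiguous: bool) -> list[str]:
        """Receiving side: surface the original next to the structured
        version whenever the structured version is ambiguous."""
        if structured_is_ambiguous:
            return [self.structured_text, self.original_text]
        return [self.structured_text]
```

Keeping the original text as a required field, rather than an optional attachment, is the design choice that matters here: a record cannot be constructed without the citizen's own words.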
Showing a citizen with low literacy, or whose first language is not English, that their agent changed the meaning is the hardest part, because a diff display assumes reading competence and an audio summary assumes hearing. This exposes a wider tension: the stronger the assurance a pattern demands, the more people it risks excluding.
The failure mode is an agent silently reframing a complaint or appeal in a way that misrepresents the person to the agency. Surfacing what the agent changed, and flagging when a rewrite alters the citizen's substantive position, prevents it.
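One way to rule out the silent case is to make sending impossible while a substantive change is still unreviewed. The sketch below is an assumption about how such a gate might look, not a description of any deployed system: `FlaggedChange`, `ready_to_send`, `send`, and `UnreviewedChangeError` are hypothetical names, and requiring explicit acknowledgment is one possible policy that goes a step beyond merely surfacing and flagging.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FlaggedChange:
    description: str            # plain-language account of what changed
    substantive: bool           # True when the rewrite alters the citizen's position
    acknowledged: bool = False  # set once the citizen has reviewed this change


class UnreviewedChangeError(Exception):
    """Raised when a submission would go out with unreviewed substantive changes."""


def ready_to_send(changes: list[FlaggedChange]) -> bool:
    """True only when every substantive change has been acknowledged."""
    return all(c.acknowledged for c in changes if c.substantive)


def send(text: str, changes: list[FlaggedChange],
         transport: Callable[[str], None]) -> None:
    """Hand the text to `transport` only after the review gate passes."""
    pending = [c for c in changes if c.substantive and not c.acknowledged]
    if pending:
        raise UnreviewedChangeError(
            f"{len(pending)} substantive change(s) not yet reviewed"
        )
    transport(text)
```

How the citizen acknowledges a change (reading, listening, or through an interpreter) is deliberately not modeled, since the gate should work the same way whichever review modality was used.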
References
- CHI 2026, "When AI Writes, Whose Voice Remains?"
- Stanford, "How AI is leaving non-English speakers behind"
- WEF, "How can we design AI agents for a world of many voices?"
- Sharma et al., "Towards Understanding Sycophancy in Language Models" (ICLR 2024; preprint)
- Predicting user behaviour with GPT (arXiv preprint)
- Gender bias in AI-generated reference letters (arXiv)