Provenance and attribution for mass submissions
A submission-record view that collapses each campaign to a single representative entry, discloses how many submissions that entry represents and who organized the campaign, and marks whether each submission arrived through a verified route. It reports provenance and size; it does not try to classify which submissions were machine-written.
When submissions to a consultation, grant round, or petition can be drafted and filed by AI agents at scale, an agency has to weigh a body of input it cannot authenticate at the source. It cannot assume a submission was written by the person who filed it, that distinct-looking submissions came from distinct people, or that a high count reflects wide support.
The challenge is to recover enough provenance and attribution from mass submissions (which campaign, whose, and filed how) that the agency can report the record honestly and defend the weight it gives it, without discounting legitimate coordinated advocacy.
The problem going forward is attribution at scale, not detecting which submissions a machine wrote.
Government needs enough confidence in a body of submissions to act on it and stand behind the decision: that organized campaigns are identified and collapsed to a single representative entry with their size and organizer disclosed, that submissions filed through a verified route can be told apart from unverified ones, and that duplicates and automated filings are visible as such. The confidence comes from provenance and source validation, not from judging whether a given submission was machine-written.
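One way to make duplicates visible, rather than silently dropping them, is to group submissions by a fingerprint of their text. The sketch below is only an illustration under stated assumptions: the names `fingerprint` and `duplicate_groups` are hypothetical, and matching on lowercased, whitespace-collapsed text is a deliberately narrow choice that catches exact copies only, not paraphrased or lightly varied campaign letters.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing runs of whitespace."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_groups(submissions: dict[str, str]) -> list[list[str]]:
    """Return groups of submission ids whose normalized text is identical.

    Nothing is removed: the groups are reported so a reader of the record
    can see which submissions are copies of one another.
    """
    by_fingerprint: dict[str, list[str]] = defaultdict(list)
    for submission_id, text in submissions.items():
        by_fingerprint[fingerprint(text)].append(submission_id)
    return [ids for ids in by_fingerprint.values() if len(ids) > 1]


# Hypothetical sample data for illustration.
sample = {
    "s1": "Please keep the rule.",
    "s2": "please   keep the rule. ",
    "s3": "I oppose the rule because of cost.",
}
dup_groups = duplicate_groups(sample)
```

Here `dup_groups` pairs `s1` with `s2` and leaves `s3` ungrouped. Every submission stays on the record; the grouping only adds information for whoever reads it.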
The people most at risk are legitimate coordinated advocates: if an integrity measure treats campaign volume itself as suspect, a real constituency's submissions can be discounted or collapsed out of the record the agency reads. Keep the path open by attributing a campaign rather than discarding it, counting its participants alongside the single representative entry, and giving an organizer a route to confirm authorship. A submission filed without the verified route must be treated as data for follow-up, not grounds for rejection, so source validation never becomes an identity barrier.
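The rule that an unverified filing is followed up rather than rejected can be encoded so that rejection is not even a possible outcome. The following is a minimal sketch, not a description of any existing system: `Route`, `Disposition`, `Filing`, and `disposition` are hypothetical names, and the `organizer_confirmed` flag is an assumed way of recording that an organizer has confirmed authorship.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    VERIFIED = "verified"
    UNVERIFIED = "unverified"


class Disposition(Enum):
    # Deliberately no REJECTED member: the filing route alone
    # never removes a submission from the record.
    ON_RECORD = "on_record"
    ON_RECORD_FOLLOW_UP = "on_record_follow_up"


@dataclass(frozen=True)
class Filing:
    filing_id: str
    route: Route
    organizer_confirmed: bool = False  # assumed flag: organizer confirmed authorship


def disposition(filing: Filing) -> Disposition:
    """Keep every filing on the record; flag unconfirmed ones for follow-up."""
    if filing.route is Route.VERIFIED or filing.organizer_confirmed:
        return Disposition.ON_RECORD
    return Disposition.ON_RECORD_FOLLOW_UP


verified = disposition(Filing("f1", Route.VERIFIED))
unverified = disposition(Filing("f2", Route.UNVERIFIED))
confirmed = disposition(Filing("f3", Route.UNVERIFIED, organizer_confirmed=True))
```

Leaving a rejected state out of the enumeration is the design point: source validation then produces a follow-up queue, not an identity barrier.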
Publish one representative entry for a campaign and disclose its size, organizer, and verification status, so the record shows a campaign as a single attributed entry rather than thousands of separate-looking submissions.
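Since no surface has been built, the following is only a sketch of how such a record might be assembled. All names (`Submission`, `RecordEntry`, `build_record`) and fields are hypothetical. It also assumes that campaign membership is already known from intake, and it picks the first submission received as the representative, which is one possible choice among several.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Submission:
    submission_id: str
    text: str
    verified: bool                     # arrived through the verified route
    campaign_id: Optional[str] = None  # None for an individual submission
    organizer: Optional[str] = None    # as disclosed by the campaign, if known


@dataclass(frozen=True)
class RecordEntry:
    representative_id: str
    representative_text: str
    campaign_id: Optional[str]
    organizer: Optional[str]
    size: int            # number of submissions this entry stands for
    verified_count: int  # how many of those used the verified route


def build_record(submissions: list[Submission]) -> list[RecordEntry]:
    """Collapse each campaign to one attributed entry; keep individuals as-is."""
    campaigns: dict[str, list[Submission]] = defaultdict(list)
    entries: list[RecordEntry] = []
    for sub in submissions:
        if sub.campaign_id is None:
            entries.append(RecordEntry(sub.submission_id, sub.text, None, None,
                                       1, int(sub.verified)))
        else:
            campaigns[sub.campaign_id].append(sub)
    for campaign_id, members in campaigns.items():
        representative = members[0]  # assumption: first received represents the campaign
        organizer = next((m.organizer for m in members if m.organizer), None)
        entries.append(RecordEntry(representative.submission_id, representative.text,
                                   campaign_id, organizer, len(members),
                                   sum(m.verified for m in members)))
    return entries


# Hypothetical sample data for illustration.
sample = [
    Submission("s1", "Keep the rule.", True, "c1", "Example Coalition"),
    Submission("s2", "Keep the rule.", False, "c1"),
    Submission("s3", "Keep the rule!", True, "c1"),
    Submission("s4", "My own view on costs.", False),
]
record = build_record(sample)
```

The campaign is counted, not discarded: its entry carries the participant count and the verified share alongside the representative text, so collapsing the display does not discount the constituency behind it.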
No surface has been built yet; the approach above is the brief for one.
- Established Headline
Problem recognition, representative-version reporting, and verified-source intake are already in practice.
- Emerging
Attribution of agent-assisted submissions at scale is still taking shape.
High, with jurisdictional adaptation. The notice-and-comment framework of the Administrative Procedure Act (APA) is US-specific, but every jurisdiction running public consultations faces the same structural challenge. The UK, Australian, and Canadian governments all encounter campaign responses and must decide how to report and weight them. The FCC net neutrality docket, in which the New York Attorney General found that nearly 18 million of 22 million comments were fake, is a cautionary tale that applies everywhere: any system that accepts unverified submissions at scale is vulnerable to manipulation.
Stating the limits of the submission data publicly, and verifying provenance before acting, is the discipline that catches a flawed count. Absent it, an automated calculation stands unchallenged.
10 references
- ACUS Recommendation 2021-1: Managing Mass, Computer-Generated, and Falsely Attributed Comments
The Administrative Conference of the United States addressed three categories of problematic comments — mass comments orchestrated by campaign organizations, computer-generated comments, and "malattributed" comments filed using stolen or fabricated identities — and recommended best practices for managing each category while preserving the right to participate.
- Nextgov: House bill targets AI-generated comments (Comment Integrity and Management Act)
A House-passed bill — not enacted law; it lapsed with the 118th Congress — that would have required agencies to publish a single representative version of mass comments, publicly state the number of computer-generated submissions, and tasked OMB with guidance and GAO with reporting on AI-generated comment prevalence.
- NY Attorney General report on fake net neutrality comments
The New York Attorney General's investigation found nearly 18 million of the 22 million FCC comments were fake: roughly 8.5 million used the names and addresses of real people without their consent (a broadband-industry-funded campaign), and about 9.3 million used fabricated identities, most from a single 19-year-old using automated software. Three firms later paid US$615,000 in penalties.
- GAO-21-103181: comment integrity
In a survey across ten agencies, GAO estimated that the share of commenters whose email addresses confirmed their submissions ranged from 48% to 87%, and that 5–30% of email addresses on the record were attached to comments their owners said they did not make — and recommended agencies and GSA fully describe these limitations publicly.