Line 103:
* It starts by reading the retrieved official content (e.g., text from a Federal Register notice, an agency program page, or USAspending data).
* Using retrieval-augmented generation (RAG) techniques, it synthesizes that information into a draft:
** Fills in the structured Cargo template fields (e.g., program name, sponsoring agency, authorizing legislation, funding amount).
** Writes a concise narrative summary for the MediaWiki page.
** Proposes relationships (e.g., "This program links to Statute X and Agency Y").
* Its job is to be creative and comprehensive, turning raw source material into coherent, usable wiki content and graph data, while staying grounded in what was retrieved.
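The generator's three outputs (template fields, narrative summary, proposed relationships) can be pictured as one small record that is rendered to wikitext. The following is a minimal sketch only: the `Draft` class, the `render_wikitext` helper, and the template name `Program` are hypothetical names chosen for illustration, not part of the actual pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """Hypothetical container for one generator draft."""
    fields: dict[str, str]   # Cargo template parameters, e.g. program name
    summary: str             # narrative text for the page body
    # (relation label, target page) pairs proposed by the generator
    relationships: list[tuple[str, str]] = field(default_factory=list)


def render_wikitext(draft: Draft, template: str = "Program") -> str:
    """Render the draft as a template call, then the summary and links.

    "Program" is an assumed template name; values are not escaped here.
    """
    lines = ["{{" + template]
    lines += [f"|{name}={value}" for name, value in draft.fields.items()]
    lines.append("}}")
    lines.append(draft.summary)
    lines += [f"* {relation}: [[{target}]]"
              for relation, target in draft.relationships]
    return "\n".join(lines)
```

A real implementation would also need to escape pipe characters in field values and attach a source citation to each field; both are left out of this sketch.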
Line 111:
* This is the "checker" or "fact-checker" model.
* It runs independently after the generator finishes its draft.
* It goes through every part of the draft step-by-step:
** Compares each claim, field value, and relationship directly against the original source documents.
** Scores for factual accuracy (e.g., does the funding number match exactly?).
** Checks citation completeness (is every key fact traceable?).
** Evaluates logical consistency and neutrality (no unsupported assumptions or biased phrasing).
* It gives an overall confidence score and flags any mismatches, gaps, or potential issues.
* If both AIs agree at a high threshold (≥95% confidence), the draft auto-publishes as a new page revision.
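The check-then-gate flow can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's implementation: a verbatim substring match stands in for the verifier model, the confidence score is assumed to be the fraction of supported claims, and the "hold" outcome for drafts below the threshold is an assumption, since only the auto-publish case is described here. All function and class names are hypothetical.

```python
from dataclasses import dataclass

PUBLISH_THRESHOLD = 0.95  # from the text: auto-publish at >= 95% confidence


@dataclass
class CheckResult:
    claim: str
    supported: bool
    note: str = ""


def verify(claims: dict[str, str], source_text: str) -> list[CheckResult]:
    """Naive stand-in for the verifier model.

    A claim counts as supported only if its value appears verbatim
    in the source text (an assumption made for this sketch).
    """
    results = []
    for name, value in claims.items():
        ok = value in source_text
        note = "" if ok else f"{name}: '{value}' not found in source"
        results.append(CheckResult(name, ok, note))
    return results


def decide(results: list[CheckResult]) -> tuple[str, float, list[str]]:
    """Return (action, confidence score, flags) for a checked draft."""
    score = sum(r.supported for r in results) / len(results) if results else 0.0
    flags = [r.note for r in results if not r.supported]
    # "hold" is an assumed fallback; the text only specifies auto-publish.
    action = "auto-publish" if score >= PUBLISH_THRESHOLD else "hold"
    return action, score, flags
```

For example, a draft whose funding amount does not match the source would score below the threshold and be returned with a flag naming the mismatched field.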
Line 124:
* A single AI can sometimes confidently produce wrong or invented details (a common issue in LLMs).
* By having one model create and a different model critically review, the system catches more errors. Studies on multi-agent or dual-LLM verification show significant reductions in hallucinations (often 60-90% in similar pipelines).
* Alternating roles (e.g., Grok drafts one time, Gemini verifies; next time they swap) adds extra robustness by avoiding patterns from one model's weaknesses.
* In OpenGov Encyclopedia, this keeps the process fast and mostly automated (~80-95% hands-off) while meeting federal needs for defensibility, traceability, and neutrality.
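The role alternation can be sketched as a simple parity rule on a run counter. This is a hypothetical illustration: the `assign_roles` helper and the idea of a per-run index are assumptions, and the actual scheduling logic may differ.

```python
def assign_roles(run_index: int,
                 models: tuple[str, str] = ("Grok", "Gemini")) -> dict[str, str]:
    """Swap generator and verifier on every run (assumed parity rule)."""
    generator = models[run_index % 2]
    verifier = models[(run_index + 1) % 2]
    return {"generator": generator, "verifier": verifier}
```

With this rule, even-numbered runs have the first model draft and the second verify, and odd-numbered runs reverse the assignment.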
Line 130:
In short:
* Generator → Builds the draft from official sources.
* Verifier → Double-checks it rigorously before anything goes live.
=== TBD ===