OpenGov summary: Difference between revisions

==== Generator AI ====


* This is the "creator" or "drafter" model.
* It starts by reading the retrieved official content (e.g., text from a Federal Register notice, an agency program page, or USAspending data).
* Using retrieval-augmented generation (RAG), it synthesizes that information into a draft:
** Fills in the structured **Cargo template fields** (e.g., program name, sponsoring agency, authorizing legislation, funding amount).
** Writes a concise narrative summary for the MediaWiki page.
** Proposes relationships (e.g., "This program links to Statute X and Agency Y").
* Its job is to be creative and comprehensive, turning raw source material into coherent, usable wiki content and graph data while staying grounded in what was retrieved.
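The generator step above can be sketched as a single grounded call: retrieved passages go in, a drafted page comes out. This is a minimal illustration, not the actual implementation; `call_llm` is a hypothetical stand-in for whichever model (e.g., Grok or Gemini) is acting as generator, and the field list is abbreviated.

```python
def generate_draft(retrieved_passages, call_llm):
    """Synthesize retrieved official text into a structured draft.

    retrieved_passages: list of source excerpts (Federal Register,
    agency pages, USAspending data) returned by the retrieval step.
    call_llm: callable that sends a prompt to the drafting model.
    """
    # Ground the model by putting ONLY the retrieved text in context.
    context = "\n\n".join(retrieved_passages)
    prompt = (
        "Using ONLY the source text below, fill these Cargo template "
        "fields: program name, sponsoring agency, authorizing "
        "legislation, funding amount. Then write a concise, neutral "
        "summary and list related statutes and agencies.\n\n" + context
    )
    # The model's response is the draft (fields + summary + links).
    return call_llm(prompt)
```

Because the prompt carries the retrieved text verbatim, every drafted claim has a candidate source the verifier can later check against.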


==== Verifier AI ====


* This is the "checker" or "fact-checker" model.
* It runs **independently** after the generator finishes its draft.
* It goes through every part of the draft step by step:
** Compares each claim, field value, and relationship directly against the original source documents.
** Scores factual accuracy (e.g., does the funding number match exactly?).
** Checks citation completeness (is every key fact traceable to a source?).
** Evaluates logical consistency and neutrality (no unsupported assumptions or biased phrasing).
* It assigns an overall confidence score and flags any mismatches, gaps, or potential issues.
* If both AIs agree at a high threshold (≥95% confidence), the draft auto-publishes as a new page revision.
* If there is disagreement or low confidence, the item is flagged for quick human review (one-click approve/reject/retry on the Clearance Dashboard).
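The verification pass above can be sketched as a field-by-field traceability check with a confidence threshold. This is a simplified assumption-laden sketch: real verification would use the model's judgment rather than exact substring matching, and the field names are illustrative. The 0.95 threshold mirrors the ≥95% auto-publish rule.

```python
def verify_draft(draft_fields, source_text, threshold=0.95):
    """Check drafted fields against the source; return a routing decision.

    draft_fields: dict of field name -> drafted value.
    source_text: the retrieved official text the generator worked from.
    """
    # A field is flagged if its value cannot be traced to the source.
    # (Exact substring matching is a stand-in for the real check.)
    flagged = [name for name, value in draft_fields.items()
               if value not in source_text]
    # Confidence = fraction of fields that verified cleanly.
    confidence = 1.0 - len(flagged) / max(len(draft_fields), 1)
    # High confidence and no flags -> auto-publish; otherwise route to
    # the human Clearance Dashboard.
    decision = ("auto_publish"
                if confidence >= threshold and not flagged
                else "human_review")
    return confidence, flagged, decision
```

A mismatched funding number, for instance, lowers the score and routes the item to review instead of publishing it.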


==== Why This Two-Step Approach? ====


* A single AI can sometimes confidently produce wrong or invented details (a common issue with LLMs).
* By having one model **create** and a different model **critically review**, the system catches more errors; studies on multi-agent and dual-LLM verification report significant reductions in hallucinations (often 60–90% in similar pipelines).
* Alternating roles (e.g., Grok drafts one time, Gemini verifies; next time they swap) adds robustness by avoiding the blind spots of any single model.
* In OpenGov Encyclopedia, this keeps the process fast and mostly automated (~80–95% hands-off) while meeting federal needs for defensibility, traceability, and neutrality.
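The role-alternation idea can be sketched as a loop that swaps which model drafts and which verifies on each item. The orchestration details here are assumptions for illustration; `model_a` and `model_b` are hypothetical callables for the two LLMs.

```python
import itertools

def run_pipeline(items, model_a, model_b):
    """Process items, alternating generator/verifier roles each cycle."""
    # Cycle of (generator, verifier) pairings: A drafts then B checks,
    # then the roles swap for the next item, and so on.
    roles = itertools.cycle([(model_a, model_b), (model_b, model_a)])
    results = []
    for item, (generator, verifier) in zip(items, roles):
        draft = generator(item)          # one model creates
        review = verifier(draft)         # the other critically reviews
        results.append((draft, review))
    return results
```

Swapping roles means no single model's characteristic failure mode dominates either the drafting or the checking side over time.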


In short:


* **Generator** → Builds the draft from official sources.
* **Verifier** → Double-checks it rigorously before anything goes live.


=== TBD ===