OpenGov summary: Difference between revisions

38 bytes removed, Wednesday at 23:58
* It starts by reading the retrieved official content (e.g., text from a Federal Register notice, an agency program page, or USAspending data).
* Using retrieval-augmented generation (RAG) techniques, it synthesizes that information into a draft:
** Fills in the structured **Cargo template fields** (e.g., program name, sponsoring agency, authorizing legislation, funding amount).
** Writes a concise narrative summary for the MediaWiki page.
** Proposes relationships (e.g., "This program links to Statute X and Agency Y").
* Its job is to be creative and comprehensive—turning raw source material into coherent, usable wiki content and graph data—while staying grounded in what was retrieved.




* This is the "checker" or "fact-checker" model.
* It runs **independently** after the generator finishes its draft.
* It goes through every part of the draft step-by-step:
** Compares each claim, field value, and relationship directly against the original source documents.
** Scores for factual accuracy (e.g., does the funding number match exactly?).
** Checks citation completeness (is every key fact traceable?).
** Evaluates logical consistency and neutrality (no unsupported assumptions or biased phrasing).
* It gives an overall confidence score and flags any mismatches, gaps, or potential issues.
* If both AIs agree at a high threshold (≥95% confidence), the draft auto-publishes as a new page revision.
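The verify-then-publish gate described above might look like the following. Everything here is an assumed sketch: `review`, the `verify` callback, and the check names are hypothetical; only the ≥95% threshold comes from the description above.

```python
# Illustrative sketch of the verifier gate. Field names and the verify()
# callback are assumptions; the 0.95 threshold mirrors the text above.

AUTO_PUBLISH_THRESHOLD = 0.95

def review(draft: dict, sources: list[str], verify) -> dict:
    """Run the independent verifier over the draft and decide publication."""
    report = verify(draft, sources)  # per-check scores plus a list of flags
    checks = [
        report["factual_accuracy"],
        report["citation_completeness"],
        report["consistency_neutrality"],
    ]
    # The weakest individual check caps the overall confidence score.
    report["confidence"] = min(checks)
    report["auto_publish"] = (
        report["confidence"] >= AUTO_PUBLISH_THRESHOLD and not report["flags"]
    )
    return report
```

Taking the minimum across checks is one conservative choice; a weighted average would be laxer, letting a strong accuracy score mask a citation gap.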


* A single AI can sometimes confidently produce wrong or invented details (a common issue in LLMs).
* By having one model **create** and a different model **critically review**, the system catches more errors—studies on multi-agent or dual-LLM verification show significant reductions in hallucinations (often 60-90% in similar pipelines).
* Alternating roles (e.g., Grok drafts one time, Gemini verifies; next time they swap) adds extra robustness by avoiding patterns from one model's weaknesses.
* In OpenGov Encyclopedia, this keeps the process fast and mostly automated (~80-95% hands-off) while meeting federal needs for defensibility, traceability, and neutrality.
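The role alternation described above reduces to a simple parity swap. A minimal sketch, assuming the two model names from the text and a hypothetical `assign_roles` helper:

```python
# Minimal sketch of role alternation between two models; the model names
# come from the example above, the function itself is an assumption.

MODELS = ["grok", "gemini"]

def assign_roles(revision_number: int) -> dict:
    """Swap generator/verifier roles each revision so neither model's
    blind spots dominate the pipeline."""
    generator = MODELS[revision_number % 2]
    verifier = MODELS[(revision_number + 1) % 2]
    return {"generator": generator, "verifier": verifier}
```

Because the swap is deterministic on the revision number, the audit trail records which model drafted and which verified any given page revision.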
In short:

* **Generator** → Builds the draft from official sources.
* **Verifier** → Double-checks it rigorously before anything goes live.


=== TBD ===