OpenGov Encyclopedia - Executive Summary / Sales Pitch
The U.S. federal government manages one of the world's largest and most complex organizational landscapes: thousands of agencies, sub-agencies, programs, authorizing statutes, funding flows, and cross-cutting initiatives. Existing public assets like USA.gov (citizen front door), Search.gov (federated search), and USAspending.gov (spending transparency) provide essential services—but they don't deliver the unified, machine-readable semantic layer that modern agency AI systems desperately need.
Agency LLMs and chatbots are currently "starving" for reliable ground truth. Most rely on scraping inconsistent .gov websites or parsing unstructured PDFs, leading to frequent hallucinations, fragmented answers, and reduced public trust in government digital services.
OpenGov Encyclopedia closes this critical gap as a supplemental, lightweight knowledge infrastructure — never a replacement or competitor to existing .gov platforms.
Core Dual Purpose
- Citizen-centric interface: Wikipedia-style narrative pages (MediaWiki base + USWDS federal skin) organized around real tasks people want to accomplish (e.g., "Prepare for a Wildland Fire," "Access Housing Assistance," "Navigate Federal AI Opportunities & Regulations"). Each page provides clear context, relationships, and eligibility hints — then immediately directs users to the official agency or USA.gov destination to act.
- API-first knowledge graph: Structured, queryable data (via Cargo extension) capturing precise typed relationships (e.g., "which agency sponsors this program?", "what legislation authorizes it?", "what funding connects them?"). This becomes high-quality "fuel" for agency RAG pipelines, reducing hallucinations and enabling parametric searches (e.g., "all active programs >$50M related to climate resilience").
Knowledge graph
This page in a nutshell: OpenGov Encyclopedia combines Wikipedia-style readability with database-like structure. Humans get helpful narrative pages that point to official sources; machines and AI get a clean, connected, verifiable graph of federal entities and relationships — all without duplicating agency content.
OpenGov Encyclopedia uses MediaWiki (the same software that powers Wikipedia) combined with the Cargo extension to function as a lightweight knowledge graph. This setup provides both human-readable pages (like Wikipedia articles) and machine-readable, structured, queryable data — perfect for serving as a "truth layer" that feeds clean information to agency AI systems while helping citizens understand federal structures.
Here's a simple, step-by-step explanation for someone not familiar with these tools:
1. What is a Knowledge Graph? (Quick Basics)
A knowledge graph is like a smart map of information:
- Nodes = things (entities), e.g., "FEMA", "Disaster Assistance Program", "Stafford Act".
- Edges = relationships between them, e.g., "FEMA sponsors → Disaster Assistance Program", "Disaster Assistance Program is authorized by → Stafford Act", "Stafford Act funds flow to → multiple agencies".
- The power comes from being able to ask questions across connections, like "Show me all programs authorized by laws passed after 2010 that FEMA sponsors and that have >$100M funding."
Traditional databases or spreadsheets can store facts, but knowledge graphs excel at revealing connections and enabling complex, relationship-based searches.
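The nodes-and-edges idea above can be sketched in a few lines of Python. This is a toy model with illustrative (invented) data, not real federal figures; the entity names mirror the examples in this section.

```python
# Minimal knowledge-graph sketch: entities as dicts, relationships as
# (subject, predicate, object) triples. Data is illustrative only.
entities = {
    "FEMA": {"type": "Agency"},
    "Disaster Assistance Program": {"type": "Program", "funding_musd": 150},
    "Stafford Act": {"type": "Statute", "year_passed": 1988},
}
triples = [
    ("FEMA", "sponsors", "Disaster Assistance Program"),
    ("Disaster Assistance Program", "authorized_by", "Stafford Act"),
]

def sponsored_programs(agency, min_funding_musd=0):
    """Answer a relationship question: which programs does this agency
    sponsor, above a funding floor?"""
    return [
        obj for subj, pred, obj in triples
        if subj == agency and pred == "sponsors"
        and entities[obj].get("funding_musd", 0) >= min_funding_musd
    ]

print(sponsored_programs("FEMA", min_funding_musd=100))
# ['Disaster Assistance Program']
```

The value is that the query walks relationships, not rows in a single flat table — the same question a spreadsheet would need manual cross-referencing to answer.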
2. How MediaWiki + Cargo Creates This
- MediaWiki handles the "Wikipedia-like" part:
- Each federal entity gets its own page (e.g., a page called "Federal Emergency Management Agency" or "Wildfire Mitigation Grant Program").
- Pages have readable narrative text, history, discussion tabs, and look official with a USWDS (U.S. Web Design System) skin.
- This makes it citizen-friendly — people read summaries, see context, and get directed to official .gov links.
- Cargo turns those pages into structured, graph-like data:
- Templates act like fill-in-the-blank forms. For example, a "Federal Program" template might have fields like:
- Program Name
- Sponsoring Agency (links to the Agency page)
- Authorizing Legislation (links to the Statute page)
- Annual Funding Amount
- Eligibility Summary
- Primary Official URL (always points back to the real .gov site)
- Status (Active / Proposed / Expired)
- When someone (or the AI pipeline) fills in the template on a page, Cargo automatically stores the answers in database tables — one table per template type.
- Example: All "Federal Program" templates feed into a single "Programs" table in the background, with columns matching the fields.
This is where the knowledge graph emerges:
- Because fields can link to other pages (e.g., "Sponsoring Agency" contains a link to the "FEMA" page), Cargo knows relationships exist.
- The system doesn't need fancy triple-store tech (like RDF/OWL in heavier graphs); it uses simple relational tables but supports joins, list fields, and hierarchy traversal to mimic graph behavior.
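The "simple relational tables plus joins" approach can be demonstrated with SQLite. This is a sketch of the concept, not Cargo's actual schema — table names, columns, and all data below are illustrative assumptions.

```python
import sqlite3

# Sketch: Cargo-style relational tables answering a graph-like question
# via a join. Schema and data are illustrative, not Cargo's real layout.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Agencies (name TEXT PRIMARY KEY, department TEXT);
CREATE TABLE Programs (name TEXT, sponsor TEXT, statute TEXT,
                       funding_musd REAL, status TEXT);
""")
db.executemany("INSERT INTO Agencies VALUES (?, ?)", [
    ("FEMA", "Department of Homeland Security"),
    ("USGS", "Department of the Interior"),
])
db.executemany("INSERT INTO Programs VALUES (?, ?, ?, ?, ?)", [
    ("Hazard Mapping", "USGS", "42 U.S.C. 4101", 60.0, "Active"),
    ("Flood Insurance", "FEMA", "42 U.S.C. 4001", 200.0, "Active"),
])

# "Every program authorized by legislation containing '42 U.S.C.'
#  sponsored by an agency in the Department of the Interior":
rows = db.execute("""
    SELECT p.name FROM Programs p
    JOIN Agencies a ON p.sponsor = a.name
    WHERE a.department = 'Department of the Interior'
      AND p.statute LIKE '%42 U.S.C.%'
""").fetchall()
print(rows)  # [('Hazard Mapping',)]
```

One join per relationship hop is all that is needed here — which is why a plain relational store can mimic graph traversal for queries of this shape.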
3. Querying the Graph — The Real Power
Cargo lets anyone (humans or machines via API/JSON exports) run queries across the data. These queries reveal connections automatically.
Examples in OpenGov Encyclopedia context:
- Simple lookup: "List all active programs sponsored by NOAA with funding >$50 million."
- Cargo scans the "Programs" table, filters on Sponsor = "NOAA", Funding > 50M, Status = "Active".
- Relationship traversal (graph-like):
- "Show every program authorized by legislation containing '42 U.S.C.' that is sponsored by an agency in the Department of the Interior."
- Joins the Programs table to Agencies table (via Sponsor field) and checks the Authorizing Legislation field.
- Cross-entity discovery:
- "Find all disaster-related programs connected to FEMA, including their authorizing laws and related agencies."
- Uses joins and list fields (e.g., if a program has multiple linked agencies or statutes).
- Dynamic pages:
- A task-oriented page like "Prepare for Wildland Fire" can embed live query results: a table of relevant programs, pulled fresh from Cargo data, with links back to official sites.
Query results can be output as:
- Tables/lists on wiki pages
- Maps (if coordinates are stored)
- JSON/CSV exports (for feeding agency LLMs or dashboards)
- Inline dynamic content (updates whenever source data changes)
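For the JSON/CSV export path, a sketch of how an agency RAG pipeline might consume Cargo data. The Cargo extension exposes queries through the MediaWiki Action API's `cargoquery` module; the endpoint URL, table, and field names below are illustrative assumptions, and the HTTP response is simulated rather than fetched.

```python
import json
from urllib.parse import urlencode

# Build a cargoquery API request (endpoint and schema are hypothetical).
params = {
    "action": "cargoquery",
    "format": "json",
    "tables": "Programs",
    "fields": "_pageName=Page,Sponsor,FundingMUSD",
    "where": 'Status="Active" AND FundingMUSD>50',
    "limit": "50",
}
url = "https://opengov.example.gov/w/api.php?" + urlencode(params)

# A response shaped like the cargoquery module's output, simulated here
# so the sketch runs offline:
raw = ('{"cargoquery": [{"title": {"Page": "Flood Insurance", '
       '"Sponsor": "FEMA", "FundingMUSD": "200"}}]}')
records = [row["title"] for row in json.loads(raw)["cargoquery"]]
print(records[0]["Sponsor"])  # FEMA
```

The flat list of records is exactly the "clean fuel" an LLM retrieval step can cite field-by-field, instead of scraping a rendered HTML page.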
4. Why This Counts as a Knowledge Graph (Even If Lightweight)
- It has entities (pages/nodes) and typed relationships (via template fields and links).
- You can traverse connections using joins, HOLDS (for lists), WITHIN (for hierarchies), etc.
- It's queryable at scale — supports parametric searches agencies need for AI (e.g., "programs related to arid land agriculture with >$50M funding").
- Data stays fresh and attested — tied to official .gov sources via the AI pipeline, with confidence scores and always-link-back banners.
- Unlike heavier graphs (e.g., Neo4j or RDF stores used in some federal pilots), it's simple, open-source, low-maintenance, and integrated with readable wiki pages.
Safeguards
OpenGov Encyclopedia is engineered from the ground up to deliver authoritative, trustworthy, and neutral structured data as a supplemental truth layer—while strictly adhering to federal compliance, risk management, and public trust standards.
Authoritative sourcing only
All content ingestion is locked to verified official federal sources, eliminating external risks. This includes:
- Public .gov websites from CISA's current-federal.csv whitelist (~1,000+ executive branch entries, covering major agencies like EPA, NASA, FEMA, NOAA, and their sub-agencies/subdomains such as airnow.gov or noaa.gov sub-sites)
- Federal Register API (for regulations and notices)
- eCFR (electronic Code of Federal Regulations)
- USAspending.gov APIs (for funding and awards data)
- Agency-specific databases and feeds (e.g., FEMA declarations, NOAA data portals, HHS TAGGS grants)
Note: While the U.S. Digital Registry was considered for social media validation, it has been deprecated since September 2024 and is no longer updated, so it is not incorporated.
Dual-AI pipeline
This page in a nutshell: A generator model drafts content and a separate verifier model checks it against official sources. Together, they ensure the knowledge graph and wiki pages are as reliable as possible for citizens and agency AI systems.
In the dual-AI pipeline used by OpenGov Encyclopedia, two large language models (LLMs) work together in a structured, collaborative process to create and check content. This setup is designed to produce accurate, reliable summaries, relationships, and structured data from official .gov sources while minimizing errors like hallucinations (where an AI invents details).
The pipeline has two main roles:
- Generator AI
- Verifier AI
Generator AI
- This is the "creator" or "drafter" model.
- It starts by reading the retrieved official content (e.g., text from a Federal Register notice, an agency program page, or USAspending data).
- Using retrieval-augmented generation (RAG) techniques, it synthesizes that information into a draft:
- Fills in the structured Cargo template fields (e.g., program name, sponsoring agency, authorizing legislation, funding amount).
- Writes a concise narrative summary for the MediaWiki page.
- Proposes relationships (e.g., "This program links to Statute X and Agency Y").
- Its job is to be creative and comprehensive—turning raw source material into coherent, usable wiki content and graph data—while staying grounded in what was retrieved.
Verifier AI
- This is the "checker" or "fact-checker" model.
- It runs independently after the generator finishes its draft.
- It goes through every part of the draft step-by-step:
- Compares each claim, field value, and relationship directly against the original source documents.
- Scores for factual accuracy (e.g., does the funding number match exactly?).
- Checks citation completeness (is every key fact traceable?).
- Evaluates logical consistency and neutrality (no unsupported assumptions or biased phrasing).
- It gives an overall confidence score and flags any mismatches, gaps, or potential issues.
- If both AIs agree at a high threshold (≥95% confidence), the draft auto-publishes as a new page revision.
- If there's disagreement or low confidence, the item flags for quick human review (one-click approve/reject/retry on the Clearance Dashboard).
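The publish/escalate gate described above can be sketched as a small orchestration function. The model calls are stubbed: `generate_draft` and `verify_draft` are hypothetical placeholders standing in for real LLM invocations, and the scoring logic is a toy.

```python
# Sketch of the dual-AI gate. Both model functions are stubs; a real
# pipeline would call two different LLMs here.
CONFIDENCE_THRESHOLD = 0.95  # auto-publish only at >=95% confidence

def generate_draft(source_text):
    # Placeholder Generator AI: RAG-grounded drafting from retrieved text.
    return {"summary": source_text[:80], "fields": {"Status": "Active"}}

def verify_draft(draft, source_text):
    # Placeholder Verifier AI: compare each claim to the source and
    # return a confidence score plus any flagged mismatches.
    confidence = 0.98 if draft["summary"] in source_text else 0.40
    return confidence, []

def process(source_text):
    draft = generate_draft(source_text)
    confidence, flags = verify_draft(draft, source_text)
    if confidence >= CONFIDENCE_THRESHOLD and not flags:
        return ("publish", draft)   # becomes a new MediaWiki revision
    return ("escalate", draft)      # routed to the Clearance Dashboard

decision, _ = process("FEMA's Flood Insurance program remains active in FY25.")
print(decision)  # publish
```

The key design point is that the threshold check, not either model, decides the outcome: agreement publishes, anything else escalates to a human.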
Why This Two-Step Approach?
- A single AI can sometimes confidently produce wrong or invented details (a common issue in LLMs).
- By having one model create and a different model critically review, the system catches more errors—studies on multi-agent or dual-LLM verification show significant reductions in hallucinations (often 60-90% in similar pipelines).
- Alternating roles (e.g., Grok drafts one time, Gemini verifies; next time they swap) adds extra robustness by avoiding patterns from one model's weaknesses.
- In OpenGov Encyclopedia, this keeps the process fast and mostly automated (~80-95% hands-off) while meeting federal needs for defensibility, traceability, and neutrality.
In short:
- Generator → Builds the draft from official sources.
- Verifier → Double-checks it rigorously before anything goes live.
Zero-base burden model
The zero-base burden model is the core operational philosophy of OpenGov Encyclopedia: design the system so that human effort is minimized to near-zero for routine operations, while still maintaining full federal control, compliance, and accountability. This approach draws from federal priorities for efficient, low-touch AI governance (as emphasized in OMB guidance like M-25-21 on accelerating AI adoption through innovation and reduced bureaucracy, and related 2025-2026 directives promoting agile, cost-effective AI deployment without unnecessary administrative overhead).
The built-in versioning and audit features described below ensure full compliance with FOIA (easy retrieval of historical versions and decision trails) and NARA records management requirements (permanent, auditable preservation of changes without manual intervention).
This model delivers high freshness and broad coverage with virtually no ongoing manual workload — aligning with federal goals for efficient AI use (e.g., reducing bureaucratic barriers while preserving safeguards). It lets limited staff focus on strategic oversight rather than day-to-day maintenance, making OpenGov Encyclopedia sustainable and scalable across agencies. If piloted successfully, it could serve as a blueprint for other low-touch federal knowledge initiatives.
In practice, this means:
~80–95% fully automated processing
The vast majority of content creation, updates, and maintenance happens without any human intervention. The active generator AI in the dual pipeline monitors official sources continuously. When a change is detected (e.g., a new program announcement in the Federal Register, an updated funding figure on USAspending.gov, or a revised agency page on a whitelisted .gov subdomain), the system automatically:
- Triggers re-processing of the affected entity/page.
- Retrieves the fresh content via RAG.
- Generates a draft (filling Cargo fields and narrative text).
- Runs it through the verifier AI for cross-check.
- Publishes approved changes as a new MediaWiki revision if confidence thresholds are met.
This event-driven architecture ensures the knowledge graph stays current in near-real-time for high-signal changes (e.g., major legislation or funding updates), without scheduled batch jobs overwhelming resources.
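The change-detection trigger behind this event-driven flow can be sketched with a content hash: re-process an entity only when its monitored source actually changes. The source IDs and fetch mechanism are illustrative; in practice content would come from the whitelisted .gov sources listed earlier.

```python
import hashlib

# Sketch: hash each monitored source; a changed hash triggers the
# dual-AI pipeline, an unchanged one is skipped (no batch jobs needed).
seen_hashes: dict[str, str] = {}

def content_changed(source_id: str, content: str) -> bool:
    digest = hashlib.sha256(content.encode()).hexdigest()
    if seen_hashes.get(source_id) == digest:
        return False   # unchanged: skip re-processing
    seen_hashes[source_id] = digest
    return True        # new or updated: trigger re-processing

print(content_changed("federalregister:doc-1", "Notice v1"))  # True (first sighting)
print(content_changed("federalregister:doc-1", "Notice v1"))  # False (no change)
print(content_changed("federalregister:doc-1", "Notice v2"))  # True (updated)
```

Hashing makes the monitor cheap enough to run continuously, which is what keeps the graph near-real-time without scheduled sweeps.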
Daily gap scans for completeness
A lightweight nightly automated scan identifies "missing" entities or gaps in existing ones (e.g., a new sub-agency subdomain appears in the CISA .gov inventory, or a program referenced in multiple sources but lacking a dedicated page). The pipeline proactively creates or enhances pages for these, starting with core verifiable fields. This builds out the inventory progressively without manual queues.
Progressive completeness
Not every field needs to be perfect on day one. The system prioritizes:
- Core fields (always populated if verifiable): Entity name, sponsoring agency, primary .gov link, status, and basic relationships — these form the reliable backbone of the knowledge graph.
- Optional/enhanced fields (e.g., detailed eligibility criteria, historical funding trends, cross-program links): These fill in over time as additional source evidence emerges and verification confidence grows (e.g., from 80% → 98%). Low-confidence or unverified details are clearly marked (e.g., "Pending confirmation" or blank with a note), ensuring transparency rather than forcing incomplete rejection.
This "good enough to start, improve over time" strategy maximizes coverage quickly while upholding accuracy.
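The field-level gating can be sketched as a tiny rendering rule: publish a value only once its verification confidence clears a threshold, otherwise mark it pending or leave it blank. The threshold and field values below are illustrative assumptions.

```python
# Sketch of progressive completeness: confidence-gated field rendering.
PUBLISH_AT = 0.90  # illustrative threshold

def render_field(value, confidence):
    if value is None:
        return ""  # leave blank with a note rather than guess
    if confidence >= PUBLISH_AT:
        return str(value)
    return f"{value} (Pending confirmation)"

print(render_field("$62M", 0.98))  # $62M
print(render_field("$62M", 0.80))  # $62M (Pending confirmation)
print(render_field(None, 0.0))     # (empty)
```

The same rule applied uniformly across thousands of pages is what makes partial data transparent instead of misleading.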
Human involvement strictly limited to <5% escalations
Humans (authorized federal staff) are only involved in exceptional cases:
- Verifier flags a discrepancy or low confidence on a high-impact item (e.g., a major program change affecting public services).
- Random audit samples for oversight.
- Edge cases like ambiguous source data.
Escalations route to a simple Clearance Dashboard — a custom MediaWiki special page or integrated tool — where staff review side-by-side diffs (draft vs. sources), then click one button: Approve, Reject, or Request Retry (with optional note). No writing, editing, or content creation is required from humans. This keeps the burden minimal (often 1-2 minutes per case) and scalable even as the graph grows to thousands of entities.
No manual writing required
Federal staff never draft, rewrite, or curate text/narrative. All content originates from AI synthesis of official sources, verified through the dual pipeline. This eliminates the traditional "content team" workload that plagues many government wikis or databases.
Built-in compliance and traceability features
- Immutable MediaWiki revisions: Every published version is permanently stored with timestamps, attribution (e.g., "GrokBot" or "GeminiBot" username for AI contributions), and diffs.
- Cargo data snapshots: Structured fields are versioned alongside pages, preserving historical states for queries or audits.
- Signed audit logs: All automated actions (ingestion, generation, verification, publish) are logged with digital signatures, timestamps, and source references — fully queryable and exportable.
- MediaWiki page history: Open to public view (or restricted as needed), allowing anyone to see the complete change timeline, compare versions, and understand evolution over time.
Always defers to originals
OpenGov Encyclopedia is built on the principle that it is never the authoritative source. Every page explicitly directs users back to the original federal .gov site(s) for verification, actions, applications, or any official purpose. This respects agency ownership, prevents confusion or duplication, and builds citizen trust through transparency.
In short, OpenGov Encyclopedia enhances discovery and understanding while always honoring the single source of truth on the original .gov—exactly what citizens and agencies need in an AI-powered era.
Every page incorporates clear, consistent, USWDS-styled elements that are impossible to miss. These use proven patterns like alerts for notices and footers for reassurance, adapted to MediaWiki's capabilities:
Top banner (prominent notice)
- A high-visibility alert-style box appears at the top of every page.
- Text: This material is provided for background context and relationships only and is not the official record.
- Styled using a custom MediaWiki template with USWDS-inspired classes (added via modified skin CSS in MediaWiki:Common.css or the USWDS-integrated skin file). Includes ARIA attributes for accessibility (role="status" or role="region" with aria-label).
Direct source link
- Placed immediately below the banner or integrated into the Cargo infobox/header.
- Text: Official source of truth: [Primary .gov URL] — complete actions and verify details there.
- Rendered as a large, clickable USWDS-style button or link (e.g., usa-button usa-button--primary). Multiple sources are listed cleanly if applicable.
Action buttons
Every relevant section ends with directive buttons:
- Apply Now →
- Learn More / Eligibility Details →
- Take Action / Submit Application →
- Styled as USWDS usa-button (primary or outline variants). All point exclusively to the official agency site, USA.gov task page, or form endpoint—no internal completion paths.
Verification footer (persistent reassurance with dynamic Cargo data)
- At the bottom of every page, styled in the USWDS identifier/footer pattern.
- Uses a Cargo-powered template to display live, queryable metadata:
- Dual-AI verified from official sources
- Last verified: [timestamp from Cargo field]
- Confidence: [score from Cargo field, e.g., 98%]
- Always check primary .gov for authoritative information.
Cargo integration:
Verification fields (e.g., LastVerified=Date, Confidence=Float) are declared in the Cargo table. The footer template queries the current page's data (e.g., with |where=_pageName="{{FULLPAGENAME}}") to populate values dynamically.
This enables site-wide queries, such as sorting articles by verification date or confidence rating:
{{#cargo_query:
tables=VerificationMetadata
|fields=_pageName=Page, LastVerified=Last Verified, Confidence=Confidence
|order by=LastVerified DESC
|limit=50
}}
(Can be embedded on a dashboard or oversight page.)
All elements are automatic (baked into page templates and skin), consistent across the site, and meet WCAG 2.1 AA / Section 508 standards (high-contrast, keyboard-navigable, ARIA roles).
This design turns a potential concern into a strength.
- Citizens see plain-English, repeated messaging that makes the supplemental role crystal clear from the first second.
- Agencies retain full ownership: OpenGov Encyclopedia never competes—it acts as a helpful map that sends users straight to the right .gov destination, better informed and ready to act.
- Traffic funnels to primary sites: Prominent, task-oriented buttons improve completion rates on official pages.
- Trust is reinforced: Dynamic verification details in the footer show real-time quality (timestamp + confidence score), while the Cargo backend allows easy auditing and reporting on freshness and accuracy across thousands of pages.
Why Now?
LLMs are starving for clean, structured “ground truth.” Most RAG pipelines today scrape inconsistent .gov websites or parse PDFs, leading to frequent hallucinations and unreliable outputs.
OpenGov Encyclopedia closes this gap by providing:
- Precise, typed relationships (which agency sponsors which program? Which legislation authorizes it? What funding flows connect them?)
- Parametric search (e.g., “all active programs with >$50M funding related to arid land agriculture”)
- Real-time freshness from monitored sources
- Full audit trail and human attestation for compliance
Federal AI adoption is accelerating under OMB guidance and America's AI Action Plan — but fragmentation and poor data quality undermine it. Existing federal knowledge graph efforts (e.g., CDO Council's Fuels Knowledge Graph for wildland fire metrics, USGS GeoKB for geospatial semantics, NASA people/mission graphs) prove the power of structured relationships but remain domain-siloed or internal. OpenGov Encyclopedia unifies this at low cost: open-source stack (MediaWiki + Cargo), near-zero new infra, phased pilot in high-value areas (e.g., disaster resilience, small business/housing assistance — cross-agency priorities).
Complements the ecosystem
| Platform | Primary Role | How OpenGov Encyclopedia Complements It |
|-----------------------|---------------------------------------------------|-----------------------------------------------------------------------------|
| Agency websites | Primary authoritative content and services | Creates concise summaries + relationship maps; always links back to the original agency page as the source of truth |
| USA.gov | Citizen front door for navigation & task completion | Provides deep context and task-oriented discovery so users reach USA.gov (or agency sites) better informed and ready to act |
| Search.gov | On-site search across federal domains | Supplies structured entities and JSON-LD sitemaps for richer, more accurate results |
| USAspending.gov | Raw spending & award data | Adds narrative explanations and program relationships around the numbers |
Practical, Task-Oriented Value
The platform is organized around real tasks people want to accomplish, not just agency names. Examples of built-in topic/task pages include:
- Prepare for a disaster (links relevant FEMA, NOAA space weather, and HHS programs + direct USA.gov action links)
- Understand AI regulations and opportunities (cross-agency view of NIST standards, grant programs, and policy updates)
- Find housing or small-business assistance (structured program finder with sponsor, eligibility hints, and official application links)
- Research space weather impacts (connects NOAA monitoring, research programs, and emergency response frameworks)
Every task page gives context and relationships, then immediately directs users to the official agency or USA.gov page to complete the action.
Zero-Base Burden Strategy (80/20 Governance Model)
- 80% Automated Orchestration — Grok monitors a small, curated list of high-signal sources. When a change is detected, it auto-drafts the update and populates Cargo fields.
- 20% Human Attestation — Staff simply click “Approve” on the Clearance Dashboard. No writing required.
- Result — 90%+ reduction in manual labor while maintaining full federal control and compliance.
Cost & Next Steps
- Near-zero new infrastructure cost (MediaWiki + Cargo open-source; Grok already available via OneGov).
- Phased pilot on high-value task areas (AI initiatives, space weather, disaster preparedness, housing assistance) in weeks.
Next Steps (Executive-Actionable)
1. Proof of Concept Review — View the live “AI Policy & Space Weather Task” prototype running on MediaWiki + Cargo.
2. Feasibility Brief — 30-minute technical call with GSA OneGov leads to confirm Grok-to-wiki pipeline interoperability.
3. Governance Workshop — Define “Single Source of Truth” protocols that respect agency content ownership and prevent duplication.
Closing
OpenGov Encyclopedia is not another website — it is the clean, structured fuel for the federal AI ecosystem and a helpful map that respects agency ownership while helping citizens accomplish real tasks.
We are prepared to demonstrate the prototype and tailor the approach to your priorities.