OpenGov summary

From USApedia
Revision as of 20:29, 8 March 2026 by OpenBook (talk | contribs) (→‎Complements the ecosystem)

OpenGov Encyclopedia - Executive Summary / Sales Pitch

The U.S. federal government manages one of the world's largest and most complex organizational landscapes: thousands of agencies, sub-agencies, programs, authorizing statutes, funding flows, and cross-cutting initiatives. Existing public assets like USA.gov (citizen front door), Search.gov (federated search), and USAspending.gov (spending transparency) provide essential services—but they don't deliver the unified, machine-readable semantic layer that modern agency AI systems desperately need.

Agency LLMs and chatbots are currently "starving" for reliable ground truth. Most rely on scraping inconsistent .gov websites or parsing unstructured PDFs, leading to frequent hallucinations, fragmented answers, and reduced public trust in government digital services.

OpenGov Encyclopedia closes this critical gap as a supplemental, lightweight knowledge infrastructure — never a replacement or competitor to existing .gov platforms.

Core Dual Purpose

  • Citizen-centric interface: Wikipedia-style narrative pages (MediaWiki base + USWDS federal skin) organized around real tasks people want to accomplish (e.g., "Prepare for a Wildland Fire," "Access Housing Assistance," "Navigate Federal AI Opportunities & Regulations"). Each page provides clear context, relationships, and eligibility hints — then immediately directs users to the official agency or USA.gov destination to act.
  • API-first knowledge graph: Structured, queryable data (via Cargo extension) capturing precise typed relationships (e.g., "which agency sponsors this program?", "what legislation authorizes it?", "what funding connects them?"). This becomes high-quality "fuel" for agency RAG pipelines, reducing hallucinations and enabling parametric searches (e.g., "all active programs >$50M related to climate resilience").

Knowledge graph

OpenGov Encyclopedia uses MediaWiki (the same software that powers Wikipedia) combined with the Cargo extension to function as a lightweight knowledge graph. This setup provides both human-readable pages (like Wikipedia articles) and machine-readable, structured, queryable data — perfect for serving as a "truth layer" that feeds clean information to agency AI systems while helping citizens understand federal structures.

Here's a simple, step-by-step explanation for someone not familiar with these tools:

1. What is a Knowledge Graph? (Quick Basics)

A knowledge graph is like a smart map of information:

  • Nodes = things (entities), e.g., "FEMA", "Disaster Assistance Program", "Stafford Act".
  • Edges = relationships between them, e.g., "FEMA sponsors → Disaster Assistance Program", "Disaster Assistance Program is authorized by → Stafford Act", "Stafford Act funds flow to → multiple agencies".
  • The power comes from being able to ask questions across connections, like "Show me all programs authorized by laws passed after 2010 that FEMA sponsors and that have >$100M funding."

Traditional databases or spreadsheets can store facts, but knowledge graphs excel at revealing connections and enabling complex, relationship-based searches.
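The node-and-edge idea above can be sketched in a few lines of code. This is an illustrative toy (plain Python dictionaries with invented attribute names), not part of the platform:

```python
# Illustrative sketch: a tiny knowledge graph as plain Python dicts,
# showing how typed edges enable relationship queries.

# Nodes: entities keyed by name, with attributes.
nodes = {
    "FEMA": {"type": "Agency"},
    "Disaster Assistance Program": {"type": "Program",
                                    "funding_musd": 150, "status": "Active"},
    "Stafford Act": {"type": "Statute", "year": 1988},
}

# Edges: typed relationships between entities.
edges = [
    ("FEMA", "sponsors", "Disaster Assistance Program"),
    ("Disaster Assistance Program", "authorized_by", "Stafford Act"),
]

def programs_sponsored_by(agency, min_funding_musd=0):
    """Traverse 'sponsors' edges, then filter on node attributes."""
    return [
        target for source, rel, target in edges
        if source == agency and rel == "sponsors"
        and nodes[target].get("funding_musd", 0) >= min_funding_musd
    ]

print(programs_sponsored_by("FEMA", min_funding_musd=100))
# → ['Disaster Assistance Program']
```

The same pattern scales to the "laws passed after 2010" question: filter the `authorized_by` edges on the statute node's attributes.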

2. How MediaWiki + Cargo Creates This

  • MediaWiki handles the "Wikipedia-like" part:
    • Each federal entity gets its own page (e.g., a page called "Federal Emergency Management Agency" or "Wildfire Mitigation Grant Program").
    • Pages have readable narrative text, history, discussion tabs, and look official with a USWDS (U.S. Web Design System) skin.
    • This makes it citizen-friendly — people read summaries, see context, and get directed to official .gov links.
  • Cargo turns those pages into structured, graph-like data:
    • Templates act like fill-in-the-blank forms. For example, a "Federal Program" template might have fields like:
      • Program Name
      • Sponsoring Agency (links to the Agency page)
      • Authorizing Legislation (links to the Statute page)
      • Annual Funding Amount
      • Eligibility Summary
      • Primary Official URL (always points back to the real .gov site)
      • Status (Active / Proposed / Expired)
    • When someone (or the AI pipeline) fills in the template on a page, Cargo automatically stores the answers in database tables — one table per template type.
      • Example: All "Federal Program" templates feed into a single "Programs" table in the background, with columns matching the fields.

This is where the knowledge graph emerges:

  • Because fields can link to other pages (e.g., "Sponsoring Agency" contains a link to the "FEMA" page), Cargo knows relationships exist.
  • The system doesn't need fancy triple-store tech (like RDF/OWL in heavier graphs); it uses simple relational tables but supports joins, list fields, and hierarchy traversal to mimic graph behavior.
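Under the hood, Cargo stores template answers in ordinary SQL tables, so the "graph" behavior comes from joins. A minimal sketch using SQLite, with hypothetical table and column names rather than Cargo's actual schema:

```python
import sqlite3

# Sketch of how Cargo-style template data lands in relational tables.
# Table and column names are illustrative stand-ins, not Cargo's schema.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE Agencies (name TEXT, department TEXT)")
cur.execute("""CREATE TABLE Programs (
    name TEXT, sponsor TEXT, statute TEXT,
    funding_musd REAL, status TEXT)""")

cur.execute("INSERT INTO Agencies VALUES ('FEMA', 'DHS')")
cur.execute("INSERT INTO Programs VALUES "
            "('Disaster Assistance Program', 'FEMA', 'Stafford Act', 150, 'Active')")

# Graph-like traversal via a join: program -> sponsoring agency -> statute.
rows = cur.execute("""
    SELECT p.name, a.department, p.statute
    FROM Programs p JOIN Agencies a ON p.sponsor = a.name
    WHERE p.status = 'Active' AND p.funding_musd > 100
""").fetchall()
print(rows)
# → [('Disaster Assistance Program', 'DHS', 'Stafford Act')]
```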

3. Querying the Graph — The Real Power

Cargo lets anyone (humans or machines via API/JSON exports) run queries across the data. These queries reveal connections automatically.

Examples in OpenGov Encyclopedia context:

  • Simple lookup: "List all active programs sponsored by NOAA with funding >$50 million."
    • Cargo scans the "Programs" table, filters on Sponsor = "NOAA", Funding > 50M, Status = "Active".
  • Relationship traversal (graph-like):
    • "Show every program authorized by legislation containing '42 U.S.C.' that is sponsored by an agency in the Department of the Interior."
      • Joins the Programs table to Agencies table (via Sponsor field) and checks the Authorizing Legislation field.
  • Cross-entity discovery:
    • "Find all disaster-related programs connected to FEMA, including their authorizing laws and related agencies."
      • Uses joins and list fields (e.g., if a program has multiple linked agencies or statutes).
  • Dynamic pages:
    • A task-oriented page like "Prepare for Wildland Fire" can embed live query results: a table of relevant programs, pulled fresh from Cargo data, with links back to official sites.

Queries can output as:

  • Tables/lists on wiki pages
  • Maps (if coordinates are stored)
  • JSON/CSV exports (for feeding agency LLMs or dashboards)
  • Inline dynamic content (updates whenever source data changes)
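For machine consumers, the Cargo extension exposes its tables through MediaWiki's API (action=cargoquery). A sketch of how an agency pipeline might construct such a JSON export request; the wiki hostname and the table/field names are placeholders:

```python
from urllib.parse import urlencode

# Sketch: building a JSON export request against Cargo's query API.
# Hostname and table/field names are placeholders; the parameter names
# follow the Cargo extension's cargoquery module.
params = {
    "action": "cargoquery",
    "format": "json",
    "tables": "Programs",
    "fields": "_pageName=Program,Funding,Status",
    "where": "Status='Active' AND Funding > 50000000",
    "order_by": "Funding DESC",
    "limit": "50",
}
url = "https://example-wiki.gov/w/api.php?" + urlencode(params)
print(url)
```

The resulting JSON can be fed directly into a RAG pipeline or dashboard without any HTML scraping.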

4. Why This Counts as a Knowledge Graph (Even If Lightweight)

  • It has entities (pages/nodes) and typed relationships (via template fields and links).
  • You can traverse connections using joins, HOLDS (for lists), WITHIN (for hierarchies), etc.
  • It's queryable at scale — supports parametric searches agencies need for AI (e.g., "programs related to arid land agriculture with >$50M funding").
  • Data stays fresh and attested — tied to official .gov sources via the AI pipeline, with confidence scores and always-link-back banners.
  • Unlike heavier graphs (e.g., Neo4j or RDF stores used in some federal pilots), it's simple, open-source, low-maintenance, and integrated with readable wiki pages.

Safeguards

OpenGov Encyclopedia is engineered from the ground up to deliver authoritative, trustworthy, and neutral structured data as a supplemental truth layer—while strictly adhering to federal compliance, risk management, and public trust standards.

Authoritative sourcing only

All content ingestion is locked to verified official federal sources, eliminating external risks. This includes:

  • Public .gov websites from CISA's current-federal.csv whitelist (roughly 1,000 executive branch entries, covering major agencies like EPA, NASA, FEMA, and NOAA, plus their sub-agencies/subdomains such as airnow.gov or noaa.gov sub-sites)
  • Federal Register API (for regulations and notices)
  • eCFR (electronic Code of Federal Regulations)
  • USAspending.gov APIs (for funding and awards data)
  • Agency-specific databases and feeds (e.g., FEMA declarations, NOAA data portals, HHS TAGGS grants)

Note: While the U.S. Digital Registry was considered for social media validation, it has been deprecated since September 2024 and is no longer updated, so it is not incorporated.

Dual-AI pipeline

In the dual-AI pipeline used by OpenGov Encyclopedia, two large language models (LLMs) work together in a structured, collaborative process to create and check content. This setup is designed to produce accurate, reliable summaries, relationships, and structured data from official .gov sources while minimizing errors like hallucinations (where an AI invents details).

The pipeline has two main roles:

  • Generator AI
  • Verifier AI

Generator AI

  • This is the "creator" or "drafter" model.
  • It starts by reading the retrieved official content (e.g., text from a Federal Register notice, an agency program page, or USAspending data).
  • Using retrieval-augmented generation (RAG) techniques, it synthesizes that information into a draft:
    • Fills in the structured Cargo template fields (e.g., program name, sponsoring agency, authorizing legislation, funding amount).
    • Writes a concise narrative summary for the MediaWiki page.
    • Proposes relationships (e.g., "This program links to Statute X and Agency Y").
  • Its job is to be creative and comprehensive—turning raw source material into coherent, usable wiki content and graph data—while staying grounded in what was retrieved.

Verifier AI

  • This is the "checker" or "fact-checker" model.
  • It runs independently after the generator finishes its draft.
  • It goes through every part of the draft step-by-step:
    • Compares each claim, field value, and relationship directly against the original source documents.
    • Scores for factual accuracy (e.g., does the funding number match exactly?).
    • Checks citation completeness (is every key fact traceable?).
    • Evaluates logical consistency and neutrality (no unsupported assumptions or biased phrasing).
  • It gives an overall confidence score and flags any mismatches, gaps, or potential issues.
  • If both AIs agree at a high threshold (≥95% confidence), the draft auto-publishes as a new page revision.
  • If there's disagreement or low confidence, the item flags for quick human review (one-click approve/reject/retry on the Clearance Dashboard).
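The publish-or-escalate decision above reduces to a simple threshold gate. A sketch with the generator and verifier stubbed out; only the routing logic mirrors the text:

```python
# Sketch of the publish/escalate decision in the dual-AI pipeline.
# Generator/verifier calls are stubbed; the >=95% auto-publish rule
# comes from the text, the flag handling is an assumed detail.
PUBLISH_THRESHOLD = 0.95

def route_draft(draft: dict, confidence: float, flags: list) -> str:
    """Return the pipeline action for a verified draft."""
    if confidence >= PUBLISH_THRESHOLD and not flags:
        return "auto_publish"       # becomes a new MediaWiki revision
    return "escalate_to_dashboard"  # one-click human review

print(route_draft({"ProgramName": "Example"}, confidence=0.97, flags=[]))
# → auto_publish
print(route_draft({"ProgramName": "Example"}, confidence=0.80,
                  flags=["funding mismatch"]))
# → escalate_to_dashboard
```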

Why This Two-Step Approach?

  • A single AI can sometimes confidently produce wrong or invented details (a common issue in LLMs).
  • By having one model create and a different model critically review, the system catches more errors—studies on multi-agent or dual-LLM verification show significant reductions in hallucinations (often 60-90% in similar pipelines).
  • Alternating roles (e.g., Grok drafts one time, Gemini verifies; next time they swap) adds extra robustness by avoiding patterns from one model's weaknesses.
  • In OpenGov Encyclopedia, this keeps the process fast and mostly automated (~80-95% hands-off) while meeting federal needs for defensibility, traceability, and neutrality.

In short:

  • Generator → Builds the draft from official sources.
  • Verifier → Double-checks it rigorously before anything goes live.

Zero-base burden model

The zero-base burden model is the core operational philosophy of OpenGov Encyclopedia: design the system so that human effort is minimized to near-zero for routine operations, while still maintaining full federal control, compliance, and accountability. This approach draws from federal priorities for efficient, low-touch AI governance (as emphasized in OMB guidance like M-25-21 on accelerating AI adoption through innovation and reduced bureaucracy, and related 2025-2026 directives promoting agile, cost-effective AI deployment without unnecessary administrative overhead).

These features ensure full compliance with FOIA (easy retrieval of historical versions and decision trails) and NARA records management requirements (permanent, auditable preservation of changes without manual intervention).

This model delivers high freshness and broad coverage with virtually no ongoing manual workload — aligning with federal goals for efficient AI use (e.g., reducing bureaucratic barriers while preserving safeguards). It lets limited staff focus on strategic oversight rather than day-to-day maintenance, making OpenGov Encyclopedia sustainable and scalable across agencies. If piloted successfully, it could serve as a blueprint for other low-touch federal knowledge initiatives.

In practice, this means:

~80–95% fully automated processing  

The vast majority of content creation, updates, and maintenance happens without any human intervention. The active generator AI in the dual pipeline monitors official sources continuously. When a change is detected (e.g., a new program announcement in the Federal Register, an updated funding figure on USAspending.gov, or a revised agency page on a whitelisted .gov subdomain), the system automatically:

  • Triggers re-processing of the affected entity/page.
  • Retrieves the fresh content via RAG.
  • Generates a draft (filling Cargo fields and narrative text).
  • Runs it through the verifier AI for cross-check.
  • Publishes approved changes as a new MediaWiki revision if confidence thresholds are met.

This event-driven architecture ensures the knowledge graph stays current in near-real-time for high-signal changes (e.g., major legislation or funding updates), without scheduled batch jobs overwhelming resources.
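One lightweight way to implement the change detection behind this event-driven flow is to hash each fetched source and compare against the last-seen digest. This is an assumed implementation detail, sketched here, not a stated design choice:

```python
import hashlib

# Sketch of change detection for the event-driven monitor: hash the
# freshly fetched source text and compare against the last-seen digest.
# Hashing-as-change-signal is an assumption, not a documented design.
last_seen = {}

def source_changed(url: str, fetched_text: str) -> bool:
    digest = hashlib.sha256(fetched_text.encode("utf-8")).hexdigest()
    if last_seen.get(url) == digest:
        return False          # unchanged: no re-processing needed
    last_seen[url] = digest   # changed (or new): record and trigger
    return True

print(source_changed("https://www.fema.gov/program", "v1 text"))  # → True
print(source_changed("https://www.fema.gov/program", "v1 text"))  # → False
print(source_changed("https://www.fema.gov/program", "v2 text"))  # → True
```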

Daily gap scans for completeness  

A lightweight nightly automated scan identifies "missing" entities or gaps in existing ones (e.g., a new sub-agency subdomain appears in the CISA .gov inventory, or a program referenced in multiple sources but lacking a dedicated page). The pipeline proactively creates or enhances pages for these, starting with core verifiable fields. This builds out the inventory progressively without manual queues.

Progressive completeness  

Not every field needs to be perfect on day one. The system prioritizes:

  • Core fields (always populated if verifiable): Entity name, sponsoring agency, primary .gov link, status, and basic relationships — these form the reliable backbone of the knowledge graph.
  • Optional/enhanced fields (e.g., detailed eligibility criteria, historical funding trends, cross-program links): These fill in over time as additional source evidence emerges and verification confidence grows (e.g., from 80% → 98%). Low-confidence or unverified details are clearly marked (e.g., "Pending confirmation" or blank with a note), ensuring transparency rather than forcing incomplete rejection.

This "good enough to start, improve over time" strategy maximizes coverage quickly while upholding accuracy.
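The confidence-based field handling can be sketched as a small rendering rule; the 80% display threshold below is borrowed from the example above and is purely illustrative:

```python
# Sketch of progressive completeness: show a field value only when its
# verification confidence clears a (hypothetical) display threshold.
DISPLAY_THRESHOLD = 0.80

def render_field(value, confidence: float) -> str:
    if value is None:
        return ""                      # optional field left blank
    if confidence < DISPLAY_THRESHOLD:
        return "Pending confirmation"  # transparent low-confidence marker
    return value

print(render_field("$150M annual", 0.98))  # → $150M annual
print(render_field("$150M annual", 0.60))  # → Pending confirmation
```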

Human involvement strictly limited to <5% escalations  

Humans (authorized federal staff) are only involved in exceptional cases:

  • Verifier flags a discrepancy or low confidence on a high-impact item (e.g., a major program change affecting public services).
  • Random audit samples for oversight.
  • Edge cases like ambiguous source data.

Escalations route to a simple Clearance Dashboard — a custom MediaWiki special page or integrated tool — where staff review side-by-side diffs (draft vs. sources), then click one button: Approve, Reject, or Request Retry (with optional note). No writing, editing, or content creation is required from humans. This keeps the burden minimal (often 1-2 minutes per case) and scalable even as the graph grows to thousands of entities.

No manual writing required  

Federal staff never draft, rewrite, or curate text/narrative. All content originates from AI synthesis of official sources, verified through the dual pipeline. This eliminates the traditional "content team" workload that plagues many government wikis or databases.

Built-in compliance and traceability features  

  • Immutable MediaWiki revisions: Every published version is permanently stored with timestamps, attribution (e.g., "GrokBot" or "GeminiBot" username for AI contributions), and diffs.
  • Cargo data snapshots: Structured fields are versioned alongside pages, preserving historical states for queries or audits.
  • Signed audit logs: All automated actions (ingestion, generation, verification, publish) are logged with digital signatures, timestamps, and source references — fully queryable and exportable.
  • MediaWiki page history: Open to public view (or restricted as needed), allowing anyone to see the complete change timeline, compare versions, and understand evolution over time.
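A signed audit-log entry might look like the following sketch, which uses HMAC-SHA256 as a stand-in for whatever signing scheme the real system would adopt; the field names and key handling are illustrative only:

```python
import hashlib, hmac, json

# Sketch of a signed audit-log entry. HMAC-SHA256 with a shared key is
# a stand-in; field names are hypothetical. Key management is omitted.
SIGNING_KEY = b"replace-with-managed-secret"

def signed_log_entry(action: str, page: str, source: str, ts: str) -> dict:
    entry = {"action": action, "page": page, "source": source, "timestamp": ts}
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    entry["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return entry

def verify_entry(entry: dict) -> bool:
    """Recompute the signature over everything except the signature itself."""
    unsigned = {k: v for k, v in entry.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(entry["signature"], expected)

entry = signed_log_entry("publish", "Wildfire Mitigation Grant Program",
                         "https://www.fema.gov/", "2026-03-08T20:29Z")
print(verify_entry(entry))  # → True
```

Any tampering with a logged field invalidates the signature, which is what makes the trail auditable.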

Always defers to originals

OpenGov Encyclopedia is built on the principle that it is never the authoritative source. Every page explicitly directs users back to the original federal .gov site(s) for verification, actions, applications, or any official purpose. This respects agency ownership, prevents confusion or duplication, and builds citizen trust through transparency.

In short, OpenGov Encyclopedia enhances discovery and understanding while always honoring the single source of truth on the original .gov—exactly what citizens and agencies need in an AI-powered era.

Every page incorporates clear, consistent, USWDS-styled elements that are impossible to miss. These use proven patterns like alerts for notices and footers for reassurance, adapted to MediaWiki's capabilities:

Top banner (prominent notice)

  • A high-visibility alert-style box appears at the top of every page.
  • Text: This material is provided for background context and relationships only and is not the official record.
  • Styled using a custom MediaWiki template with USWDS-inspired classes (added via modified skin CSS in MediaWiki:Common.css or the USWDS-integrated skin file). Includes ARIA attributes for accessibility (role="status" or role="region" with aria-label).

Direct source link

  • Placed immediately below the banner or integrated into the Cargo infobox/header.
  • Text: Official source of truth: [Primary .gov URL] — complete actions and verify details there.
  • Rendered as a large, clickable USWDS-style button or link (e.g., usa-button usa-button--primary). Multiple sources are listed cleanly if applicable.

Action buttons

Every relevant section ends with directive buttons:

  • Apply Now →
  • Learn More / Eligibility Details →
  • Take Action / Submit Application →
  • Styled as USWDS usa-button (primary or outline variants). All point exclusively to the official agency site, USA.gov task page, or form endpoint—no internal completion paths.

Footer (persistent reassurance with dynamic Cargo data)

  • At the bottom of every page, styled in the USWDS identifier/footer pattern.
  • Uses a Cargo-powered template to display live, queryable metadata:
    • Dual-AI verified from official sources
    • Last verified: [timestamp from Cargo field]
    • Confidence: [score from Cargo field, e.g., 98%]
    • Always check primary .gov for authoritative information.

Cargo integration:

Verification fields (e.g., LastVerified=Date, Confidence=Float) are declared in the Cargo table. The footer template queries the current page’s data (using |where=_pageName="{{PAGENAME}}") to populate values dynamically.

This enables site-wide queries, such as sorting articles by verification date or confidence rating:

{{#cargo_query:
tables=VerificationMetadata
|fields=_pageName=Page, LastVerified=Last Verified, Confidence=Confidence
|order by=LastVerified DESC
|limit=50
}}

(Can be embedded on a dashboard or oversight page.)

All elements are automatic (baked into page templates and skin), consistent across the site, and meet WCAG 2.1 AA / Section 508 standards (high-contrast, keyboard-navigable, ARIA roles).

This design turns a potential concern into a strength.

  • Citizens see plain-English, repeated messaging that makes the supplemental role crystal clear from the first second.
  • Agencies retain full ownership: OpenGov Encyclopedia never competes—it acts as a helpful map that sends users straight to the right .gov destination, better informed and ready to act.
  • Traffic funnels to primary sites: Prominent, task-oriented buttons improve completion rates on official pages.
  • Trust is reinforced: Dynamic verification details in the footer show real-time quality (timestamp + confidence score), while the Cargo backend allows easy auditing and reporting on freshness and accuracy across thousands of pages.

Why Now?  

Large language models (LLMs) and agency AI systems are starving for clean, structured “ground truth” data. Most retrieval-augmented generation (RAG) pipelines today rely on scraping inconsistent .gov websites, parsing outdated PDFs, or pulling from fragmented sources. This leads to frequent hallucinations, unreliable outputs, incomplete answers, and eroded public trust in government digital services.

OpenGov Encyclopedia closes this critical gap right when federal AI adoption is accelerating dramatically. Under President Trump's leadership, the White House released **America's AI Action Plan** ("Winning the Race: America's AI Action Plan") in July 2025, following Executive Order 14179 ("Removing Barriers to American Leadership in Artificial Intelligence") in January 2025. This national strategy—built on pillars of accelerating innovation, building AI infrastructure, and leading in international diplomacy and security—directs aggressive federal action to drive AI dominance, including faster adoption across government.

Key enablers include:

  • OMB Memorandum M-25-21 (April 2025): "Accelerating Federal Use of AI through Innovation, Governance, and Public Trust," which rescinds prior restrictive guidance, empowers agencies to innovate responsibly, remove barriers, develop AI strategies, and prioritize efficient AI deployment while maintaining safeguards for privacy, civil rights, and public trust.
  • Related OMB memos (e.g., M-25-22 on efficient AI acquisition, M-26-04 on unbiased AI principles) and initiatives like GSA's USAi platform (launched August 2025) to provide secure, no-cost AI tools government-wide.

These policies create urgency: agencies are now required to build AI maturity, pilot high-impact uses, procure unbiased models, and scale AI for better public services—yet fragmentation and poor-quality data (messy websites, unstructured PDFs) undermine progress and risk hallucinations in mission-critical applications.

Existing federal knowledge graph efforts demonstrate the power of structured relationships but remain limited:

  • CDO Council’s Fuels Knowledge Graph (wildland fire metrics, interagency performance).
  • USGS GeoKB (geospatial semantics and topographic data integration).
  • NASA people/mission graphs (skills discovery, workforce planning).

These are valuable but domain-siloed, internal-focused, or narrow in scope—none provide a unified, citizen-facing, API-first semantic layer that feeds clean data to dozens of agency LLMs while always deferring to originals.

OpenGov Encyclopedia unifies and scales this capability at **near-zero incremental cost**:

  • Built on proven open-source stack (MediaWiki + Cargo).
  • Leverages existing GSA OneGov agreement for Grok orchestration.
  • No new infrastructure needed.
  • Phased pilot in high-value, cross-agency areas (e.g., disaster resilience and climate adaptation, small business/housing assistance programs) can launch in weeks, delivering immediate wins for AI accuracy and citizen task support.

The timing is perfect: with America's AI Action Plan and OMB guidance pushing rapid, responsible adoption, OpenGov Encyclopedia provides the missing "ground truth" fuel—precise typed relationships, parametric search capabilities (e.g., “all active programs with >$50M funding related to arid land agriculture”), real-time freshness from monitored official sources, and full audit trails/human attestation for compliance—enabling agencies to move faster without the risks of bad data. This is the moment to bridge the gap and turn federal AI potential into reliable reality.

Complements the ecosystem

OpenGov Encyclopedia is explicitly designed as a supplemental layer—not a replacement or competitor. It respects the distinct roles of existing federal digital assets and actively enhances them by providing context, relationships, and structured data that make those platforms more effective for both citizens and agency AI systems.

  • Agency websites
    • Primary role: Primary authoritative content and services.
    • How OpenGov Encyclopedia complements it: Creates concise summaries and cross-agency relationship maps; always links back to the original agency page as the single source of truth. Users get the "big picture" quickly, then move directly to the agency site for official details and actions.
  • USA.gov
    • Primary role: Citizen front door for navigation and task completion.
    • How OpenGov Encyclopedia complements it: Provides deep context and task-oriented discovery (e.g., cross-program views for "Prepare for a Disaster" or "Find Housing Assistance"). Users arrive at USA.gov (or linked agency pages) better informed, more confident, and ready to complete tasks faster—boosting overall completion rates and satisfaction.
  • Search.gov
    • Primary role: Federated search across federal domains.
    • How OpenGov Encyclopedia complements it: Supplies structured entities, typed relationships, and JSON-LD structured data/sitemaps. This enables richer, more precise search results (e.g., better entity recognition, relationship-based ranking, and reduced irrelevant hits) across the entire .gov ecosystem.
  • USAspending.gov
    • Primary role: Raw spending and award data.
    • How OpenGov Encyclopedia complements it: Layers narrative explanations, program context, authorizing legislation, and funding-flow relationships around the raw numbers. Citizens and analysts understand "why" and "how" the dollars connect to programs and agencies, turning data into actionable insight without duplicating the source.
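The JSON-LD structured data mentioned for Search.gov could look like the following sketch. Schema.org's GovernmentService and GovernmentOrganization types are real; the specific property choices and values here are illustrative:

```python
import json

# Sketch of JSON-LD the encyclopedia could emit for federated search.
# Entity names and the URL are illustrative placeholders.
doc = {
    "@context": "https://schema.org",
    "@type": "GovernmentService",
    "name": "Disaster Assistance Program",
    "provider": {"@type": "GovernmentOrganization", "name": "FEMA"},
    "url": "https://www.fema.gov/assistance",  # always the official .gov page
}
print(json.dumps(doc, indent=2))
```

Embedding such blocks in page headers lets Search.gov and external crawlers recognize entities and relationships without scraping narrative text.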

Key benefits of this complementary approach

  • No duplication of effort or content: Every page defers explicitly to originals—OpenGov Encyclopedia adds value through synthesis and connectivity, never recreating primary services or data.
  • Improved citizen experience: Users discover related programs, understand connections, and navigate faster because the encyclopedia acts as a "smart map" that points them to the right official destination.
  • Better AI performance across government: By feeding clean, structured, attested data (with full provenance), it reduces hallucinations in agency chatbots, RAG pipelines, and internal tools that pull from or reference USA.gov, Search.gov, or agency sites.
  • Ecosystem synergy: Structured outputs (e.g., JSON-LD, Cargo queries) make federal search and navigation smarter; narrative context makes raw data (like USAspending) more understandable; cross-agency views make siloed agency sites feel more cohesive.

In essence, OpenGov Encyclopedia amplifies what already works in the federal digital ecosystem—USA.gov as the welcoming front door, agency sites as the authoritative homes, Search.gov as the discovery engine, and USAspending.gov as the transparency ledger—while filling the one missing piece: a lightweight, unified, machine- and human-readable layer of context and relationships that makes everything more useful, accurate, and trustworthy.

Practical, Task-Oriented Value  

The platform is organized around real tasks people want to accomplish, not just agency names. Examples of built-in topic/task pages include:

- Prepare for a disaster (links relevant FEMA, NOAA space weather, and HHS programs + direct USA.gov action links)

- Understand AI regulations and opportunities (cross-agency view of NIST standards, grant programs, and policy updates)

- Find housing or small-business assistance (structured program finder with sponsor, eligibility hints, and official application links)

- Research space weather impacts (connects NOAA monitoring, research programs, and emergency response frameworks)

Every task page gives context and relationships, then immediately directs users to the official agency or USA.gov page to complete the action.

Zero-Base Burden Strategy (80/20 Governance Model)  

  • 80% Automated Orchestration — Grok monitors a small, curated list of high-signal sources. When a change is detected, it auto-drafts the update and populates Cargo fields.
  • 20% Human Attestation — Staff simply click “Approve” on the Clearance Dashboard. No writing required.
  • Result — 90%+ reduction in manual labor while maintaining full federal control and compliance.

Cost & Next Steps  

- Near-zero new infrastructure cost (MediaWiki + Cargo open-source; Grok already available via OneGov).

- Phased pilot on high-value task areas (AI initiatives, space weather, disaster preparedness, housing assistance) in weeks.

Next Steps (Executive-Actionable)  

1. Proof of Concept Review — View the live “AI Policy & Space Weather Task” prototype running on MediaWiki + Cargo.  

2. Feasibility Brief — 30-minute technical call with GSA OneGov leads to confirm Grok-to-wiki pipeline interoperability.  

3. Governance Workshop — Define “Single Source of Truth” protocols that respect agency content ownership and prevent duplication.

Closing

OpenGov Encyclopedia is not another website — it is the clean, structured fuel for the federal AI ecosystem and a helpful map that respects agency ownership while helping citizens accomplish real tasks.

We are prepared to demonstrate the prototype and tailor the approach to your priorities.