Technical Document: How OpenGov Encyclopedia Would Work
1. High-Level Architecture
External Sources (Public Only)
├── Federal Register API / RSS
├── USAspending.gov V2 API
├── Curated high-signal agency pages (~200–300 URLs)
└── Agency-submitted URLs/PDFs (via simple form)
↓ (scheduled Lambda or webhook)
Orchestration Layer (AWS GovCloud Lambda or similar)
├── Grok (GSA OneGov API – inherited FedRAMP controls)
├── Optional secondary FedRAMP LLM (consistency check)
├── Cargo validation rules + Redis cache
└── Clearance Dashboard (simple internal MediaWiki page or lightweight app)
↓ (only approved changes)
MediaWiki Core (FedRAMP-authorized hosting)
├── Citizen-Centric Pages (USWDS-integrated skin)
├── Cargo tables (API-first knowledge graph)
├── MediaWiki API (for Grok bot edits)
└── Audit / revision tags table
2. Seeding the Initial Content (Phase 1: Weeks 1–4)
Step 1: Infrastructure Setup (Week 1)
- Deploy MediaWiki + Cargo + USWDS-aligned skin on FedRAMP hosting.
- Define core Cargo tables (Agency, Program, Organization, Topic).
- Configure Grok bot account via GSA OneGov.
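The table definitions in Step 1 could be generated rather than typed by hand. The sketch below renders Cargo `#cargo_declare` calls from a small schema. The column names are taken from the fields listed in the seeding prompts later in this document; the Cargo field types and the omission of Organization and Topic are assumptions made for illustration, not a settled schema.

```python
from typing import Dict

# Column names follow the fields named in the seeding prompts; the Cargo
# types are assumptions, and Organization/Topic are omitted for brevity.
CORE_TABLES: Dict[str, Dict[str, str]] = {
    "Agency": {
        "name": "String",
        "parent": "Page",
        "mission_summary": "Text",
        "website": "URL",
        "sub_components": "List (,) of String",
        "leadership": "List (,) of String",
    },
    "Program": {
        "sponsor": "Page",
        "purpose": "Text",
        "start_date": "Date",
        "duration": "String",
        "funding": "String",
        "related_agencies": "List (,) of Page",
    },
}


def cargo_declare(table: str, fields: Dict[str, str]) -> str:
    """Render a single-line #cargo_declare call for one table."""
    parts = [f"_table={table}"]
    parts.extend(f"{name}={ctype}" for name, ctype in fields.items())
    return "{{#cargo_declare:" + "|".join(parts) + "}}"


if __name__ == "__main__":
    for table, fields in CORE_TABLES.items():
        print(cargo_declare(table, fields))
```

In Cargo, a declaration like this normally lives in the template that backs each page type, so the generated text would be pasted into (or written by a bot to) the corresponding template page.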
Step 2: Seed Agencies & Major Organizations (Weeks 2–3)
- Use the official USA.gov A-Z Agency Index + agency “About” pages as the seed list (~150–200 URLs).
- Grok batch job (one-time run):
  - Prompt: “From this official agency page, extract: name, parent, mission summary, website, key sub-components, leadership. Output valid wikitext + Cargo fields. Cite source URL. Confidence score required.”
  - Grok creates pages + populates Cargo tables.
- All items route to the Clearance Dashboard for batch attestation; high-confidence items that pass validation are auto-passed rather than reviewed one by one.
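The routing rule in Step 2 can be sketched as a small validation function. Everything specific here is an assumption: the source does not say that Grok returns JSON, which fields are mandatory, what scale the confidence score uses, or where the auto-pass threshold sits. The sketch assumes a JSON object, a score in [0, 1], and a hypothetical threshold of 0.9.

```python
import json

# Hypothetical required keys and threshold; the document does not fix either.
REQUIRED_FIELDS = ("name", "website", "source_url", "confidence")
AUTO_PASS_THRESHOLD = 0.9


def route_draft(raw: str) -> str:
    """Return 'auto_pass', 'manual_review', or 'rejected' for one model draft."""
    try:
        draft = json.loads(raw)
    except json.JSONDecodeError:
        return "rejected"
    if not isinstance(draft, dict):
        return "rejected"

    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if field not in draft or draft[field] in ("", None):
            return "rejected"

    # A draft without a usable citation never reaches reviewers.
    if not str(draft["source_url"]).startswith("https://"):
        return "rejected"

    try:
        confidence = float(draft["confidence"])
    except (TypeError, ValueError):
        return "rejected"
    if not 0.0 <= confidence <= 1.0:
        return "rejected"

    if confidence >= AUTO_PASS_THRESHOLD:
        return "auto_pass"
    return "manual_review"
```

Under these assumptions, both "auto_pass" and "manual_review" items would still appear on the Clearance Dashboard; the label only decides whether a reviewer attests them in bulk or individually.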
Step 3: Seed Programs & Initiatives (Weeks 3–4)
- Sources: Federal Register notices, USAspending.gov API, curated agency program pages.
- Grok prompt: “Create Program page from this source. Fill Cargo: sponsor, purpose, start_date, duration, funding, related agencies. Output wikitext + Cargo. Cite source.”
- Review via Clearance Dashboard.
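The "wikitext + Cargo" output for a Program page might look like the result of the sketch below, which renders a `#cargo_store` call from a dictionary of extracted fields. The field names come from the prompt above; the example values are made up, and calling `#cargo_store` directly (rather than through a Program template that wraps it) is a simplification.

```python
from typing import Mapping


def escape_value(value: str) -> str:
    """Escape pipes so a value cannot be split into extra arguments."""
    # {{!}} is MediaWiki's built-in way to write a literal pipe character.
    return value.strip().replace("|", "{{!}}")


def cargo_store(table: str, row: Mapping[str, str]) -> str:
    """Render a single-line #cargo_store call for one row."""
    parts = [f"_table={table}"]
    parts.extend(f"{field}={escape_value(str(value))}" for field, value in row.items())
    return "{{#cargo_store:" + "|".join(parts) + "}}"


if __name__ == "__main__":
    # Placeholder values for illustration only, not real program data.
    example = {
        "sponsor": "Example Agency",
        "purpose": "Illustrative purpose text",
        "start_date": "2025-01-01",
    }
    print(cargo_store("Program", example))
```

This sketch only guards against stray pipes; values containing template braces would need further escaping before a bot writes them to a page.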
Step 4: Seed Task/Topic Pages (Week 4)
- Grok works from the seeded data, using a prompt such as: “Create ‘Prepare for a Disaster’ task page linking relevant programs/agencies. Add plain-language explanation and official USA.gov links.”
- Links always point back to agency/USA.gov as source of truth.
3. Ongoing Updates (Post-Week 4)
- A daily Grok orchestration run monitors the curated high-signal list, the Federal Register, and the USAspending.gov API.
- Change detected → Grok proposes update → Clearance Dashboard.
- Agency staff forward URLs via form → Grok processes → attestation.
- No broad scraping: only targeted, public, high-value pages are monitored.
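One simple way to implement the "change detected" step is to compare content fingerprints between daily runs. The sketch below assumes page text has already been fetched by the orchestration layer and that the previous run's fingerprints are stored somewhere (the Redis cache from the architecture would be a natural place, though that is an assumption). The function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace, so reflowed markup is ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous: Dict[str, str], current_pages: Dict[str, str]) -> List[str]:
    """Return URLs whose content is new or differs from the stored fingerprint.

    previous:      URL -> fingerprint saved by the last run
    current_pages: URL -> page text fetched in this run
    """
    changed = []
    for url, body in current_pages.items():
        if previous.get(url) != fingerprint(body):
            changed.append(url)
    return changed
```

Only the URLs returned by `detect_changes` would be handed to Grok for a proposed update, which keeps daily model usage proportional to actual changes on the curated list.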
4. Technical Safeguards
- Scalability — Cargo for narrative/relationships; PostgreSQL offload for high-volume numerics if needed.
- Auditing — Mandatory revision tags (source URL, timestamp, Grok score, attestation ID) in separate table.
- Transparency — Page badges show “AI-drafted – human-attested” + Trust Score tooltip.
- Accessibility — Early WCAG 2.2 AA audit; ARIA landmarks.
- Security — FedRAMP hosting, inherited controls from OneGov.
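The auditing safeguard can be sketched as an immutable record plus a publish gate. The four fields mirror the revision tags listed above; the score range of [0, 1], the HTTPS check, and the identifier format are assumptions, as are all function names.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RevisionTag:
    source_url: str
    timestamp: str                        # ISO 8601, UTC
    grok_score: float                     # assumed to lie in [0, 1]
    attestation_id: Optional[str] = None  # filled in once a human attests


def new_tag(source_url: str, grok_score: float) -> RevisionTag:
    """Create the tag for a freshly drafted, not yet attested revision."""
    now = datetime.now(timezone.utc).isoformat()
    return RevisionTag(source_url=source_url, timestamp=now, grok_score=grok_score)


def attest(tag: RevisionTag, attestation_id: str) -> RevisionTag:
    """Return a copy of the tag carrying the reviewer's attestation ID."""
    return replace(tag, attestation_id=attestation_id)


def is_publishable(tag: RevisionTag) -> bool:
    """Only cited, scored, human-attested revisions may reach MediaWiki."""
    return (
        tag.source_url.startswith("https://")
        and 0.0 <= tag.grok_score <= 1.0
        and bool(tag.attestation_id)
    )
```

For example, `is_publishable(new_tag("https://www.usa.gov/agency-index", 0.92))` is false until `attest` has been applied, which matches the "only approved changes" arrow in the architecture.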
This setup lets OpenGov Encyclopedia start small, grow incrementally, and stay accurate without competing with agency sites or USA.gov: it only adds context and connections that drive users back to the official sources.