OpenGov Technical: Difference between revisions
Latest revision as of 14:21, 4 March 2026
Technical Document: How OpenGov Encyclopedia Would Work
1. High-Level Architecture
External Sources (Public Only)
├── Federal Register API / RSS
├── USAspending.gov V2 API
├── Curated high-signal agency pages (~200–300 URLs)
└── Agency-submitted URLs/PDFs (via simple form)
↓ (scheduled Lambda or webhook)
Orchestration Layer (AWS GovCloud Lambda or similar)
├── Grok (GSA OneGov API – inherited FedRAMP controls)
├── Optional secondary FedRAMP LLM (consistency check)
├── Cargo validation rules + Redis cache
└── Clearance Dashboard (simple internal MediaWiki page or lightweight app)
↓ (only approved changes)
MediaWiki Core (FedRAMP-authorized hosting)
├── Citizen-Centric Pages (USWDS-integrated skin)
├── Cargo tables (API-first knowledge graph)
├── MediaWiki API (for Grok bot edits)
└── Audit / revision tags table
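The "only approved changes" gate between the orchestration layer and MediaWiki can be sketched as follows. This is a minimal illustration, not the project's implementation: `Proposal`, `ClearanceQueue`, `publish`, and the `write_page` callback are hypothetical names standing in for the Clearance Dashboard and the bot's edit call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Proposal:
    """One AI-drafted change waiting for human attestation (hypothetical shape)."""
    page_title: str
    wikitext: str
    source_url: str
    approved: bool = False


@dataclass
class ClearanceQueue:
    """Stand-in for the Clearance Dashboard: holds proposals until attested."""
    pending: List[Proposal] = field(default_factory=list)

    def submit(self, proposal: Proposal) -> None:
        self.pending.append(proposal)

    def release_approved(self) -> List[Proposal]:
        # Hand over approved proposals and keep everything else pending.
        approved = [p for p in self.pending if p.approved]
        self.pending = [p for p in self.pending if not p.approved]
        return approved


def publish(proposals: List[Proposal],
            write_page: Callable[[str, str], None]) -> int:
    """Write each approved proposal through the supplied edit callback."""
    for proposal in proposals:
        write_page(proposal.page_title, proposal.wikitext)
    return len(proposals)
```

In this sketch nothing reaches `write_page` unless a reviewer has set `approved`, which mirrors the arrow labelled "only approved changes" in the diagram.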
2. Seeding the Initial Content (Phase 1: Weeks 1–4)
Step 1: Infrastructure Setup (Week 1)
- Deploy MediaWiki + Cargo + USWDS-aligned skin on FedRAMP hosting.
- Define core Cargo tables (Agency, Program, Organization, Topic).
- Configure Grok bot account via GSA OneGov.
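As a rough sketch of the table-definition step, the snippet below generates Cargo `#cargo_declare` calls for two of the core tables. The field names follow the extraction prompts in Steps 2 and 3; the field types and the `CORE_TABLES` layout are assumptions for illustration, and the real schema may differ.

```python
# Hypothetical field lists; types are assumed, not taken from the source.
CORE_TABLES = {
    "Agency": {
        "name": "String",
        "parent": "Page",
        "mission_summary": "Text",
        "website": "URL",
    },
    "Program": {
        "sponsor": "Page",
        "purpose": "Text",
        "start_date": "Date",
        "funding": "Float",
    },
}


def cargo_declare(table: str, fields: dict) -> str:
    """Build the #cargo_declare parser-function call for one table."""
    parts = [f"_table={table}"]
    parts += [f"{name}={ftype}" for name, ftype in fields.items()]
    return "{{#cargo_declare:" + "|".join(parts) + "}}"


if __name__ == "__main__":
    for table_name, table_fields in CORE_TABLES.items():
        print(cargo_declare(table_name, table_fields))
```

Each generated call would normally sit in the corresponding template page, with the matching `#cargo_store` call filling the table when a page uses that template.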
Step 2: Seed Agencies & Major Organizations (Weeks 2–3)
- Use the official USA.gov A-Z Agency Index + agency “About” pages as the seed list (~150–200 URLs).
- Grok batch job (one-time run):
- Prompt: “From this official agency page, extract: name, parent, mission summary, website, key sub-components, leadership. Output valid wikitext + Cargo fields. Cite source URL. Confidence score required.”
- Grok creates pages + populates Cargo tables.
- All items route to the Clearance Dashboard for batch attestation (high-confidence items auto-pass once they clear validation).
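The routing rule in this step can be sketched as below. The threshold value, the required-field list, and the lane names are assumptions made for illustration; the source only says that a confidence score is required and that high-confidence items auto-pass after validation.

```python
from dataclasses import dataclass

# Assumed for this sketch: fields a draft must carry before it can be attested.
REQUIRED_FIELDS = ("name", "mission_summary", "website", "source_url")
# Hypothetical cut-off; the source does not state a number.
AUTO_PASS_THRESHOLD = 0.9


@dataclass
class Draft:
    fields: dict
    confidence: float


def missing_fields(draft: Draft) -> list:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not draft.fields.get(name)]


def route(draft: Draft) -> str:
    """Pick a Clearance Dashboard lane for one draft."""
    if missing_fields(draft):
        return "returned"        # failed validation, goes back for a redraft
    if draft.confidence >= AUTO_PASS_THRESHOLD:
        return "auto_pass"       # eligible for batch attestation
    return "manual_review"       # valid, but a reviewer reads it individually
```

Validation runs before the confidence check, so a high score never lets an incomplete draft through.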
Step 3: Seed Programs & Initiatives (Weeks 3–4)
- Sources: Federal Register notices, USAspending.gov API, curated agency program pages.
- Grok prompt: “Create Program page from this source. Fill Cargo: sponsor, purpose, start_date, duration, funding, related agencies. Output wikitext + Cargo. Cite source.”
- Review via Clearance Dashboard.
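One way to check a Program draft before it reaches review is to validate the Cargo fields named in the prompt and render them as a template call. The sketch below assumes a template called `Program`, an ISO 8601 `start_date`, and a `related_agencies` field name; none of these are confirmed by the source.

```python
from datetime import date

# Field names taken from the prompt above; related_agencies spelling is assumed.
PROGRAM_FIELDS = ("sponsor", "purpose", "start_date",
                  "duration", "funding", "related_agencies")


def render_program_call(record: dict, source_url: str) -> str:
    """Validate a Program record and render it as a wikitext template call."""
    missing = [name for name in PROGRAM_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Raises ValueError if start_date is not an ISO date such as 2026-01-15.
    date.fromisoformat(record["start_date"])
    lines = ["{{Program"]
    lines += [f"|{name}={record[name]}" for name in PROGRAM_FIELDS]
    lines.append(f"|source_url={source_url}")
    lines.append("}}")
    return "\n".join(lines)
```

Values containing a pipe character would need escaping before being placed in a template call; the sketch leaves that out for brevity.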
Step 4: Seed Task/Topic Pages (Week 4)
- Grok uses seeded data: “Create ‘Prepare for a Disaster’ task page linking relevant programs/agencies. Add plain-language explanation and official USA.gov links.”
- Links always point back to agency/USA.gov as source of truth.
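The rule that links point back to official sources could be enforced with a simple host check before a task page is submitted. The sketch treats "official" as an HTTPS URL on a `.gov` or `.mil` host; that definition is an assumption, and a curated allow-list may be more appropriate in practice.

```python
from urllib.parse import urlparse

# Assumed definition of an official host for this sketch.
OFFICIAL_SUFFIXES = (".gov", ".mil")


def is_official(url: str) -> bool:
    """True when the URL uses HTTPS and its host ends in an official suffix."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host.endswith(OFFICIAL_SUFFIXES)


def unofficial_links(urls: list) -> list:
    """Return the links a reviewer should look at before attestation."""
    return [url for url in urls if not is_official(url)]
```

Because the check compares the end of the hostname, a look-alike such as `usa.gov.example.com` is flagged rather than accepted.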
3. Ongoing Updates (Post-Week 4)
- Daily Grok orchestration monitors curated high-signal list + Federal Register + USAspending API.
- Change detected → Grok proposes update → Clearance Dashboard.
- Agency staff forward URLs via form → Grok processes → attestation.
- No broad scraping — only targeted, public, high-value pages.
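Change detection over the curated list can be as simple as comparing content fingerprints between runs. The sketch below is one possible approach under stated assumptions: page text has already been fetched, and `fingerprint` and `detect_changes` are hypothetical helper names.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hash page text after collapsing whitespace, so reflowed text is not a change."""
    normalized = " ".join(content.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current_pages: dict) -> list:
    """Return URLs whose content is new or differs from the stored fingerprint.

    previous maps URL -> fingerprint from the last run;
    current_pages maps URL -> freshly fetched page text.
    """
    changed = []
    for url, content in current_pages.items():
        if previous.get(url) != fingerprint(content):
            changed.append(url)
    return changed
```

Only the URLs returned here would be passed on for a proposed update, which keeps the daily run limited to pages that actually changed.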
4. Technical Safeguards
- Scalability — Cargo for narrative/relationships; PostgreSQL offload for high-volume numerics if needed.
- Auditing — Mandatory revision tags (source URL, timestamp, Grok score, attestation ID) in separate table.
- Transparency — Page badges show “AI-drafted – human-attested” + Trust Score tooltip.
- Accessibility — Early WCAG 2.2 AA audit; ARIA landmarks.
- Security — FedRAMP hosting, inherited controls from OneGov.
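The mandatory revision tags listed under Auditing could be modelled as an immutable record that refuses to be created without its required parts. The field names mirror the list above; the `RevisionTag` class, the `make_tag` helper, and the 0 to 1 score range are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RevisionTag:
    """One row of the separate audit table (hypothetical shape)."""
    revision_id: int
    source_url: str
    timestamp: str       # ISO 8601, UTC
    grok_score: float    # assumed to lie between 0 and 1
    attestation_id: str


def make_tag(revision_id: int, source_url: str, grok_score: float,
             attestation_id: str, now: Optional[datetime] = None) -> RevisionTag:
    """Create a tag, rejecting edits that lack a source or an attestation."""
    if not source_url or not attestation_id:
        raise ValueError("source_url and attestation_id are mandatory")
    if not 0.0 <= grok_score <= 1.0:
        raise ValueError("grok_score must be between 0 and 1")
    stamp = now or datetime.now(timezone.utc)
    return RevisionTag(revision_id, source_url, stamp.isoformat(),
                       grok_score, attestation_id)
```

Freezing the dataclass means a tag cannot be altered after it is written, which suits an audit trail.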
This setup is intended to let OpenGov Encyclopedia start small, grow through monitored and human-attested updates, and stay accurate without competing with agency sites or USA.gov: it only adds context and connections that drive users back to the official sources.