MPowerUP Guardian AI

A founding design principle and long-term capability roadmap.

Validation status: [HYPOTHESIS] — The four pillars are design commitments. The capabilities below are intended behaviors, not implemented or tested features. Claims about protective outcomes (reduced exploitation, improved service navigation) are untested theories. See Red Team Analysis for challenges.

Assumptions We're Betting On

These assumptions are load-bearing. If any one of them is wrong, the Guardian AI as specified fails outright; it cannot simply be adjusted.

Assumption	Status	What would falsify it
Users will invoke the agent (consent-first means passive protection doesn't exist)	`[HYPOTHESIS]`	Pilot shows < 20% of users ever tap "Ask Guardian" in a real-stress scenario
Plain-language translation at 6th-grade level is sufficient for the target population	`[HYPOTHESIS]`	User testing shows participants still misunderstand key benefit/legal content after translation
Scam detection can identify threats without false positives that flag legitimate offers	`[HYPOTHESIS]`	Evaluation on a labeled test set from this population shows legitimate offers being flagged. Not yet testable: no heuristics are defined and no such dataset exists
Ollama runs acceptably on budget Android ($50–150) with partial connectivity	`[HYPOTHESIS]`	Benchmarks on target hardware show the model cannot run within the device's memory and battery limits. Benchmarks not yet performed
The Guardian AI increases trust in MPowerUP rather than creating dependency or anxiety	`[HYPOTHESIS]`	Qualitative user research with pilot participants finds dependency or anxiety rather than increased trust
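
As a minimal sketch of how the first falsification criterion could be checked against pilot data, the snippet below computes the share of pilot users who ever tapped "Ask Guardian" and compares it with the 20% threshold from the table. The event name `ask_guardian_tap`, the `(user_id, event_name)` log shape, and the function names are assumptions made for illustration; no MPowerUP analytics schema exists yet.

```python
from typing import Iterable, Tuple

# From the table: fewer than 20% of users invoking the agent falsifies the assumption.
INVOCATION_THRESHOLD = 0.20


def invocation_rate(pilot_users: Iterable[str], events: Iterable[Tuple[str, str]]) -> float:
    """Share of pilot users with at least one 'ask_guardian_tap' event.

    The event name and the (user_id, event_name) tuple shape are hypothetical.
    """
    users = set(pilot_users)
    if not users:
        raise ValueError("pilot_users must not be empty")
    invoked = {user for user, name in events if name == "ask_guardian_tap" and user in users}
    return len(invoked) / len(users)


def assumption_falsified(rate: float) -> bool:
    """True when the invocation rate falls below the 20% threshold."""
    return rate < INVOCATION_THRESHOLD


# Toy data: 10 pilot users, only one of whom ever tapped the button.
users = ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10"]
events = [("u1", "ask_guardian_tap"), ("u1", "ask_guardian_tap"), ("u2", "open_circle")]
rate = invocation_rate(users, events)
print(rate, assumption_falsified(rate))
```

Counting distinct users rather than taps matters here: one heavy user tapping many times should not mask the fact that most users never invoke the agent.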

Vision

MPowerUP's users — people navigating recovery, reentry, houselessness, and poverty — are disproportionately targeted by exploitative tech, algorithmic discrimination, and digital complexity. Confusing ToS agreements, predatory offers inside messaging apps, opaque government benefit systems, and scam networks all extract value from people who can least afford to lose it.

The Guardian AI is MPowerUP's answer: an AI agent designed not to monetize users, but to protect them. It acts as a personal advocate — translating complexity into plain language, flagging threats before they land, and navigating the bureaucratic maze on the user's behalf.

This is not a chatbot. It is a superhero in the user's pocket.

Four Pillars

These principles govern every Guardian AI design decision — and inform the broader MPowerUP product, even before the AI layer exists.

Pillar	What it means
Consent-first	The user invites the agent. It never runs passively, never reports back to third parties, never initiates a conversation without a prompt.
Offline-resilient	Works on budget Android with partial connectivity. On-device inference preferred; cloud fallback only when explicitly accepted.
Privacy-preserving	No surveillance. Data stays on-device or within the user's Circle. The guardian never becomes a data collection tool.
Plain-language	Every output written at a 6th-grade reading level. No jargon. Concrete next steps, not abstract summaries.
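
One way the plain-language pillar could be made checkable is an automated readability gate on every output. The sketch below uses the Flesch-Kincaid grade formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59, with a rough vowel-group syllable counter. Flesch-Kincaid is a choice made for this example, not a metric the design has committed to. The function names are illustrative, and the syllable heuristic is only approximate, so a real gate would still need human review.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. school grade level of the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def meets_reading_level(text: str, max_grade: float = 6.0) -> bool:
    """True when the estimated grade is at or below the 6th-grade target."""
    return flesch_kincaid_grade(text) <= max_grade


plain = "Your food help ends on May 1. Call this number to keep it."
dense = ("Recertification documentation must be submitted to the administering "
         "agency prior to the termination of eligibility.")
print(meets_reading_level(plain), meets_reading_level(dense))
```

A readability score measures sentence and word length only. It cannot tell whether the user understood the content, which is the question the assumptions table leaves to user testing.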

Capabilities

Capability	What it does
Plain-language translator	Converts benefits letters, legal notices, app ToS, and government forms into plain English with actionable next steps
Scam detector	Flags suspicious messages, links, and requests inside Circles before the user acts on them
Service navigator	Finds shelter beds, food banks, legal aid, and medical clinics within the user's area — tonight, not in theory
Token advocate	Explains MPWR earnings, redemption options, and tax implications in language users can act on
Digital literacy tutor	Explains what app permissions, data requests, and consent dialogs actually mean before the user agrees
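
To make the service navigator row concrete, here is a hedged sketch of how a nearby-services lookup might be phrased against OpenStreetMap data via the Overpass API. It only builds the Overpass QL query string and checks a freshness cutoff; it does not send a request. The `amenity=social_facility` tagging with `social_facility=shelter` or `social_facility=food_bank` follows common OSM practice. The `last_verified` timestamp and the 24-hour cutoff are placeholders invented for this example, not design decisions.

```python
from datetime import datetime, timedelta, timezone

# Main public Overpass endpoint; a production deployment would likely need its own instance.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_query(lat: float, lon: float, radius_m: int, facility: str) -> str:
    """Overpass QL for OSM social facilities (e.g. 'shelter', 'food_bank') near a point."""
    return (
        "[out:json][timeout:25];"
        f'nwr["amenity"="social_facility"]["social_facility"="{facility}"]'
        f"(around:{radius_m},{lat},{lon});"
        "out center;"
    )


def is_too_stale(last_verified: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when a listing was last verified longer ago than max_age.

    'last_verified' is a hypothetical field: OSM records when an element was
    last edited, not when someone confirmed that beds or hours are current.
    """
    return now - last_verified > max_age


query = build_overpass_query(45.52, -122.68, 3000, "shelter")
now = datetime(2025, 1, 2, 20, 0, tzinfo=timezone.utc)
verified = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)
print(query)
print(is_too_stale(verified, now, max_age=timedelta(hours=24)))  # placeholder cutoff
```

Keeping the staleness check separate from the query makes the cutoff an explicit, tunable parameter rather than something buried in the lookup code.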

Phased Implementation

The Guardian AI is a founding principle, not a Day 1 feature. The philosophy shapes every design decision now. The live agent deploys when MPowerUP has real users and real data to learn from.

Phase	Milestone	Guardian AI Work
Phase 1–3	Core app, hardening	Design language locked in — consent-first, offline-resilient, privacy-preserving applied to all non-AI features
Phase 3.5	MPWR token launch	Token advocate concept validated by MPWR UX — plain-language earnings/redemption flows
Phase 4.5	Guardian AI v1	First live agent: Claude API (cloud) + Ollama offline fallback. Plain-language translator + scam detector inside Circles
Phase 5+	Guardian AI v2	On-device inference, MCP tool integrations (government portals, service APIs), multi-step agentic workflows

Technical Approach

Phase 4.5: Cloud-First with Offline Fallback

Claude API (Anthropic) — primary inference for plain-language translation, scam detection, service navigation
Ollama — local model fallback for offline or low-connectivity scenarios (same pattern as RlivN)
User controls which backend is active; default is on-device when available
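
The backend rules above, read together with the consent-first and offline-resilient pillars, can be sketched as a small pure function. This is one possible reading, not a specified design. The `Backend` enum, the parameter names, and the choice to return `None` instead of silently switching backends are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class Backend(Enum):
    ON_DEVICE = "ollama"      # local model
    CLOUD = "claude_api"      # cloud inference


def choose_backend(
    user_choice: Optional[Backend],
    on_device_available: bool,
    online: bool,
    cloud_accepted: bool,
) -> Optional[Backend]:
    """Pick a backend, or return None when nothing permitted is usable.

    Cloud is only ever returned when the device is online AND the user has
    explicitly accepted cloud processing.
    """
    cloud_ok = online and cloud_accepted
    if user_choice is Backend.CLOUD:
        return Backend.CLOUD if cloud_ok else None
    if user_choice is Backend.ON_DEVICE:
        return Backend.ON_DEVICE if on_device_available else None
    # No explicit choice: default to on-device when available.
    if on_device_available:
        return Backend.ON_DEVICE
    return Backend.CLOUD if cloud_ok else None


print(choose_backend(None, on_device_available=True, online=True, cloud_accepted=True))
print(choose_backend(None, on_device_available=False, online=True, cloud_accepted=False))
```

Returning `None` forces the caller to tell the user that no backend is available. A silent fallback to the cloud would send data off the device without the explicit acceptance the pillars require.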

Phase 5+: On-Device and Agentic

On-device small model (quantized, ARM-optimized) for core capabilities with no data leaving the device
Model Context Protocol (MCP) tools for multi-step workflows: filling benefit applications, querying government service APIs, navigating portals on the user's behalf
Agent harness options: Anthropic Managed Agents SDK, or headless TypeScript agent framework (e.g. withastro/flue) for Cloudflare Workers / Node.js orchestration
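
As a rough illustration of what an MCP tool in a multi-step workflow might look like, the sketch below declares a tool in the shape MCP uses for tool definitions (a `name`, a `description`, and a JSON Schema `inputSchema`) and wraps each call in a consent check. The tool `find_shelter_beds`, its fields, and the `confirm` and `call` callbacks are hypothetical. No such tool or harness exists yet, and no MCP SDK is used here.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical tool descriptor in the MCP tool-definition shape.
FIND_SHELTER_BEDS_TOOL: Dict[str, Any] = {
    "name": "find_shelter_beds",
    "description": "Look up shelters with open beds near a location.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "latitude": {"type": "number"},
            "longitude": {"type": "number"},
            "radius_m": {"type": "integer"},
        },
        "required": ["latitude", "longitude"],
    },
}


def missing_required(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Names of required inputs that the arguments do not supply."""
    return [key for key in tool["inputSchema"].get("required", []) if key not in arguments]


def run_step(
    tool: Dict[str, Any],
    arguments: Dict[str, Any],
    confirm: Callable[[str], bool],
    call: Callable[[str, Dict[str, Any]], Any],
) -> Optional[Any]:
    """Run one workflow step only after the user confirms it (consent-first)."""
    missing = missing_required(tool, arguments)
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    if not confirm(f"Run {tool['name']}?"):
        return None  # user declined; nothing is sent anywhere
    return call(tool["name"], arguments)


# Stubs stand in for the UI prompt and the real tool transport.
result = run_step(
    FIND_SHELTER_BEDS_TOOL,
    {"latitude": 45.52, "longitude": -122.68},
    confirm=lambda prompt: True,
    call=lambda name, args: {"tool": name, "args": args},
)
print(result)
```

Asking for confirmation per step rather than once per workflow is deliberate in this sketch: it keeps each action on the user's behalf visible, at the cost of more prompts.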

What Guardian AI is NOT

Not a general-purpose chatbot
Not a data collection layer
Not a replacement for human facilitators inside Circles
Not a feature that is gated behind a paid tier

Why This Matters for Grant Strategy

Documenting the Guardian AI as a founding principle now — even before implementation — strengthens MPowerUP's position with mission-aligned funders:

NLnet / NGI Zero Commons Fund — explicitly values privacy-preserving, consent-first AI for underserved populations
Mozilla Foundation Democracy × AI Cohort — focuses on AI that protects rather than extracts from vulnerable users
USDC / DOJ Second Chance Act — reentry-focused; a scam detector and service navigator are direct program outcomes

Grant applications can cite Guardian AI as evidence of intentional design, not an afterthought.

Known Unknowns

Things we know we don't know yet. None of these blocks Phase 1–3, but all must be resolved before Phase 4.5 development begins.

Scam detection heuristics: What signals define a "scam" inside a Circle? Who updates the model when new scam patterns emerge?
Facilitator conflict: What happens when the Guardian AI flags a message from a Circle facilitator? Does the user trust the agent or the facilitator? No design decision exists.
Offline model size/performance: Which Ollama model, at what quantization, runs within the memory and battery constraints of a budget Android device?
Integration point in UX: How does the user invoke the agent? In-message button? Separate tab? Long-press on a suspicious message? No UX spec exists.
Service data accuracy: The service navigator depends on OpenStreetMap/Overpass API data. Shelter beds and food bank hours change daily. How stale is "too stale" for a person who is hungry tonight?
Conflict resolution: If the agent says "this offer looks suspicious" and the user disagrees, what happens? The agent has no authority to block anything, yet a user who heeds a false positive may pass up real help.
The automation bias trap: Georgetown CSET (2024) documents that automated safety tools cause users to treat silence (no flag) as a safety clearance — and high system accuracy increases this effect. A user who has seen Guardian AI flag external threats correctly will develop a heuristic that facilitator silence means safety. This is exactly wrong for insider threats. The UX must explicitly communicate what Guardian AI does not watch for, at equal prominence to what it does. A facilitator report mechanism is the primary safety layer for insider threats — Guardian AI cannot substitute for it.
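
One way to build the "equal prominence" requirement into the data model is to make every scan result carry its own scope disclosure, so the UI cannot render a result without also stating what was not checked. The sketch below is an assumption about how that might look. The `ScanResult` type, the wording of the messages, and the example scope strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScanResult:
    flags: Tuple[str, ...]        # warning signs found (may be empty)
    checked: Tuple[str, ...]      # what this scan looked at
    not_checked: Tuple[str, ...]  # what this scan cannot see

    def __post_init__(self) -> None:
        # A result with no stated blind spots is rejected outright.
        if not self.not_checked:
            raise ValueError("not_checked must list at least one limitation")


def render(result: ScanResult) -> str:
    """Plain-text rendering that gives limits the same weight as findings."""
    lines: List[str] = []
    if result.flags:
        lines.append("Warning signs found:")
        lines += [f"- {flag}" for flag in result.flags]
    else:
        lines.append("No warning signs found. This does not mean it is safe.")
    lines.append("What I checked:")
    lines += [f"- {item}" for item in result.checked]
    lines.append("What I did NOT check:")
    lines += [f"- {item}" for item in result.not_checked]
    lines.append("If someone in your Circle worries you, use the facilitator report option.")
    return "\n".join(lines)


quiet = ScanResult(
    flags=(),
    checked=("links in this message",),                 # placeholder scope text
    not_checked=("harm from people you already trust",),  # placeholder scope text
)
print(render(quiet))
```

The empty-flags branch is the important one: it never lets silence stand alone, which is the failure mode the automation bias finding describes.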

Cross-cutting design question (not yet asked anywhere in the vault):

The five risk areas documented in the Red Team Analysis — SSI benefit cliff harm, Guardian AI automation bias, facilitator predation, P2P delivery failure, and token pricing erosion of mutual aid — all concentrate harm on the same users at the same moments: highest cognitive load, highest trust in the system, lowest ability to recognize or recover from harm.

This is not five independent risks. It is a compounding profile. The design question it raises has not been answered:

What does MPowerUP look like if it is designed specifically to fail gracefully under cognitive overload, rather than assuming users will engage protective mechanisms correctly under stress?

Answering this before Phase 3.5 would change the product significantly. It is the most important open design question in the vault.

MPowerUP Token Economy (MPWR) — token advocate capability is grounded here
MPWR Research & Feasibility
MPowerUP Supporting Research — prosocial behavior and vulnerable populations research base
RlivN — shares the Claude API + Ollama offline pattern