Red Team: MPowerUP Adversarial Analysis

Purpose: Challenge the core theories and assumptions behind MPowerUP and MPWR before they become load-bearing in product, legal, or grant decisions. Every section below is a genuine counter-argument, not a strawman. Where the challenge is solvable, a mitigation path is noted. Where it is not yet solvable, the honest answer is: we don't know.

Status of this doc: [HYPOTHESIS] — these are the best challenges we can mount with available evidence. They are not predictions of failure. They are required inputs before claims in the vault docs can be marked [VALIDATED].

How to use: For each challenge, either (a) produce evidence that changes the analysis, (b) document a design decision that mitigates the risk, or (c) acknowledge it as an open bet and define the test that would resolve it.

Challenge 1: Tokenizing Mutual Aid May Destroy It

The theory being challenged: MPWR earns tokens for acts of mutual aid. This will increase participation and economic stability for participants.

The counter-evidence:

Gneezy & Rustichini (2000), "Pay Enough or Don't Pay at All," documented that introducing payment for a previously norm-governed behavior reduced effort when the payment was too small to be economically meaningful. Their companion paper from the same year, "A Fine Is a Price," is the canonical example: day-care centers that introduced a small fine for late pickup saw late pickups increase, because the fine turned an obligation into a purchasable service. Titmuss (1970) argued, comparing the UK's voluntary blood donation system with the partly commercial US system, that paying for blood crowds out voluntary donation: a price converts a social norm into a market transaction, and market logic ("is this worth my time?") replaces social logic ("this is what decent people do").

Participants earn 1 MPWR for creating a help request and 3 MPWR for responding to one. In Year 1, when the backing pool is near zero (BNI LLC revenue is pre-launch), these tokens may have very low USD value. If 500 users are active and the pool is $500, MPWR = $0.001 each (a figure that implies roughly 500,000 MPWR in circulation). Responding to a help request earns $0.003. That is the price of an act of care, and it is lower than the psychic reward of doing it for free.

The risk: MPWR retroactively prices mutual aid at $0.003 per act. People who would have helped for free now see their help quantified as worth less than a text message.

What would change this analysis:

- Clear evidence that MPWR rates will be meaningful ($5–20/month for active participants) from Day 1
- A minimum viable backing pool (e.g., $50K) committed before Phase 3.5 launches
- OR: a design change where tokens have no USD value until the pool reaches a threshold that makes them meaningful, so the system is invisible until it is material

The honest position: We are applying contingency management theory (from clinical addiction treatment) to community mutual aid. These are different settings with different motivational dynamics. No peer-reviewed study validates this application. This is a hypothesis.

Challenge 2: MPWR Cash Income May Actively Harm Participants

The theory being challenged: MPWR cash redemption empowers participants financially. The in-app disclosure + WIPA partnership adequately addresses the benefits impact.

The counter-evidence:

SSI recipients receive up to approximately $943/month (the 2024 federal benefit rate; the rate rises with each annual cost-of-living adjustment, so confirm the current figure with SSA). The SSI earned income formula: for a recipient with no other income, the first $85/month is excluded ($20 general exclusion plus $65 earned income exclusion), and the benefit falls by $0.50 for every $1 earned above that. A participant who redeems $400 in MPWR during a month has their SSI reduced by ($400 − $85) ÷ 2 = $157.50 for that month. Net gain: $242.50. But:

- This calculation requires the participant to understand and anticipate the offset before redeeming
- People experiencing poverty, addiction recovery, or reentry face elevated cognitive load, bureaucratic complexity, and planning horizon constraints (present bias is well-documented in this population)
- An in-app disclosure is not benefits counseling
- WIPA partnership is not established and is blocked on funding, so Phase 3.5 would launch without it

The more serious scenario: a participant redeems $400 to cover rent. The $157.50 reduction does not arrive with the redemption. It lands in a later benefit check, after the benefits agency processes the earnings report, and late reporting can turn it into an overpayment that is recovered over several months. The participant is short in those later months, and the $400 is already spent. The MPWR redemption created a short-term windfall followed by a structural hole.
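The offset arithmetic can be checked with a short sketch. It assumes the redemption is treated as earned income and that the recipient has no other income, so both the $20 general exclusion and the $65 earned income exclusion apply to it. The function and parameter names are illustrative, not part of any MPowerUP spec or SSA tool.

```python
def ssi_reduction(earned: float,
                  general_exclusion: float = 20.0,
                  earned_exclusion: float = 65.0) -> float:
    """Monthly SSI reduction in USD for a given amount of earned income.

    Assumption: no other income, so both exclusions apply to `earned`.
    """
    countable = max(0.0, earned - general_exclusion - earned_exclusion)
    return countable / 2  # benefit falls $0.50 per countable dollar


def net_gain(earned: float, **exclusions: float) -> float:
    """What the participant keeps after the later SSI reduction."""
    return earned - ssi_reduction(earned, **exclusions)
```

Under the $85 exclusion, `ssi_reduction(400)` is 157.5 and `net_gain(400)` is 242.5. A reduction near $168 results only if the $20 general exclusion has already been used by other income (`ssi_reduction(400, general_exclusion=0.0)` is 167.5), which shows how sensitive the outcome is to each participant's circumstances.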

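ABLE accounts are an underexplored mitigation path. They are tax-advantaged accounts with an annual contribution limit tied to the federal gift tax exclusion ($18,000 in 2024), and the first $100,000 held in one is excluded from the SSI resource test. This is not mentioned in the vault docs. An MPWR redemption routed into an ABLE account might keep a participant under the SSI resource limit while building savings. Two caveats need checking with a benefits counselor. First, an ABLE account protects savings from the resource test, but the redemption itself may still count as income in the month it is received. Second, ABLE eligibility has its own requirements (disability onset before age 26, rising to age 46 from January 1, 2026 under the ABLE Age Adjustment Act) that may exclude many participants.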

What would change this analysis:

- A WIPA partner engaged before Phase 3.5 development begins (not just before launch)
- A documented decision: will MPWR redemption ever be routed directly to ABLE accounts?
- A pilot with 5–10 participants who are SSI recipients, with benefits counseling, before any broader launch
- OR: a design decision to cap monthly MPWR redemption below the SSI threshold ($85) to prevent benefit reduction, and to be transparent about this constraint

The honest position: The SSI/SNAP benefits impact is the single highest-stakes unresolved issue in MPowerUP. "Warning label" mitigations are insufficient for a financial instrument targeting SSI recipients. This requires a human expert partner, not documentation.

Challenge 3: DeFi Staking Is Not a Safe Financial Instrument for Vulnerable People's Money

The theory being challenged: The MPowerUP Community Fund deploys pooled participant contributions to DeFi staking protocols (Aave V3, Morpho Blue, Compound), generating 3–8% APY. This yield funds community growth and individual stake returns.

The counter-evidence:

USDC depeg risk: In March 2023, Circle disclosed $3.3 billion in SVB exposure. USDC dropped to $0.87 briefly before the US Treasury backstop. A 13% decline in Community Fund value overnight would affect real people who earned it through real acts of care. Stablecoin depeg is a documented and recurring event, not a tail risk.

DeFi yield compression: In 2022, Aave V2 USDC yields fell below 1% APY as DeFi activity contracted. The 3–8% figure is current market rate (May 2026). Participants who join in a bull period, invest their 30%, and then watch yield collapse to 0.5% in a bear market have had their Community Fund contributions impaired. They have no recourse and may not understand what happened.

Smart contract risk: The ERC-4626 Community Fund vault is noted as "unaudited; needs security review before mainnet." DeFi hacks have drained billions from audited systems. The Nomad bridge (about $190M) was exploited after an audit, and the Ronin network (about $625M) was drained through compromised validator keys, a failure that no contract audit would have caught. An unaudited vault holding pooled money from people experiencing poverty is not only a technology risk; it is an ethics risk. A hack could destroy years of accumulated community wealth for people who have no financial safety net.

The 10% BNI LLC revenue backing depends on BNI LLC existing: If BNI LLC fails before Phase B, the backing pool depletes. MPWR becomes worthless. There is no fiduciary trust structure, reserve requirement, or participant recourse documented. This is not a business risk — it is an obligation to vulnerable people who have earned something of value through real work.

What would change this analysis:

- A separate legal entity (trust or LLC) holding the Community Fund before it exceeds $10K (documented in vault docs but not yet done)
- A hard decision on DeFi yield: is this appropriate for Year 1? Could Phase A Community Fund simply hold USDC (0% yield, but no yield risk) until the program proves itself?
- Smart contract audit from a credentialed firm before any real funds enter the vault
- Insurance coverage (Nexus Mutual, InsurAce) documented as a budget line item
- Explicit documentation of what happens to participant stakes if BNI LLC fails

The honest position: DeFi staking introduces financial instrument risk into an app serving people with no financial margin. The yield upside is modest (3–8% on $300 annual contribution = $9–24/year). The downside risks are catastrophic. The risk-reward here deserves a harder look.

Challenge 4: P2P WebRTC May Not Be Reliable Enough for Crisis Use Cases

The theory being challenged: The libp2p WebRTC architecture, with circuit relay fallback, provides reliable enough connectivity for help request delivery — including critical (1-hour) severity.

The counter-evidence:

Budget Android devices ($50–150), common in this demographic, have documented WebRTC performance issues: inconsistent NAT traversal, limited RAM (2–3GB) causing browser tab eviction, aggressive battery management killing background processes, and weak CPUs struggling with DTLS encryption (WebRTC data channels run over DTLS; SRTP applies only to audio and video streams).

The circuit relay is a correct engineering fallback, but relay bandwidth is finite. In a scenario with 100 active users in a geographic area, relay congestion could delay message delivery. A critical help request (1-hour expiry, food/medical/safety) that doesn't deliver in time is not a UX failure — it is a safety failure.

"Offline-first" via Yjs CRDT means: messages are queued locally and sync when connectivity is restored. This is excellent for routine help requests. For critical requests, the user may need to know now whether anyone saw their message. Yjs CRDT cannot confirm delivery — it can only confirm eventual sync. There is no delivery acknowledgment mechanism documented.

No device testing on actual target hardware (budget Android, 2G/3G connectivity) is documented. All development has likely occurred on developer-grade hardware with reliable WiFi.

What would change this analysis:

- Field testing on $50–150 Android devices with spotty connectivity
- A documented decision on critical request delivery: is there a fallback notification (SMS? push?) for critical severity when P2P delivery is uncertain?
- Latency benchmarks for WebRTC connection establishment on budget Android
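One possible shape for the missing delivery acknowledgment is sketched below: each critical request is tracked until some peer acknowledges it, and any request still unacknowledged after a deadline is handed to a fallback channel exactly once. This is an illustration of the design question, not the documented architecture. The class names, the 120-second default, and the idea that peers send explicit acks are all assumptions, and the fallback channel itself (SMS, push) is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRequest:
    request_id: str
    sent_at: float                      # seconds on any monotonic clock
    acked_by: set[str] = field(default_factory=set)
    escalated: bool = False             # True once handed to the fallback


class CriticalDeliveryTracker:
    """Tracks critical requests until a peer acks or a fallback is due."""

    def __init__(self, ack_deadline_s: float = 120.0) -> None:
        # Hypothetical default; the real deadline is a product decision.
        self.ack_deadline_s = ack_deadline_s
        self.pending: dict[str, PendingRequest] = {}

    def send(self, request_id: str, now: float) -> None:
        self.pending[request_id] = PendingRequest(request_id, now)

    def on_ack(self, request_id: str, peer_id: str) -> None:
        req = self.pending.get(request_id)
        if req is not None:
            req.acked_by.add(peer_id)

    def due_for_fallback(self, now: float) -> list[str]:
        """Ids of unacknowledged requests past the deadline (each returned once)."""
        due = []
        for req in self.pending.values():
            overdue = now - req.sent_at >= self.ack_deadline_s
            if overdue and not req.acked_by and not req.escalated:
                req.escalated = True
                due.append(req.request_id)
        return due
```

The caller would poll `due_for_fallback` on a timer and route each returned id to whatever fallback is chosen. The sketch also gives the sender something concrete to show: "seen by N peers" versus "not yet delivered, fallback sent."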

The honest position: This is a testable, solvable engineering problem. The concern is that it hasn't been tested yet in the target hardware environment, and the failure mode (delayed critical help request) is a safety risk, not just a UX risk.

Challenge 5: Circle Facilitator Power Is Unguarded

The theory being challenged: Facilitators are trusted community leaders who create safe Circles. The model follows existing mutual aid network patterns.

The counter-evidence:

In communities with histories of coercive authority (criminal justice, rehabilitation programs, shelter systems), the facilitator role can be captured by predatory actors. A facilitator has the power to invite members, remove members, and moderate content, and in Phase 4.5+ the Guardian AI scam detector operates at the Circle level. A predatory facilitator can:

- Create a Circle that appears to be a mutual aid group but is actually a recruitment funnel for a scam or predatory service
- Remove members who question or warn others
- Disable or bypass the Guardian AI if it is user-controlled (consent-first means the user/facilitator decides)

The Guardian AI's scam detector is designed to protect users from outside threats. It doesn't protect against the facilitator being the threat. The "consent-first" architecture, which is correct for privacy, creates a blind spot here: if a facilitator has users' trust, they also have users' consent to run the app.

There is no documented mechanism for Circle members to report a facilitator, escalate to BNI, or exit a Circle without the facilitator knowing. There is no audit trail accessible to participants outside the Circle.

What would change this analysis:

- A documented "Circle safety" feature: member-initiated reports on facilitators, visible to BNI (not the facilitator)
- An explicit design decision: does BNI have any moderation role, and if so, how does that interact with the "no central server reading content" promise?
- Guidance in onboarding for how to recognize a predatory Circle

The honest position: This is a design question, not a hypothesis. It needs a decision. The privacy architecture creates a legitimate tradeoff with safety from internal threats. That tradeoff should be documented explicitly.

Challenge 6: Identity Recovery After Device Loss Is Undocumented

The theory being challenged: did:key identity — a keypair in expo-secure-store — provides sovereign, privacy-preserving identity with no central server.

The counter-evidence:

Device loss, theft, and confiscation are not edge cases for this user population — they are common events. Formerly incarcerated users face device confiscation during arrest, parole searches, and incarceration intake. People experiencing houselessness face high rates of theft.

With did:key identity, losing your device means losing:

- All Circle memberships
- All message history (encrypted locally, unreachable without the key)
- All MPWR balance and accumulated stakes (in Phase A, the ledger is BNI-managed. Is the balance tied to DID? Can BNI restore it?)
- All help request history
- All contact relationships (QR codes exchanged, peer DIDs)

There is no documented recovery mechanism. Did Phase 2's did:peer upgrade address this? There is no mention of backup, seed phrase, or recovery in any vault doc or CLAUDE.md.

For a user who earns $200 in MPWR over six months and then loses their device, the question is: is that $200 gone? If yes, MPWR creates a catastrophic single point of failure for people with the least ability to absorb it.

What would change this analysis:

- A documented identity recovery mechanism (seed phrase backup, social recovery, BNI-custodied Phase A recovery)
- In Phase A, if BNI manages the MPWR ledger, BNI can link balances to DID and restore them. Is this the plan?
- A Phase B on-chain identity that persists across device changes

The honest position: This is a critical missing spec. It's not a hypothesis — it's a gap. It should be resolved in Phase 2 or Phase 3, before Phase 3.5 creates financial stakes tied to the same vulnerable identity.

Challenge 7: MPWR Value May Be Illusory in Year 1

The theory being challenged: MPWR is "a financial instrument backed by real revenue." Earning MPWR creates real economic opportunity.

The counter-evidence:

MPWR/USD rate = Backing Pool ÷ circulating supply.

In Year 1:

- BNI LLC has no revenue (pre-launch)
- Sponsor contributions are speculative
- If 500 users earn an average of 20 MPWR/month, circulating supply after 6 months = 60,000 MPWR
- If the backing pool is $0 (no revenue yet), MPWR = $0.00

More realistic: BNI secures a $10,000 sponsor contribution for the backing pool at launch.

- 60,000 MPWR circulating
- MPWR rate = $10,000 / 60,000 ≈ $0.17/MPWR
- Facilitating a Circle (5 MPWR) earns about $0.85/month

This is not financial empowerment. It is a token program with symbolic value.

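For MPWR to pay a meaningful amount ($50–100/month for a highly active participant), the backing pool needs to be $300K–$600K with 500 users. That range corresponds to 500 users × $50–100/month × 12 months, so it is an upper bound that assumes every user redeems at the highly active level for a full year. Even a fraction of it requires BNI LLC revenue or grants at a scale that does not yet exist.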
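The Year 1 numbers in this challenge follow from the rate formula and can be reproduced with a few lines. This is a sketch for checking the arithmetic, not a model of the real ledger. The function names are illustrative, and `pool_needed` encodes an assumption about how the $300K–$600K range was derived (every user redeeming at the target level for twelve months).

```python
def circulating_supply(users: int, mpwr_per_user_month: float, months: int) -> float:
    """Total MPWR issued if every user earns the monthly average."""
    return users * mpwr_per_user_month * months


def mpwr_rate(backing_pool_usd: float, supply: float) -> float:
    """USD per MPWR = backing pool / circulating supply (0 if nothing circulates)."""
    return backing_pool_usd / supply if supply > 0 else 0.0


def pool_needed(users: int, usd_per_user_month: float, months: int = 12) -> float:
    """Backing pool required to pay every user a target monthly amount."""
    return users * usd_per_user_month * months
```

With 500 users earning 20 MPWR/month for 6 months, supply is 60,000 MPWR; a $10,000 pool gives about $0.17 per MPWR, and a $0 pool gives $0.00. `pool_needed(500, 50)` and `pool_needed(500, 100)` give $300,000 and $600,000.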

What would change this analysis:

- A published minimum viable backing pool target before Phase 3.5 launch
- A decision: should Phase 3.5 be delayed until the backing pool is large enough to make MPWR meaningful?
- OR: reframe MPWR as a community currency with future redeemability, not a current financial instrument, and update all language accordingly

The honest position: The marketing language ("financial instrument," "lasting economic opportunity," "real-backed tokens") creates expectations that Year 1 math cannot meet. This is a communications and design decision, not just a financial one.

Summary: What This Red Team Asks of the Product

| Challenge | Severity | Solvable Now? | Required Action |
| --- | --- | --- | --- |
| Motivation crowding-out | Medium | Partially | Define minimum meaningful MPWR value before launch; test with small cohort |
| SSI benefits harm | Critical | No (needs human partner) | Engage WIPA partner before Phase 3.5 development begins, not just before launch |
| DeFi financial risk | High | Partially | Hard decision on whether DeFi yield is appropriate for Year 1; audit before any real funds |
| P2P reliability for crisis | Medium | Yes (testable) | Field test on budget Android with 2G; define delivery guarantee for critical requests |
| Facilitator predation | High | Partially | Design decision needed: does BNI have a moderation role? Document the tradeoff explicitly |
| Identity recovery | Critical | Partially (Phase A) | Spec recovery mechanism before Phase 3.5 creates financial stakes on this identity |
| Year 1 MPWR value illusory | Medium | Yes (design decision) | Set minimum pool threshold; update language to match actual Year 1 economics |

None of these challenges say "abandon the mission."

The problems are real. The theory of change — that trusted community networks can support vulnerable people through hard transitions, and that economic incentives can make that sustainable — is worth testing. The red team's job is not to kill the idea. It is to make sure the idea is tested honestly, that the people it serves are protected from the ways it could harm them, and that the claims made in documentation and grant applications match what is actually validated.

Research Findings: Red Team Agent Run 1 (2026-05-08)

Status: Agent-generated. Pending human review before any finding is treated as authoritative or cited in external materials.

Method: Web search across academic literature (PubMed, Nature Human Behaviour, NBER, SSRN, IZA), government and policy sources (SSA, CDSE, Georgetown CSET), investigative journalism (Sifted, Jacobin), and primary technical documentation. Each finding addresses one of five targeted questions not fully resolved by the original red team challenges above.

Finding 1: Tokenizing Peer Mutual Aid (The Literature Is More Damaging Than the Existing Challenge States)

What the literature actually says:

The existing red team challenge cites Gneezy & Rustichini and Titmuss. The deeper literature sharpens this considerably.

Bénabou & Tirole (2006, American Economic Review, "Incentives and Prosocial Behavior") model how extrinsic rewards dilute the reputational and self-image value of a good deed: once an act is paid, neither observers nor the actor can tell whether it was done for its own sake. Their earlier work (2003, Review of Economic Studies, "Intrinsic and Extrinsic Motivation") adds that an offered incentive can signal that a task is undesirable or that the requester doubts the agent's intrinsic motivation. Both mechanisms erode the social meaning of the act, and neither is only about payment magnitude: the act of pricing sends a signal, independent of the price level. In a mutual aid context, the moment a token ledger records "you helped your neighbor: 3 MPWR," the implicit message is that helping required an incentive. This reframes the act for the participant.

A 2022 Nature Human Behaviour large-scale natural experiment (20,370 observations, ~27 million individual decisions, environmental domain) found that the aggregate supply of prosocial behavior is S-shaped in response to incentives: at low incentive levels, prosocial behavior decreases before the incentive is large enough to reverse the trend. The trough — where incentives actively suppress prosocial behavior — is precisely where MPWR will live in Year 1 given the math in Challenge 7 above.

A 2023 Journal of Behavioral and Experimental Economics paper (Asulin, Heller, Munichor, SSRN 4375009) found that non-monetary incentives (recognition, status signals) outperform monetary incentives for prosocial behavior in community settings because they reinforce rather than replace social identity. MPWR is a monetary instrument. It does not provide the non-monetary channel that appears most effective in community settings.

A key moderating variable across the literature is whether the incentive is perceived as "controlling" (externally imposing a value on a behavior) vs. "supportive" (affirming the person's existing motivation). A token system that quantifies and prices acts of care is structurally controlling, regardless of design intent.

Critically, no peer-reviewed study has tested a token economy applied to non-clinical peer mutual aid in communities experiencing recovery, reentry, or houselessness. The existing research base cited in MPowerUP Supporting Research.md is predominantly from contingency management for clinical addiction treatment (structured clinical environments with professional oversight, verified behavior change as the target, and therapeutic framing). Applying that evidence base to unstructured community mutual aid is a category error. The populations overlap; the settings do not.

How this deepens the existing challenge:

Challenge 1 in the existing red team doc says "no peer-reviewed study validates this application. This is a hypothesis." That is accurate but understates the problem. The available evidence does not merely fail to validate MPWR's theory — it provides a mechanistic explanation for why the theory is likely wrong at early-stage token values. The S-shaped response curve predicts active harm to mutual aid participation before the pool is large enough to reverse it.

What it means for MPowerUP specifically:

There is a design decision that has not been made: should tokens be invisible to participants until the backing pool reaches a minimum meaningful threshold? The existing token economy doc mentions this as an option but has not resolved it. The literature says this is not an optimization question; it is the central question. Launching MPWR at sub-meaningful values likely causes a measurable reduction in the mutual aid behavior the product depends on. This is more than speculation: it is the default prediction of the strongest available evidence on prosocial incentives in field settings, although it remains untested in this setting.

Required action: Define "minimum meaningful" as an empirical threshold based on the Gneezy & Rustichini "pay enough or don't pay at all" criterion before any participant sees a token value. The product should be designed so that MPWR is completely invisible (no balances shown, no token language used) until the backing pool can support a rate that makes the liquid 50% of active participation worth at least $20–$30/month for a participant responding to 3–5 requests/week. Below that, the token economy is not neutral; it is actively destructive.
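The threshold in the required action can be turned into a minimum token rate. The sketch below assumes that only response rewards count (3 MPWR per response, as in Challenge 1), that a month is 52/12 weeks, and that half of earnings are liquid. The function names and the visibility gate are illustrative, not an existing MPowerUP component.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks


def liquid_mpwr_per_month(responses_per_week: float,
                          mpwr_per_response: float = 3.0,
                          liquid_share: float = 0.5) -> float:
    """Liquid MPWR a participant earns per month from responses alone."""
    return responses_per_week * WEEKS_PER_MONTH * mpwr_per_response * liquid_share


def min_rate_usd(target_usd_per_month: float, responses_per_week: float) -> float:
    """Lowest USD/MPWR rate at which liquid earnings reach the target."""
    return target_usd_per_month / liquid_mpwr_per_month(responses_per_week)


def tokens_visible(pool_usd: float, supply: float,
                   target_usd_per_month: float = 20.0,
                   responses_per_week: float = 3.0) -> bool:
    """Gate: show balances only once the pool supports the target rate."""
    if supply <= 0:
        return False
    return pool_usd / supply >= min_rate_usd(target_usd_per_month, responses_per_week)
```

At 3 responses/week a participant earns about 19.5 liquid MPWR/month, so a $20/month floor needs a rate of roughly $1.03 per MPWR. Against the 60,000 MPWR supply used in Challenge 7, that implies a backing pool of about $61,500, so the $10,000 scenario there would keep tokens hidden.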

Finding 2: Comparable Programs (The Failure Modes Are Documented)

What the evidence actually says:

Samaritan app (formerly GiveSafe): Operated in Seattle. The app provided unhoused people with Bluetooth beacons that allowed passersby to donate via smartphone. Donations were restricted to approved merchant categories (food, hygiene, clothing, prescriptions) and participants were required to check in monthly with a nonprofit counselor or have their beacon deactivated. Critique from Jacobin (2019) and subsequent reporting: the app designed paternalistic spending controls into the architecture, treating the recipient as someone whose financial decisions require external gatekeeping as a condition of receiving aid. The beacon-check-in requirement created a coercive dependency between participation and continued access to earned value. This is directly analogous to MPWR's redemption architecture, where BNI controls the backing pool and the ledger. If BNI LLC fails or withdraws, participants' accumulated balances become inaccessible regardless of what they "earned." The Samaritan model shows that apps in this space tend toward paternalistic controls even when designers intend otherwise, because the app's liability architecture creates pressure to restrict what people do with money earned through the platform.

Beam (UK, homelessness social enterprise): Beam contracted with 55+ UK local councils to find housing and employment for referred homeless individuals. An investigation by Sifted (published 2024) found that of 20 councils providing data on £1.1 million in contracts: Conwy Council stopped its contract after one successful outcome in a year; Swale, Kent had two people housed on an £80K annual contract; Chichester, West Sussex had one person housed on a £47.5K annual contract; Ashford had three people still housed out of 30 contracted families after 105 one-on-one meetings. Multiple councils opted not to renew. The failure mode: tech-mediated intermediation of social services creates measurement gaps between reported outcomes and actual durable housing/employment. Beam's claimed outcomes relied on counting individuals who passed through the program, not those who achieved stable long-term outcomes. This is a cautionary pattern for any MPWR impact claim that counts "help requests resolved" as a validated outcome.

GoodDollar (crypto UBI for global poverty, 2020–present): Launched as a Web3 UBI experiment. Claimed 700K+ members, 100K monthly active users by early 2024. Three documented failure modes directly relevant to MPWR:

- Smart contract exploit, December 17, 2023: The GoodDollar Reserve smart contract was exploited due to an untrusted input vulnerability, resulting in unauthorized withdrawal of 627,328.47 cDAI and unapproved minting of 14 billion G$ tokens (more than doubling circulating supply from 6 billion pre-exploit). Approximately 1 billion G$ were liquidated through DEXes before the DAO could respond, destroying value for all legitimate holders. An unaudited or inadequately audited vault holding community mutual aid tokens is not a tail risk; it is a precedent with documented victims.
- Token liquidity and real-world exchange: Ethnographic research in Latin America (MDPI Blockchain: Research and Applications, 2025) found that GoodDollar's UBI tokens functioned as speculative instruments rather than means of exchange in most markets. Without institutional embedding (merchant acceptance networks, consumer protections, redress mechanisms) the income was nominal. Only where community partners negotiated merchant acceptance and provided financial literacy training did the program display welfare effects. MPWR faces the same gap: prepaid debit card channels and ACH transfers require BaaS infrastructure (Stripe Issuing, Sila) that is not yet established. The token has no value pathway until that infrastructure exists.
- Sybil gaming at scale: GoodDollar's Sybil resistance relies on facial verification. MPWR relies on action signing with did:key. Both are gameable; the question is cost. For MPWR, gaming requires a second device with a different DID and a willing co-conspirator. In communities with high levels of distrust, coercion, or economic desperation, this cost is lower than the design assumes.

The fintech Beam (San Francisco, 2021 FTC shutdown): An entirely separate entity — a high-yield savings app called Beam was shut down by the FTC after an investigation found it misled users about access to funds and interest rates, preventing people from withdrawing their money. While not a mutual aid app, this case establishes a direct legal risk template: if MPowerUP's redemption infrastructure fails (BaaS partner issues, BNI LLC cash flow, technical fault), the FTC's consumer protection enforcement posture toward apps that hold and restrict access to money belonging to vulnerable people is aggressive. The vesting lock on individual stakes (6 months minimum) means BNI is holding money belonging to participants for a period during which the redemption infrastructure could fail. This is not theoretical risk — the FTC Beam enforcement happened.

How this deepens the existing challenges:

The existing red team doc identifies DeFi risk, facilitator predation, and SSI harm as high/critical. These comparisons add a fourth failure mode not yet named: platform architecture creates paternalism by default, not by choice. When you hold value on behalf of vulnerable people and control the redemption infrastructure, you become a gatekeeper regardless of mission. Each comparison above ended in one of three ways: the app became paternalistic (Samaritan), failed to deliver durable outcomes at scale (Beam UK), or suffered a financial exploit that hit its most vulnerable participants (GoodDollar). MPowerUP has not documented what structural decisions prevent it from following the same path.

What it means for MPowerUP specifically:

Three specific design questions that have no current answer:

- If BNI LLC cannot fund the backing pool for two consecutive quarters, what happens to participant balances? There is no documented answer.
- If BaaS infrastructure (Stripe Issuing) fails or withdraws service, what is the redemption path? There is no documented answer.
- What covers participant losses if the Phase B vault is exploited? The GoodDollar exploit demonstrates that even a smart contract vault claimed to be audited will carry exploit risk at Phase B. The existing red team doc says "unaudited vault is an ethics risk." The GoodDollar precedent makes this concrete: the community loses 627K+ in base currency and watches its tokens become worthless in hours. An insurance line item (Nexus Mutual, InsurAce) must be a hard prerequisite for Phase B, not a recommendation.

Finding 3: SSI Earned Income (The Behavioral Evidence Is Worse Than the Existing Challenge Acknowledges)

What the literature actually says:

The existing red team doc identifies SSI benefit reduction as a risk and suggests in-app disclosure plus WIPA partnership as mitigations. The behavioral evidence says this framing is inadequate in a specific, documented way.

The SSI earned income disregard ($85/month general exclusion, then 50% benefit reduction on earnings above that) has not been updated since the program was enacted in 1972. Due to inflation, the disregard has lost nearly all of its real value over 50+ years (Niskanen Center, 2021). This means the work disincentive embedded in the SSI structure has become progressively more severe over time, not less — and no legislative reform has corrected it.

The SSA's own research (SSB Vol. 65 No. 3) using administrative file data finds "stronger evidence that the SSI program creates labor supply disincentives" than prior studies had established. Approximately 30–40% of SSDI applicants would work if not for program disincentives — a finding that likely applies directionally to SSI recipients as well. The behavioral response is documented: people near the disregard threshold modulate income to avoid losing benefits.

The New York WORKS demonstration (SSB Vol. 66 No. 2) tested a reduced offset (25 cents per dollar instead of 50 cents) and found increased work participation when the effective marginal tax rate fell. The control and treatment group comparison directly establishes that the existing 50% offset is a significant behavioral deterrent — meaning SSI recipients are actively monitoring their income relative to the threshold and making decisions accordingly.

EITC research (Behavioral Responses to Taxes: Lessons from the EITC and Labor Supply, NBER) is relevant by analogy: labor supply responses to benefit schedules concentrate at the extensive margin (entry/exit) and at kink points in the schedule. Earnings bunch near benefit thresholds rather than spreading evenly, which is the empirical signature of a behavioral response. MPWR adds a new income source that lands on the existing kink at $85/month, the same threshold where SSI already creates documented behavioral clustering.

The unaddressed compounding risk: MPWR token earnings are ordinary income (IRS Notice 2014-21). They are not earned wages in the traditional sense, but IRS treatment as income at FMV on date of receipt means they count as countable income for SSI. The existing doc notes this. What the existing doc does not note: the timing problem is structurally worse for tokens than for wages. Wages arrive in regular paychecks and can be anticipated. Token accumulation in MPWR is episodic and variable — a participant might earn 0 MPWR in week 1 and 40 MPWR in week 2 if a critical help request appears. Variable income creates variable SSI reduction, which is harder to plan around than steady-state wages. The cognitive load of managing episodic token earnings against a monthly SSI cliff is not a "disclosure" problem. It is a design problem.

What it means for MPowerUP specifically:

The $85/month disregard threshold should be treated as a hard cap on Phase A liquid redemptions, not as a threshold that triggers a "warning." The warning model assumes rational forward-looking behavior under cognitive load, which the behavioral economics literature directly contradicts for this population. Until WIPA partnership is operational and pilot participants have received benefits counseling, no SSI recipient should be permitted to redeem more than $84 of liquid MPWR per calendar month. This is not paternalism — it is harm prevention pending the establishment of expert counseling infrastructure. Pilots without this cap, however small, expose participants to the SSI cliff with no backstop.
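A hard cap of this kind is simple to express. The sketch below is a minimal illustration, assuming redemptions are tracked per calendar month in USD and that the participant has no other income using up the $85 exclusion (otherwise even $84 can reduce benefits). `SSI_SAFE_CAP` and `allowed_redemption` are hypothetical names, not part of the current ledger.

```python
from decimal import Decimal

# Assumption: $84/month stays under the $85 exclusion only when the
# participant has no other earned or unearned income that month.
SSI_SAFE_CAP = Decimal("84.00")


def allowed_redemption(requested: Decimal,
                       redeemed_this_month: Decimal,
                       cap: Decimal = SSI_SAFE_CAP) -> Decimal:
    """Amount of a requested redemption that fits under the monthly cap."""
    if requested < 0 or redeemed_this_month < 0:
        raise ValueError("amounts must be non-negative")
    remaining = max(Decimal("0"), cap - redeemed_this_month)
    return min(requested, remaining)
```

A $400 request with nothing redeemed so far is trimmed to $84.00; a $30 request after $70 already redeemed is trimmed to $14.00. Using `Decimal` avoids float rounding at the cap boundary, where a one-cent error matters.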

This is a design decision, not a recommendation: someone has to make it explicitly.
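As a minimal sketch of what the hard cap described above could look like in code, the function below trims a redemption request to whatever still fits under $84 for the current calendar month. The function name and parameters are hypothetical. How the product identifies SSI recipients and how it tracks the month-to-date total are assumed to be handled elsewhere.

```python
from decimal import Decimal

MONTHLY_CAP_USD = Decimal("84.00")  # from the text: no more than $84 per calendar month


def allowed_redemption(requested_usd, redeemed_this_month_usd, cap=MONTHLY_CAP_USD):
    """Return the portion of a redemption request that fits under the monthly cap."""
    remaining = max(Decimal("0"), cap - redeemed_this_month_usd)
    return min(requested_usd, remaining)


# A participant who has already redeemed $60 this month asks for $30 more.
print(allowed_redemption(Decimal("30.00"), Decimal("60.00")))
```

The point of the design is that the cap is enforced by the system rather than surfaced as a warning the participant has to act on.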

Finding 4: P2P WebRTC on Budget Android — The Failure Rate Is Known and It Is Not Small¶

What the literature and technical documentation actually say:

The existing red team challenge notes this is an untested engineering risk. There is now documented production evidence that narrows the uncertainty — in a direction that is not favorable.

The first large-scale measurement study of libp2p DCUtR (Direct Connection Upgrade through Relay) NAT traversal in the production IPFS network (arxiv.org/abs/2510.27500, published October 2025, covering 4.4 million traversal attempts from 85,000+ networks across 167 countries) found: the conditional success rate for NAT hole-punching is 70% ±7.1%, given that relay reservation and public address discovery have already succeeded. The overall end-to-end success rate including prerequisite steps is lower.

This means that in production, approximately 30% of direct P2P connection attempts in libp2p fail at the hole-punching stage alone and fall back to relay. That figure comes from networks of average quality. Budget Android devices on 2G/3G networks run by carriers serving low-income US markets (TracFone, Consumer Cellular, Cricket) sit behind carrier-grade NAT (CGNAT), which is more restrictive than home NAT and has lower hole-punching success rates. The libp2p measurement literature publishes no CGNAT-specific failure figure, but network engineers report that CGNAT performs substantially worse than home NAT.
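The relationship between the conditional and end-to-end figures can be sketched as simple arithmetic. The 70% rate and the ±7.1% interval come from the study cited above. The prerequisite success probability of 0.9 is a hypothetical placeholder, since the text only says the end-to-end rate is lower.

```python
P_HOLE_PUNCH = 0.70      # conditional success rate reported in the cited study
CI_HALF_WIDTH = 0.071    # reported +/- 7.1%


def end_to_end_success(p_prerequisites, p_hole_punch=P_HOLE_PUNCH):
    """Probability that the prerequisites and the hole punch both succeed."""
    return p_prerequisites * p_hole_punch


def expected_relay_fallbacks(attempts, p_hole_punch=P_HOLE_PUNCH):
    """Expected number of attempts that fail hole punching and fall back to relay."""
    return attempts * (1.0 - p_hole_punch)


low, high = P_HOLE_PUNCH - CI_HALF_WIDTH, P_HOLE_PUNCH + CI_HALF_WIDTH
print(f"conditional success range: {low:.1%} to {high:.1%}")

# Hypothetical: prerequisites (relay reservation, address discovery) succeed 90% of the time.
print(f"end-to-end success: {end_to_end_success(0.9):.1%}")
print(f"relay fallbacks per 1000 attempts: {expected_relay_fallbacks(1000):.0f}")
```

Any prerequisite success probability below 1.0 pulls the end-to-end figure under 70%, which is why the conditional rate should be read as a ceiling.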

Android-specific compounding factors documented by Android engineers and developers:

Doze mode: Android defers background network access when the device is idle. Apps not using FCM (Firebase Cloud Messaging) as their notification channel have network access blocked during Doze windows. MPowerUP's P2P architecture, if it relies on maintaining a libp2p listening socket in the background, falls squarely within what Doze restricts. A foreground service with a persistent notification can exempt the app, but that is an explicit UX trade-off: budget Android users frequently disable persistent notifications.
OEM-specific battery killers: Samsung One UI and Xiaomi MIUI (dominant in the sub-$150 Android market) add manufacturer-specific battery management layers that are more aggressive than stock Android Doze, actively killing background connections and processes. These are not edge cases — they are the dominant Android experience for the target demographic.
Relay bandwidth limits: libp2p Circuit v2 protocol limits relay connections by number, duration, and data volume per relayed peer. Under relay load from multiple users in a geographic area, these limits create message delivery failures that are silent — the sending user sees no error, and the receiving user never gets the message.

The critical gap: The 70% hole-punch success rate is a system-level average across 85,000 networks globally, weighted toward well-connected nodes. It is not a figure for CGNAT-constrained budget Android devices on 2G/3G. That figure does not exist in the published literature. This means MPowerUP's P2P reliability in the actual target environment is genuinely unknown, and the best available proxy (70% conditional success in production libp2p) is an optimistic upper bound, not a realistic estimate.

A direct-connection failure rate of roughly 30%, with delivery then depending on rate-limited relays that can fail silently, is acceptable for a peer networking experiment. It is not acceptable for a critical (1-hour expiry) safety help request from a person in housing or medical crisis.

What it means for MPowerUP specifically:

Two specific decisions are required before Phase 2 or 3, not Phase 4.5:

Critical severity help requests must have an alternative delivery channel that does not depend on P2P hole-punching. SMS fallback or FCM push notification for critical requests is not technically complex — it is a design decision that has not been made. The decision must be made before the product has users for whom a critical request means "I need food tonight" or "I need a place to sleep."
Hardware testing on a $50–100 unlocked Android device (Samsung A-series or Xiaomi Redmi, purchased from a prepaid carrier), running MPowerUP in the background with Doze restrictions active, is a prerequisite before any public beta. It is not expensive. The failure to do it is a choice, not a resource constraint.
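The first decision above can be sketched as a small routing rule. In this sketch a critical request is always sent over the fallback channel, regardless of whether P2P delivery succeeds, so it never depends on hole punching. The `send_p2p` and `send_fallback` callables are hypothetical stand-ins for whatever P2P, SMS, or FCM integration the product adopts; nothing here reflects an existing MPowerUP API.

```python
from enum import Enum


class Severity(Enum):
    ROUTINE = "routine"
    CRITICAL = "critical"


def deliver(severity, send_p2p, send_fallback):
    """Send a help request and return the list of channels that carried it.

    send_p2p() must return True only when the recipient acknowledged receipt.
    send_fallback() stands in for an SMS or push-notification send.
    """
    channels = []
    if severity is Severity.CRITICAL:
        # Critical requests always use the fallback; they never wait on P2P.
        send_fallback()
        channels.append("fallback")
    if send_p2p():
        channels.append("p2p")
    return channels


# Simulated P2P failure: the critical request still goes out, the routine one does not.
print(deliver(Severity.CRITICAL, lambda: False, lambda: None))
print(deliver(Severity.ROUTINE, lambda: False, lambda: None))
```

An empty result for a routine request is the case the sender should be told about, since silent failure is the problem the relay limits create.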

Finding 5: Guardian AI Scam Detection — The Safety Paradox Is Empirically Documented¶

What the literature actually says:

The existing red team challenge (Challenge 5) notes that the Guardian AI scam detector won't protect users if the facilitator is the threat. The deeper problem is that it may actively make users more vulnerable to insider threats by creating false confidence.

This phenomenon has a name and a documented evidence base: automation bias. Georgetown CSET (November 2024 issue brief, "AI Safety and Automation Bias") synthesizes the literature: automated safety systems lead users to disengage from active monitoring, "heuristically replacing vigilant information seeking and processing." High system reliability increases — not decreases — this effect, because users who have seen the system work correctly repeatedly become more trusting.

The mechanism for MPowerUP is specific: a user who has seen the Guardian AI correctly flag a suspicious external message will, over time, develop a heuristic: "if Guardian AI didn't flag it, it's safe." This heuristic is entirely rational given their experience. It is also exactly wrong in the case of a predatory facilitator, who operates inside the Circle and whose messages are not structurally distinguishable from legitimate facilitator communications.

The insider vs. outsider threat detection gap is confirmed by enterprise security research: systems that achieve near-complete detection of external threats have significantly higher false negative rates for insider threats (CDSE, "Artificial Intelligence and the Insider Threat," 2024). The reason is mechanical: insider threat detection requires behavioral baseline deviation, while external threat detection can use content and source signals. A facilitator who slowly builds trust and then exploits it does not deviate from their baseline — they build a new baseline before striking.
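A toy example shows why baseline-deviation detection misses a slow build-up. The detector below flags a value only when it exceeds twice the rolling mean of the previous 14 observations. The signal, the window, and the threshold are all hypothetical; this is not a description of how Guardian AI works, only an illustration of the mechanism named above.

```python
from collections import deque


def first_flag(signal, window=14, ratio=2.0):
    """Return the index of the first value above ratio x the rolling mean, else None."""
    history = deque(maxlen=window)
    for i, value in enumerate(signal):
        if len(history) == window and value > ratio * (sum(history) / window):
            return i
        history.append(value)  # the baseline adapts to whatever it has seen
    return None


baseline = [1.0] * 14
sudden = baseline + [10.0]                                    # abrupt jump
gradual = baseline + [1.05 ** day for day in range(1, 61)]    # 5% growth per day

print(first_flag(sudden))    # flagged at the jump
print(first_flag(gradual))   # never flagged
print(gradual[-1] > sudden[-1])
```

The gradual series ends at a higher level than the sudden jump, yet it is never flagged, because each step is small relative to a baseline that has been moving with it.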

Critically, the consent-first architecture amplifies this problem rather than mitigating it. Because Guardian AI is opt-in and user-controlled, a predatory facilitator who wants to neutralize it need only persuade users not to enable it, or to trust them specifically despite its warnings. The "consent-first" framing — which is correct for privacy — is a structural liability against trusted-insider threats. The user who most needs protection is the one who has given their trust to someone who is exploiting it. That user will not invoke the Guardian AI against their trusted facilitator.

Published evidence on consumer safety apps specifically (NewsBreak, 2024; security literature): "The most dangerous thing [safety apps] offer is a false sense of security — a belief that help is always a tap away, when in reality that's not guaranteed. Blind trust in technology can lead to risky behavior and poor decisions in critical moments."

This is not a hypothetical failure mode. It is the documented behavioral consequence of deploying automated safety tools in contexts where users cannot independently verify the tool's limitations.

What it means for MPowerUP specifically:

The Guardian AI doc lists under Known Unknowns: "What happens when the Guardian AI flags a message from a Circle facilitator? Does the user trust the agent or the facilitator? No design decision exists."

This is the correct question. But the evidence above says the answer is almost certainly: the user trusts the facilitator. And after months of the Guardian AI correctly flagging external threats without ever flagging the facilitator, the user's guard against the facilitator is lower than it would be if the Guardian AI didn't exist, because its silence now reads as a safety signal.

Three design decisions required before Phase 4.5:

The Guardian AI must not present silence (no flag) as implicit safety clearance. The UX must actively communicate what the Guardian AI does not watch for, including trusted-insider behavior.
The Circle member-initiated facilitator report mechanism (identified as missing in Challenge 5) is not optional — it is the primary safety mechanism for insider threats. The Guardian AI cannot substitute for it.
The Guardian AI's limitations must be documented in onboarding at an equal level of prominence to its capabilities. Presenting only capabilities in onboarding is how safety tools create the behavioral overconfidence that precedes harm.
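The first decision above can be sketched as a rule about status text: a "no flag" result never renders as a clearance, and it always names what was not checked. The enum, the wording, and the `NOT_CHECKED` list are hypothetical and would need review by whoever owns the Guardian AI UX.

```python
from enum import Enum


class ScanResult(Enum):
    FLAGGED = "flagged"
    NO_FLAG = "no_flag"


# Hypothetical list of things the scanner does not watch for.
NOT_CHECKED = (
    "whether someone you already trust, including a Circle facilitator, "
    "is acting in your interest",
)


def status_text(result):
    """Return the message shown to the user for a scan result."""
    if result is ScanResult.FLAGGED:
        return "Guardian AI flagged this message as suspicious."
    # Silence is reported as a limited finding, never as approval.
    return (
        "Guardian AI found no known scam pattern. This is not a safety clearance. "
        "It does not check: " + "; ".join(NOT_CHECKED) + "."
    )


print(status_text(ScanResult.NO_FLAG))
```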

Cross-Finding: The Compounding Problem¶

None of these five findings is independently fatal to the MPowerUP thesis. Together, they describe a compounding risk that is not yet documented anywhere in the vault:

The target user population — SSI recipients in recovery, reentry, or houselessness — is characterized by elevated cognitive load, present-bias decision-making (documented in the poverty and cognitive bandwidth literature), and high trust in programs that show up as helpful. These are not character flaws; they are documented behavioral consequences of chronic scarcity and institutional trauma.

MPWR is designed to operate in this cognitive environment. But the specific risks documented above — SSI cliff harm from episodic token income, Guardian AI false confidence from automation bias, facilitator predation through trusted-insider channels, P2P delivery failure on the specific hardware this population uses, and the token pricing signal eroding the mutual aid it's supposed to reward — all exploit the same vulnerability: they activate when the user is most stressed, most trusting, and least able to recognize or recover from a harm.

The existing red team doc ends with: "None of these challenges say 'abandon the mission.'" That remains accurate. But the compounding risk profile described above — five mechanisms that each concentrate harm on the most cognitively loaded, most trusting users at exactly the moments of highest vulnerability — is a design question that has not been asked in the vault docs yet.

The question is: what does MPowerUP look like if it is designed specifically to fail gracefully under cognitive overload, rather than assuming users will engage protections correctly under stress? Answering that question before Phase 3.5 would change the product significantly.

Agent-generated findings, 2026-05-08. Primary sources cited inline. All findings are [HYPOTHESIS] or [EMPIRICALLY VALIDATED] as marked per the epistemic honesty standing directive. No finding here has been reviewed by a domain expert (WIPA counselor, securities attorney, clinical psychologist, mobile network engineer). Human review is required before any finding is cited in a grant application, product specification, or external communication.
