Oppia serves under-resourced learners worldwide — many accessing lessons on 2G/3G connections, shared devices, or during brief windows between other responsibilities. Every loading delay, every lost progress state, and every navigation dead end carries a higher cost here than on a mainstream consumer platform. This report weights findings accordingly.
Oppia's content model and design language are genuinely good: the story-based metaphor, clean typography, and no-login browsing all work well. The platform is held back by a small number of high-impact infrastructure problems: progress never saves for any user, the homepage leads with a fundraising overlay, and the lesson player offers no visible exit. All three evaluation sources agree on these priorities, and a focused sprint could lift the blended grade to a solid B.
Authenticated user progress does not persist for any account, as confirmed independently by two AI audits and three of five human evaluators. Scaling Oppia's learner base before this is fixed will amplify drop-off proportionally. Resolve the persistence pipeline before any growth or outreach campaign.
The progress ring never updates from 0%, for guests and logged-in users alike. Both AI audits confirmed this independently (Claude H06: 68.75%; Codex H06: 50%, rated Catastrophic), as did 3 of 5 human evaluators; the HIL auditor then escalated the issue after personally verifying that it also breaks for authenticated accounts.
All three evaluation sources reached the same core conclusion. Claude (Sonnet 4.6), Codex (GPT-5.5), and the panel of five subject-matter experts each found that Oppia's content and design language are genuinely good, but that the platform's infrastructure undermines the learning experience it promises. The story-based metaphor works. The no-login exploration model is smart. The visual design is clean. But a learner who completes a lesson finds their progress ring still at zero, a new learner landing on the homepage sees a donation modal before a single sentence about what Oppia teaches, and a learner who exits the lesson player by accident finds no obvious way back in.
The highest-impact fixes (progress persistence, the donation modal, the loading screen) are engineering and product-policy decisions, not visual redesigns. They are fast to ship and carry outsized impact for Oppia's primary audience: learners in under-resourced communities, where every second of load time and every moment of lost progress matters.
The three-source blended evaluation produces a grade of B- (71.1%): a high B-minus that reflects real strengths alongside high-impact, addressable gaps. Getting to a solid B+ requires fixing only the top four issues.
| Sprint | Action |
|---|---|
| Immediate | Audit and fix authenticated user progress persistence pipeline (H06 — Critical, confirmed by all 3 sources) |
| Immediate | Add "Sign in to save progress" prompt at lesson start for guest users (H07) |
| Sprint 1 | Implement a donation-modal frequency cap (max once per 30 days) and surface a persistent nav donation link instead (H08, NEW from human panel; see the localStorage sketch below this table) |
| Sprint 1 | Add skeleton screen / branded spinner to lesson player loading state (H01) |
| Sprint 1 | Add visible "← Back to [Topic]" button in lesson player header (H03 — confirmed by all 3 sources) |
| Sprint 1 | Fix the Angular router to set a meaningful page title on every navigation event (H01; see the router sketch below this table) |
| Sprint 2 | Add an ARIA live region for SPA navigation announcements (H01/H11, Codex severity-3; covered in the router sketch below) |
| Sprint 2 | Add "Continue where you left off" widget on homepage and classrooms (H07) |
| Sprint 2 | Add visible :active CSS state to all buttons and interactive elements (H08 — human consensus) |
| Sprint 2 | Standardize vocabulary: one term set across nav, headings, breadcrumbs, URLs (H04) |
| Sprint 3 | Fix H1 heading hierarchy across all classroom page templates (H11) |
| Sprint 3 | Persist cookie consent in localStorage; add a decline / manage-preferences option (H08; same pattern as the modal-cap sketch below this table) |
| Sprint 3 | Run full axe-core + manual screen-reader audit — surface audit covers visible issues only (H11) |
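To make the H08 frequency-cap row concrete, here is a minimal TypeScript sketch assuming a client-side gate keyed on a localStorage timestamp. The key and function names are hypothetical, not taken from the Oppia codebase.

```typescript
// Hypothetical gate for the donation modal: show at most once per 30 days.
// Key and helper names are illustrative, not Oppia's actual identifiers.
const DONATION_MODAL_KEY = 'donationModalLastShownMs';
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

export function shouldShowDonationModal(now: number = Date.now()): boolean {
  const lastShown = Number(localStorage.getItem(DONATION_MODAL_KEY));
  // Show only if the modal has never been shown, or the cap window elapsed.
  // Number(null) is 0 and Number('garbage') is NaN, so both fall through
  // to "show" via the falsy check.
  return !lastShown || now - lastShown > THIRTY_DAYS_MS;
}

export function recordDonationModalShown(now: number = Date.now()): void {
  localStorage.setItem(DONATION_MODAL_KEY, String(now));
}
```

The same localStorage pattern covers the Sprint 3 cookie-consent row: persist the consent choice under its own key and read it before re-rendering the banner.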
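For the router-title and live-region rows, one way to wire both fixes together in an Angular app, assuming Angular's `Router`, `Title`, and CDK `LiveAnnouncer` services. `resolveTitle` is a hypothetical route-to-title lookup; this is a sketch, not Oppia's actual implementation.

```typescript
import { Injectable } from '@angular/core';
import { NavigationEnd, Router } from '@angular/router';
import { Title } from '@angular/platform-browser';
import { LiveAnnouncer } from '@angular/cdk/a11y';
import { filter } from 'rxjs/operators';

// Sketch only: gives every completed SPA navigation a page title and a
// screen-reader announcement. resolveTitle is a hypothetical lookup.
@Injectable({ providedIn: 'root' })
export class PageTitleAnnouncerService {
  constructor(
    private router: Router,
    private title: Title,
    private announcer: LiveAnnouncer,
  ) {}

  init(resolveTitle: (url: string) => string): void {
    this.router.events
      .pipe(filter((e): e is NavigationEnd => e instanceof NavigationEnd))
      .subscribe((e) => {
        const pageTitle = resolveTitle(e.urlAfterRedirects);
        // Browser tab and history entries get a meaningful name (H01).
        this.title.setTitle(`${pageTitle} | Oppia`);
        // LiveAnnouncer posts into a visually hidden aria-live region, so
        // screen readers hear the new page name after in-app navigation
        // (H01/H11).
        this.announcer.announce(`Navigated to ${pageTitle}`, 'polite');
      });
  }
}
```

Routing both announcements through `LiveAnnouncer` reuses a single hidden live region instead of scattering ad-hoc `aria-live` elements across templates.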
The three supplemental dimensions (H12–H14) extend beyond the Nielsen Norman 10+H11 framework. H13 (Customer Journey) was assessed by both Codex and the human panel; H12 and H14 are human-panel only. These scores do not alter the core blended grade.
AI-1 — Claude (Sonnet 4.6, Medium Reasoning): 83 checklist items rated via live Playwright browser session. Pages: Homepage, Math Classroom, Place Values Topic, Lesson Player. Agent modes: Generalist UX Researcher, Accessibility Specialist, Content Strategist. HIL calibration gates at mid-audit and final review.
AI-2 — Codex (GPT-5.5, Extra High Reasoning): Independent heuristic evaluation against the same 11 core heuristics + H13 Customer Journey. Report: oppia-uxhc-final-report.md (2026-05-02). 83 items, same 0–4 severity scale.
Human Evaluation Panel: 5 evaluators (Human 1 through Human 5) assessed the same interface using the same 0–4 severity scale. Source: HE_Unified_Scorecard_LOL_V2.xlsx, synthesized 2026-04-09. Of the supplemental heuristics, H12 and H14 were assessed by the human panel only; H13 was also covered by Codex.
Blending method: Simple average of all available source scores per heuristic. H01–H11 use 3-way average (AI-1 + AI-2 + Human). H13 uses 2-way (AI-2 + Human). H12 and H14 are human-panel only. Supplemental heuristics not included in core blended grade.
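To make the averaging rule concrete, a small TypeScript sketch. The two H06 AI scores match those quoted earlier; the remaining inputs are illustrative, not the report's actual data.

```typescript
// Blend rule from above: per heuristic, average whichever source scores
// exist (all scores on a 0-100 scale).
type SourceScores = { ai1?: number; ai2?: number; human?: number };

function blend(scores: SourceScores): number {
  const available = [scores.ai1, scores.ai2, scores.human].filter(
    (s): s is number => s !== undefined,
  );
  return available.reduce((sum, s) => sum + s, 0) / available.length;
}

// Three-way average for H01-H11 (human score here is illustrative).
blend({ ai1: 68.75, ai2: 50, human: 60 }); // ≈ 59.58
// Two-way average for H13 (AI-2 + Human only; inputs illustrative).
blend({ ai2: 70, human: 80 }); // = 75
```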
H11 Accessibility: Surface audit only, not a full WCAG 2.1/2.2 compliance test. Both AI audits scored H11 at 50% (C-). A dedicated axe-core or manual screen-reader audit is strongly recommended; a sketch of the automated half follows.
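One shape the recommended automated pass could take, assuming Playwright paired with `@axe-core/playwright` (the tooling pairing and URL are our assumptions, not mandated by this report). Automated rules catch only a subset of WCAG issues, so the manual screen-reader pass remains essential.

```typescript
import { test, expect } from '@playwright/test';
import AxeBuilder from '@axe-core/playwright';

// Example target route; a real run would iterate over every audited page
// (homepage, classroom, topic, lesson player).
test('page has no detectable WCAG A/AA violations', async ({ page }) => {
  await page.goto('https://www.oppia.org/learn/math');
  const results = await new AxeBuilder({ page })
    .withTags(['wcag2a', 'wcag2aa']) // restrict to WCAG 2.x A/AA rules
    .analyze();
  expect(results.violations).toEqual([]);
});
```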
H09 Error Recovery: Codex found no observable failures and scored it 100%. Claude and the human panel had limited evidence. Overall confidence on H09 is low; the high Codex score should not be read as confirmation that error recovery is strong.
Out of scope: Authenticated post-completion flows, mobile views, error state messaging, post-login dashboard.