01 Amazon · Case Study
Working Backwards AI
An AI-powered product thinking assistant that challenges assumptions, surfaces insights, and helps PMs make better decisions early.
01 — Context
Preventing customer experience risks before launch.
The Customer Experience Risk Service (CXRS) team focuses on identifying and preventing customer experience risks before products launch. I joined as the design lead, and over four quarters took the product from exploration to production launch.
My impact
- Reframed the problem space
- Led qualitative research and AI UX exploration
- Translated insights into product direction
- Drove cross-functional alignment and execution
02 — The Problem
Customer issues were discovered only after launch.
Many customer issues surface only after launch — when complaints reach Customer Service. At Amazon's scale, even small product decisions can impact millions of customers.
Based on Amazon's 2024 operational metrics.
03 — Reframing the Challenge
How might we
Help Amazon product teams anticipate and prevent customer complaints before product launch?
Our hypothesis: AI can simulate customer reactions before launch.
04 — First Attempt
Piper: a customer service AI module.
The first concept, "Piper," was a Customer Service AI module — a pre-mortem tool that scanned product documents and flagged customer-service risks.
A quick feasibility test with an AI Science demo and a UX prototype surfaced a hard truth: users treated Piper as a downstream tool, but preventing customer issues requires upstream decisions. Risks were identified too late to influence product direction.
What we learned
The module surfaced risks — but at the wrong stage of the product lifecycle.
05 — The Pivot
From risk detection to early decision support.
Using PM jobs-to-be-done analysis, persona work, and storytelling, I made the case to leadership for repositioning the product — away from downstream optimization, toward helping PMs make customer-centered decisions earlier and influence product direction proactively.
Downstream optimization: flag customer-service risks after the idea is already locked.
Early decision support: help PMs make customer-centered decisions earlier and shape product direction proactively.
06 — Research
Listening to how PMs actually work.
I analyzed recurring pain points from the PM Slack channel and ran a jobs-to-be-done ideation workshop, then validated the direction through user sessions. Decision-making under uncertainty emerged as the main challenge for most PMs.
Key insight
PMs lack a reliable way to challenge early product ideas. As a result, gaps and flawed assumptions are often discovered too late.
07 — Proposal 1.0
Explicit multi-agent collaboration.
I partnered with PM to define the initial roadmap — a conversational AI interface, a writing coach, and specialized AI agents — prioritized by user impact, decision value, and engineering complexity. Proposal 1.0 made the multi-agent structure explicit, prioritizing clarity and trust: a clear mental model of expertise, transparent AI reasoning, reduced hallucination risk, and easier debugging and evaluation.
What early testing revealed
PMs didn't understand when or why to choose different agents.
Conversation and key insights were mixed together in one chat stream.
For the MVP I simplified the AI experience — simpler interaction patterns, guided onboarding, and a lower learning curve — and shifted from conversation to artifacts: a dual-panel workspace separating ideation from structured outputs, with automatic PRD drafts generated from AI conversations.
08 — Final Proposal
Three design shifts in Proposal 2.0.
Task-guided conversation
Prompt suggestions based on PM work stages lower cognitive load, keep the decision flow uninterrupted, and let the system guide the thinking.
Workspace canvas
A dedicated canvas separates thinking from chatting, makes progress and iteration tangible, and keeps decision-critical content stable and visible.
Inline customer & expert commentary
Different AI experts review the document and surface CX, tech, and customer insights directly on the artifact — multiple perspectives without fragmentation, interactive comment threads, and clear ownership and resolution.
09 — Tradeoffs & System Thinking
When engineering reality forces a UX tradeoff.
Mid-build, the architecture moved from an explicit multi-agent structure to a super-prompt orchestration model, which compresses each expert's reasoning into a single response. The change made the system faster and cheaper, but it reduced transparency and user control over AI outputs. To recover trust, I introduced two guardrails:
A visible "thinking" state that manages expectations during generation.
Automatic snapshots that let users experiment safely and roll back if the AI output goes off track.
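The snapshot guardrail can be illustrated with a minimal sketch. This is not WBAI's actual implementation: the names `SnapshotHistory`, `save`, `rollback`, and `apply_ai_edit` are hypothetical, and the sketch assumes the canvas document can be treated as a plain string that is saved automatically before every AI edit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SnapshotHistory:
    """Hypothetical snapshot store for a canvas document (illustrative only)."""
    snapshots: List[str] = field(default_factory=list)

    def save(self, content: str) -> int:
        """Store a copy of the document and return its snapshot index."""
        self.snapshots.append(content)
        return len(self.snapshots) - 1

    def rollback(self, index: int) -> str:
        """Return the content saved at `index` and drop every later snapshot."""
        if not 0 <= index < len(self.snapshots):
            raise IndexError(f"no snapshot at index {index}")
        del self.snapshots[index + 1:]
        return self.snapshots[index]


def apply_ai_edit(history: SnapshotHistory, current: str, ai_output: str) -> str:
    """Snapshot the current document automatically, then accept the AI output."""
    history.save(current)
    return ai_output


history = SnapshotHistory()
doc = "Draft PRD v1"
doc = apply_ai_edit(history, doc, "Draft PRD v2 (AI rewrite)")
doc = apply_ai_edit(history, doc, "Off-track AI rewrite")
doc = history.rollback(0)  # the user returns to the first saved state
print(doc)
```

Saving before every AI edit, rather than on demand, is the design choice that matters here: the user never has to remember to protect their work before experimenting.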
10 — Scaling the System
An AI component sub-library.
As Working Backwards AI (WBAI) evolved into an AI-native experience, standard components were no longer sufficient. I initiated an AI-specific component sub-library to support scalable, consistent, and explainable interactions, including inline comments, threaded discussions, and AI states.
11 — Signals of Impact
Adoption, decision impact, and satisfaction.
Data as of January 27, 2026 — first six weeks after initial release.
12 — Reflections
If I were to do it again.
Reframing the core problem
The biggest impact came from reframing WBAI from a CS-surface tool into a decision-support platform. WBAI is no longer an experiment in AI-assisted writing — it's a foundation for scalable, decision-centered AI support across PM workflows.
Intelligence requires restraint
The stronger the AI becomes, the more structured the UX must be. Users don't need to see the complexity — they need clarity.
Improve transparency of AI outputs
I would invest more in features that help users understand and evaluate AI outputs, not just interact with them. Thinking indicators show the system is working — but users still need clarity on why the AI generated a particular suggestion.