YEAR
2020
TYPE
Augmented AI
Conversation Design
Information Architecture
Product Management
Designing Trustworthy AI in a Regulated Service Environment
I led the experience design for PRUChat, Prudential’s customer-facing AI assistant, during a period of rising service demand, COVID pressure, and strict insurance compliance. The challenge was not to make AI feel human; it was to make it reliable, useful, and safe enough for a regulated environment.
WHY THIS STILL MATTERS IN 2026
The model changed, but the trust problem hasn't
PRUChat was an early lesson in designing around AI limitations in a high-risk domain. Its hybrid architecture, which combined NLP with deterministic compliance flows and was built before the ChatGPT and Claude era, anticipated much of what regulated AI deployment now requires.
IMPACT AT A GLANCE
METRIC | BUSINESS TARGET | ACTUAL PERFORMANCE
Call Volume Reduction | 25% | 32%
Email Inquiry Reduction | 30% | 28%
Average Response Time | <5 mins | 4.2 mins
Successful Self-Service Rate | 42% | 52%
Agent Escalation Rate | 35% | 31%
*Note: These metrics reflect 2020 AI capabilities
As design lead, I orchestrated a cross-functional team of 12+ members across business, product, engineering, and automation teams.
Established shared decision frameworks that enabled non-technical stakeholders to make informed AI product decisions, resulting in 3x faster alignment on compliance requirements.
Shipped a working AI product before the current AI wave, gaining first-hand experience of how NLP and pre-scripted bots work and how people use them.
CONTEXT & PROBLEM
Customers needed faster answers outside business hours, while customer service agents were handling rising call and email volume. In insurance, speed alone was not enough. The assistant had to reduce operational load without creating compliance, privacy, or trust risks.
FRAMEWORK
We made a deliberate architectural decision to keep humans in the loop at consequential decision points because of regulatory requirements.
We used NLP to help interpret how customers described their problems, while scripted flows handled regulated or predictable resolutions. PRUChat is designed as an augmented AI system: deliberately keeping customer service agents and customers in the decision seat, while using AI to compress the time, effort, and cognitive load required to get there.
When confidence was low or the topic was sensitive, the system escalated instead of guessing.
Confidence-based routing
Designed high, medium, and low-confidence states so the assistant could answer, clarify, or escalate.
Compliance-first guardrails
Removed high-risk features like personal data and document attachments when privacy and security risk outweighed customer value.
Cost-aware conversation patterns
Used quick replies, concise prompts, summaries, and expandable details to reduce unnecessary conversation length and improve clarity.
Approach
Customer and Financial Consultant Surveys and Feedback Analytics
Competitive Research and Benchmarking
Decision Trees and Chat Flow Generation
Evaluation and Testing
Bot Training, Intent Classification and NLP Planning
Interaction and Conversation Design
DISCOVERY & PLANNING
Defining the MVP started with gathering internal and external data to frame the problem and understand what and who shaped it.
IDEATION & SYNTHESIS
Generating and Synthesizing Ideas to leverage Key Insights
Four cross-functional workshops turned siloed compliance and tech concerns into shared design principles and secured buy-in within a month.
CONVERSATION FLOW DESIGN
The ideation stage yielded a foundational framework for our conversation ecosystem. Taking the experience further required in-depth bot and NLP training combined with systems thinking.
NLP VS PRE-SCRIPTED BOTS
This determines if NLP is worth the complexity
Structured solutions favor pre-scripted responses
Compliance and legal liability require exact question phrasing and response accuracy
Complex troubleshooting and personalized interactions (e.g. financial advice)
HOW I WORKED WITH AI
When the AI wasn't sure, we didn't guess. A 3-tier confidence system routed users to answers, confirmation prompts, or a human depending on how clearly intent came through.
High confidence (above 75%): Direct answer (lower threshold due to model limitations)
Medium confidence (50-75%): Confirmation required
Low confidence (below 50%): Immediate human escalation (e.g. Prudential hotline, customer service agent or financial consultant)
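The three tiers can be sketched in a few lines of Python. This is an illustrative reconstruction, not the production logic: the names `Route` and `route` and the `sensitive` flag are assumptions, and the boundary handling (exactly 50% or 75% counts as medium) is one reasonable reading of the tiers listed above.

```python
from enum import Enum


class Route(Enum):
    ANSWER = "answer"      # high confidence: reply directly
    CONFIRM = "confirm"    # medium confidence: ask the user to confirm intent
    ESCALATE = "escalate"  # low confidence or sensitive topic: hand to a human


HIGH_THRESHOLD = 0.75  # above this, answer directly
LOW_THRESHOLD = 0.50   # below this, escalate


def route(confidence: float, sensitive: bool = False) -> Route:
    """Pick a route from an intent-confidence score in [0, 1]."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Sensitive topics escalate regardless of how confident the model is.
    if sensitive or confidence < LOW_THRESHOLD:
        return Route.ESCALATE
    if confidence > HIGH_THRESHOLD:
        return Route.ANSWER
    return Route.CONFIRM
```

For example, `route(0.62)` returns `Route.CONFIRM`, while `route(0.95, sensitive=True)` still escalates.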
I audited customer service logs to identify top query types, mapped them into intent clusters, then designed end-to-end conversation flows in Excel covering decision branches, fallbacks, and agent escalation triggers.
PRECISION VS RECALL
We asked ourselves:
What's worse, missing something important or including something irrelevant?
Missing something important → Optimize for Recall (catch everything)
Including irrelevant stuff → Optimize for Precision (be selective)
We chose precision over recall: the AI only escalates when 90%+ confident a human is needed. This kept our customer service agents free for cases that actually required them, and made self-service flows more reliable.
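To make the trade-off concrete, here is a minimal Python sketch that measures precision and recall of escalation decisions at a given threshold. The function name and the sample scores are made up for illustration; they are not PRUChat data.

```python
from typing import List, Tuple


def escalation_precision_recall(
    scored: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (precision, recall) for escalating when score >= threshold.

    Each item is (escalation_confidence, human_was_actually_needed).
    """
    tp = fp = fn = 0
    for score, needed in scored:
        escalated = score >= threshold
        if escalated and needed:
            tp += 1  # escalated, and a human really was needed
        elif escalated and not needed:
            fp += 1  # escalated unnecessarily (wastes agent time)
        elif needed:
            fn += 1  # stayed in self-service, but a human was needed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labelled conversations, for illustration only.
sample = [(0.95, True), (0.92, True), (0.80, False), (0.70, True), (0.40, False)]
strict = escalation_precision_recall(sample, 0.90)  # fewer, surer escalations
loose = escalation_precision_recall(sample, 0.60)   # catches more, with noise
```

On this toy data, the 0.90 threshold escalates only cases that needed a human but misses one, while the 0.60 threshold catches every case at the price of one unnecessary escalation.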
SYNTAX MATTERS
Context bridging is critical. Customers want to see their original words reflected in structured flows, but pre-scripted flows were more appropriate for insurance because regulatory compliance requires collecting specific information in an exact order.
Adding features to our AI increased the potential for conflict: the more capabilities we built, the higher the risk of the AI breaking due to overlapping intents. We needed to balance comprehensiveness with accuracy.
Unclear syntax = unclear intent = confused AI
We pulled the top 20 customer questions from analytics to build our FAQ content matrix, prioritising Claims, PRUShield, COVID, and forms. That became the training baseline, refined continuously. Token-based pricing then shaped what came next.
REDUCING AVERAGE COST
Optimizing Conversation Design via Token-based Interaction Patterns
A token is roughly equivalent to a word or part of a word that the AI processes.
"Hello" ≈ 1 token; 750 words ≈ 1,000 tokens
With token-based pricing, you pay for every token the AI reads (input) and writes (output).
Token-based interactions refer to how the AI's token processing affects the way users interact with the system - essentially how the "token reality" shapes the user experience. Unlike a normal app that just responds to clicks, AI systems and chat agents must "read" and "write" tokens for every word and interaction, which creates unique UX patterns.
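A rough Python sketch of the arithmetic, using the rule of thumb above (750 words to roughly 1,000 tokens). Real tokenizers vary by model and language, and the prices passed in are placeholders rather than any vendor's actual rates.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (750 words ~ 1,000 tokens)."""
    words = len(text.split())
    if words == 0:
        return 0
    return max(1, round(words * 1000 / 750))


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Cost of one exchange: tokens read plus tokens written, priced separately."""
    return (
        (input_tokens / 1000) * input_price_per_1k
        + (output_tokens / 1000) * output_price_per_1k
    )
```

With placeholder prices of 1.0 per 1,000 input tokens and 3.0 per 1,000 output tokens, `estimate_cost(1000, 500, 1.0, 3.0)` comes to 2.5: the shorter reply still accounts for most of the bill.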
LENGTH COMPOUNDS COST
Insurance claim descriptions and analyses get longer as the conversation goes on
Analysing longer documents costs more as the text grows, and the cost compounds when that text is re-read on every turn. Limit document length.
OUTPUT LENGTH IS EXPENSIVE
The longer the AI's response, the more expensive it gets
Output tokens typically cost 2-5x more than input tokens, so generating a 500-word explanation for every question is not always good design.
CONTEXT MEMORY COSTS MONEY
Every message in a conversation counts as input tokens
Decide how much conversation history to keep vs starting fresh.
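The history trade-off can be sketched as below. This is a simplified model, assuming every turn re-sends the kept history as input; the function name and `window` parameter are illustrative, not part of any real API.

```python
from typing import List, Optional


def cumulative_input_tokens(
    turn_tokens: List[int], window: Optional[int] = None
) -> int:
    """Total input tokens billed across a conversation.

    turn_tokens[i] is the size of message i. On each turn the model reads
    the current message plus the kept history: all earlier messages when
    window is None, otherwise only the last `window` messages.
    """
    total = 0
    for i in range(len(turn_tokens)):
        start = 0 if window is None else max(0, i - window)
        total += sum(turn_tokens[start : i + 1])
    return total


# Ten messages of 100 tokens each (made-up numbers for illustration).
conversation = [100] * 10
full_history = cumulative_input_tokens(conversation)          # keep everything
last_two = cumulative_input_tokens(conversation, window=2)    # keep 2 messages
no_history = cumulative_input_tokens(conversation, window=0)  # start fresh each turn
```

In this toy conversation, keeping the full history bills 5,500 input tokens, a two-message window bills 2,700, and starting fresh every turn bills 1,000. The right setting depends on how much context the flow genuinely needs to stay coherent.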
Attachment Feature Removal
Analytics showed low usage and real risk: uploaded files could expose personal data protected under PDPA* and open the door to malicious content. Easier to cut than to patch.
Character Limits for Token Efficiency
Shorter inputs meant less AI confusion, lower token costs, and better responses. A small constraint with outsized returns.
*Singapore Personal Data Protection Act
I designed interactions that work with token processing, not against it. Chunked information became a feature, not a constraint.
DESIGN PATTERNS
Chat Button Suggestions
Low-cost interaction that simplifies information and immediately lets users know available actions to perform.
Expandable Cards and 'Read More'
Expandable cards show a summary first, with detail on demand. Fewer tokens, more control.

WHAT I WOULD EVOLVE TODAY
How I'd redesign this for the AI-native era
I would pair generative understanding with deterministic, rules-based logic, and use Agile and Lean UX practices as continuous validation loops: testing with real users, mitigating AI hallucinations, and enforcing compliance at every consequential branch.
Ground answers in approved knowledge sources using retrieval-based flows
Add source transparency and confidence signals for sensitive answers
Summarise conversations before human handoff
Track false confidence, escalation accuracy, and unresolved intent patterns
Create an AI governance model for what the assistant can answer, ask or escalate

The Shift in Plain Terms
IBM Watson needed thousands of labelled examples to recognise specific phrases. Slightly different wording? It breaks.
Copilot and Claude already understand language; you configure the rules, not the model.
Why it matters for PRUChat
The original PRUChat was capped by Watson's training data and model. A redesign today isn't a UI refresh; it's a fundamentally different capability ceiling. Deterministic where it must be, generative where it helps.
KEY TAKEAWAY
This project shows that I can design AI experiences beyond the interface layer. I can work with ambiguity, technical constraints, compliance risk, business pressure, and imperfect models, then turn that into a usable, measurable, and scalable service experience.
Our results show strong potential for integration across Prudential's digital ecosystem, with opportunities to automate and digitise operational tasks, support our human agents, and immediately provide our customers the answers they need.