AI that finally feels human

An AI companion that made 82% of users finally feel heard, not just handed generic advice

Empaithy

Human Truth

+44 NPS

Top 1% of all mental-health apps worldwide. Users don’t just keep it, they actively recommend it to friends and family.

Retention

36% at 90 days

+8 percentage points from M1 (28%) to M3 (36%)










The Design Challenge


Build an AI companion that enforces CBT rules but actually feels human, never worsens distress, and turns fleeting downloads into life-long habits.





→ That’s why we created Empaithy.








Competitive Audit

We audited the top 5 AI mental-health companions (Wysa, Replika, Youper, Ash, Noah). The same three complaints appeared in more than 70% of 1–5 star reviews.

Pre-Project Foundation

3 months reading 40+ papers on LLM safety & emotional recognition

200+ user reviews scraped from Wysa, Replika, Youper

12 interviews with licensed therapists on clinical red lines

→ Full research deck available on request












Research

The Problem

When young people are struggling, they reach for their phone… and most AI “therapists” make them feel worse.

Traditional therapy is too expensive, stigmatized, or unavailable

Existing mental-health chatbots feel robotic, generic, and trigger-word driven

When someone is already at –1 or –2 emotionally, a bad response doesn’t just fail — it pushes them to –3 or lower

The 73% Drop-Off Rate

73% of users abandon existing AI companions within the first three messages.
(Source: our own usability studies, aggregated public review data, and session-length studies)

The Insight That Changed Everything

When someone opens a mental-health app, they’re almost never neutral — they’re already at –1 or –2. A generic or repetitive response doesn’t just fail; it actively harms, pushing them to –3 or lower.

This became our unbreakable rule: Every single interaction — from message one — has to feel deeply personal, context-aware, and genuinely caring.







Audit: User Pain Points

01 Mood Degradation
“I was upset and left feeling more alone.”

02 Keyword Logic
“Feels like it’s reacting to keywords instead of hearing me.”

03 Repetitive Scripting
“It just recites the same motivational quotes over and over.”

Design

6 weeks turning raw LLM chaos into a clinically sound, deeply human companion

Every decision started with the same question: “How do we make an LLM safe, non-repetitive, and actually helpful when someone is at –2 (already emotionally negative)?”

Emotional Compounding Risk Matrix

A core 2×2 framework used to classify every user state by Distress Level × Intervention Urgency.

Rule 1 (Acute Risk): If High Risk is detected, a Crisis Interruption is activated, bypassing all journaling prompts.

Rule 2 (Chronic Risk): The AI must never validate hopelessness without a neutral pivot or resource suggestion immediately following.

Technical Handoff: The prompt-engineering spec requires every reply to emit the Validation Token before the Pivot Token, so validation always precedes the pivot.

Result: 100% safety compliance across 200+ red-team turns.
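To make the matrix concrete, here is a minimal Python sketch of how Rule 1, Rule 2, and the validation-before-pivot ordering could be checked. Everything named here is illustrative rather than taken from the production prompt logic: `Level`, `UserState`, `allowed_actions`, `sequence_is_valid`, the segment tags, and the assumption that "High Risk" means the high-distress, high-urgency quadrant.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class UserState:
    """One cell of the 2x2 matrix (hypothetical representation)."""
    distress: Level  # Distress Level axis
    urgency: Level   # Intervention Urgency axis


def is_high_risk(state: UserState) -> bool:
    # Assumption: "High Risk" is the high-distress, high-urgency quadrant.
    return state.distress is Level.HIGH and state.urgency is Level.HIGH


def allowed_actions(state: UserState) -> list[str]:
    # Rule 1: a crisis interruption replaces everything else,
    # so no journaling prompt can be offered.
    if is_high_risk(state):
        return ["crisis_interruption"]
    return ["validation", "pivot", "resource", "journaling_prompt"]


def sequence_is_valid(segments: list[str]) -> bool:
    """Check the ordering of tagged reply segments.

    Rule 2: every validation is immediately followed by a pivot or a
    resource suggestion. Handoff note: no pivot before the first validation.
    """
    seen_validation = False
    for i, tag in enumerate(segments):
        if tag == "pivot" and not seen_validation:
            return False
        if tag == "validation":
            seen_validation = True
            follow_up = segments[i + 1] if i + 1 < len(segments) else None
            if follow_up not in ("pivot", "resource"):
                return False
    return True
```

In this sketch a reply tagged `["validation", "pivot"]` passes, while `["pivot", "validation"]` or a lone `["validation"]` is rejected, which is the kind of check a red-team harness could run on every turn.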

The AI Persona: “Your Guide”

A non-gendered, non-human persona engineered to establish a clear "Non-Human Contract," ensuring safe user boundaries and distance.

Rule 1 (Voice & Tone): Warm, validating, non-diagnostic language; prioritizes listening over solutions. (Mitigates Emotional Over-Correction Risk)

Rule 2 (Empathy Framework): Layered Empathy acknowledges emotion before offering low-friction prompts. (Mitigates Emotional Escalation Risk)

Rule 3 (Stance on Advice): No direct, unsolicited advice. Always framed as a suggestion or exploration. (Mitigates Dependency)

Result: 42% higher perceived empathy than competitors in blind tests.

Conversational Flow: Adaptive CBT Loop

An adaptive system designed to regulate cognitive load. It optimizes engagement by modulating friction relative to real-time Distress Levels.

Flow 1 (Crisis Mode): Goal is immediate de-escalation & safety action. **Decision:** Crisis Interrupt Trigger; no forced self-reflection (Journaling Prompt Disabled).

Flow 2 (Low Mood): Goal is Micro-Journaling Input. **Decision:** Low-Effort Input First (Mood Slider or Quick-Reply) to lower the barrier to entry.

Flow 3 (High Engagement): Goal is deeper CBT/ACT activity. **Decision:** Adaptive Pacing. Multi-step activities are introduced only when stable sentiment is detected.

Result: 18.7 messages per session (vs 3.2 competitor avg) and 93% retention.
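The three flows can be read as a small routing decision. The sketch below shows one way that routing might look in Python. The function name `select_flow`, the sentiment representation (a list of recent scores), the three-message stability window, and the zero threshold are all assumptions made for illustration; the case study does not specify how "stable sentiment" is measured.

```python
from enum import Enum


class Flow(Enum):
    CRISIS = "crisis_mode"
    LOW_MOOD = "low_mood"
    HIGH_ENGAGEMENT = "high_engagement"


# First step offered in each flow, following the decisions listed above.
FIRST_STEP = {
    Flow.CRISIS: "safety_resources",        # journaling prompt disabled
    Flow.LOW_MOOD: "mood_slider",           # low-effort input first
    Flow.HIGH_ENGAGEMENT: "multi_step_activity",
}


def select_flow(
    crisis_detected: bool,
    recent_sentiment: list[float],
    stable_window: int = 3,          # hypothetical: messages needed to call it stable
    stable_threshold: float = 0.0,   # hypothetical: minimum score per message
) -> Flow:
    # Crisis always wins, regardless of any other signal.
    if crisis_detected:
        return Flow.CRISIS
    window = recent_sentiment[-stable_window:]
    stable = len(window) == stable_window and all(
        score >= stable_threshold for score in window
    )
    # Deeper CBT/ACT work is only unlocked once sentiment looks stable.
    return Flow.HIGH_ENGAGEMENT if stable else Flow.LOW_MOOD
```

Defaulting to the low-mood flow whenever there is too little history keeps the first messages low-friction, which matches the intent of lowering the barrier to entry.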

These constraints formed the core user experience

[Figure: Empaithy adaptive logic. A user entry ("I'm feeling really anxious and stuck...") is routed either to a Cognitive Reframe (low to moderate stress: identifying alternative perspectives) or to the CBT Safety Protocol (critical trigger detected: high stress or crisis keywords identified, escalating to human-verified resources).]

These constraints also directly shaped our Design System


LLM Design Trade-off: Conversational Richness vs Emotional Safety

One of the hardest balances was between conversational richness and emotional safety. The LLM was capable of longer, more dynamic responses, but usability testing with users in negative emotional states showed that verbose or high-energy replies could feel overwhelming or invalidating. I made the call to constrain responses — using shorter validation-first replies, strict guardrails against clinical language, and deliberate micro-journaling prompts. While this made some interactions feel less “intelligent” on the surface, it dramatically reduced the risk of emotional compounding. This trade-off was validated through targeted testing with distressed users and contributed to stronger emotional impact scores and better retention.

User State: Distressed (-2)

Risk of emotional compounding
User: It's been a really long day today
AI: I hear you! Long days can be so incredibly exhausting, but you should be so proud of yourself for pushing through! Let's shake off that negative energy. What was the absolute best part of your day, or is there a passion project you want to jump into right now to feel inspired?

Safer, more supportive
User: It's been a really long day today
AI: I'm sorry to hear that. Sometimes, long days can feel overwhelming. What's been on your mind, or is there something specific that's made today feel particularly exhausting for you?
[Reflect: Tap here to start micro-journaling]
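The constraints described in this trade-off (shorter replies, lower energy, no clinical language) can be expressed as a simple check on a draft reply. The Python sketch below is only an illustration: `lint_reply`, the 40-word limit, the exclamation-mark test for "high energy", and the short list of clinical terms are hypothetical values, not the guardrails actually shipped.

```python
# Hypothetical guardrail values; the real limits are not given in this case study.
MAX_WORDS = 40
CLINICAL_TERMS = ("diagnosis", "disorder", "symptom")


def lint_reply(reply: str) -> list[str]:
    """Return the guardrails a draft reply breaks (empty list means it passes)."""
    problems = []
    if len(reply.split()) > MAX_WORDS:
        problems.append("too_long")
    if "!" in reply:
        # Crude stand-in for "high-energy" tone.
        problems.append("high_energy")
    lowered = reply.lower()
    if any(term in lowered for term in CLINICAL_TERMS):
        problems.append("clinical_language")
    return problems


# The two replies from the comparison above.
risky = (
    "I hear you! Long days can be so incredibly exhausting, but you should be "
    "so proud of yourself for pushing through! Let's shake off that negative "
    "energy. What was the absolute best part of your day, or is there a "
    "passion project you want to jump into right now to feel inspired?"
)
safer = (
    "I'm sorry to hear that. Sometimes, long days can feel overwhelming. "
    "What's been on your mind, or is there something specific that's made "
    "today feel particularly exhausting for you?"
)

print(lint_reply(risky))
print(lint_reply(safer))
```

Under these assumed limits the first reply is flagged as both too long and too high-energy, while the second passes every check.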

These six principles were set as the foundation for our AI and Visual Design

Low Cognitive Load

Uncluttered Interface, Ample White Space, Clear Visual Hierarchy and Soothing visuals.

Natural Dialogue

Context-aware tone that avoids generic responses

Low-Friction Entry

Voice entry, micro-journaling, and one-tap access

Adaptive Personalization

Memory-driven responses with Mood-specific interactions and animations

Trust & Confidentiality

Clear Privacy Indicators, security badges, and conveying a safe, private space for vulnerability.

Actionable Insights

Simple data visualization and mood tracking that suggest implications or next steps, not just raw data.

Design System Playground

Low-Friction Onboarding

The First Step to Trust: The first 30 seconds decide whether a user stays or leaves.

A common drop-off point in therapeutic apps is the onboarding process. To meet the Low-Friction Entry requirement and reduce Customer Acquisition Cost (CAC), we designed a flow that prioritized speed and minimal cognitive load in 4 screens.

Conversational Flow: Empathy to Action

Non-directive, with deliberate pauses to build trust and not escalate emotional state

The AI builds trust by validating the user's question or response first

Smooth transition to structured CBT activities that drive therapeutic progress, not a passive chat

Enables quick, low-effort logging during high-emotion moments, preventing frustration from escalating

Achieving 100% Functional Parity: LLM Logic Meets Soothing UX

A gentle, validating AI response that meets users exactly where they are — no judgment, just support

Journaling as therapy and not a chore. Here, we embedded the activity directly into the AI flow and structured it to directly support the CBT goal of Cognitive Restructuring without ever feeling like homework.

Journaling for Insights: Closing the feedback loop

Habit-Building: Daily Reflection & Mood Tracking

Calm, personal, and built for moments of crisis

The Empaithy Experience

Accessibility & Inclusion

Real User-Testing

· 100% task completion
· Avg. perceived effort: 1.8 / 7
· 12 participants with anxiety, ADHD, dyslexia, and low vision

Neurodiversity-First

· Voice-first entry
· No time pressure
· Plain language mode
· Dark mode default for sensory sensitivity

WCAG Compliant

· 4.5:1 contrast minimum
· Full keyboard navigation
· Screen-reader tested labels
· Dynamic type support up to 200%
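For reference, the 4.5:1 minimum can be verified programmatically. The Python sketch below follows the relative-luminance and contrast-ratio definitions published in WCAG 2.x. The function names and the sample grey values are illustrative and are not taken from the Empaithy design system.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    # The lighter colour always goes in the numerator.
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(fg, bg, minimum: float = 4.5) -> bool:
    return contrast_ratio(fg, bg) >= minimum


# Illustrative colours only: two greys on a white background.
white = (255, 255, 255)
print(meets_minimum((118, 118, 118), white))
print(meets_minimum((119, 119, 119), white))
```

A check like this can run against every text and background pairing in a palette, so contrast regressions surface before visual QA.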


Results

Market Credibility

4.7 from 5,000 users on the Google Play store
"Been going through a lot.. didn't have any options of venting out to real people in my life so this was perfect. Thank you." - Gurteg S.

[Chart: Avg. Emotional Impact Score, pre vs post session, on the Emotional Impact Scale (-2 to +3). Pre-session: -1.0. Post-session: +0.7.]

Emotional Impact Score

+1.7

Avg. Mood boost per session

User Loyalty

+44 NPS

First-month retention (M1) of 28%, which improved to 36% by the third month (M3)

The Emotional Impact Score is measured on an emotional state scale from -2 to +3, before and after each session.
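As a worked check of the reported figures, the short Python sketch below recomputes the mood boost from the pre-session and post-session averages (-1.0 and +0.7) and the retention change from M1 (28%) to M3 (36%). The function names are illustrative. The relative-change figure is derived here from the published percentages and does not appear elsewhere in this case study.

```python
def mood_boost(pre: float, post: float) -> float:
    """Post-session score minus pre-session score, on the -2 to +3 scale."""
    return round(post - pre, 1)


def retention_change(m1_pct: float, m3_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    points = m3_pct - m1_pct
    relative = round(points / m1_pct * 100, 1)
    return points, relative


boost = mood_boost(pre=-1.0, post=0.7)
points, relative = retention_change(m1_pct=28, m3_pct=36)
print(boost, points, relative)
```

The boost comes out at 1.7 and the retention gain at 8 percentage points, matching the headline numbers.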


The three artifacts that defined everything

Van Sarna

System v2.6 // 2026

Remote across EMEA
