Duolingo Case Study: Beyond the Streak

About this work

What I did

This is an independent analysis conducted as part of my product analyst portfolio. All data is sourced from Duolingo's publicly available shareholder letters, product blog, and published research. It was not commissioned by Duolingo and does not reflect any internal information.

My role was to identify a retention problem from public signals, diagnose the root cause using behavioral frameworks, evaluate alternative solutions, and propose a measurable recommendation. This is the same workflow I apply as a product analyst embedded in a product team.

Tools and methods: self-determination theory, the Fogg Behavior Model, goal-setting theory, prospect theory, secondary research synthesis, A/B test design, metric definition, and product recommendation writing.

The Challenge

The best alarm clock in the world

Duolingo is the world's most downloaded language learning app with $1 billion in annual revenue. For years it grew by solving one problem better than anyone else: getting users to come back every single day.

Its streak feature became one of the most studied retention mechanics in consumer tech. The team ran over 600 experiments on it. Daily active users grew 4.5 times in four years by optimising a single metric, Current User Retention Rate (CURR), which measures how many active users return the next day.

But by 2025, growth was decelerating. CURR was approaching a natural ceiling. Duolingo's own data team acknowledged that 90% of daily active users had collapsed into a single behavioral state, making the metric too blunt to move further.

Duolingo had built the world's best alarm clock. Now it needed to give people a better reason to get out of bed.

Symptom

Daily active user growth decelerating from 65% to 36% year over year

Root cause

A single extrinsic retention mechanic applied uniformly to users with fundamentally different motivations

Why the streak has limits

The streak works through loss aversion, a principle from prospect theory (Kahneman & Tversky, 1979). Once a user builds a 30-day streak, breaking it feels disproportionately costly. This is why it works so well in the short term.

The problem is that a motivator built on loss aversion is extrinsic. It motivates people to protect something they have, not to pursue something they want. Self-determination theory (Deci & Ryan, 1985) shows that extrinsic motivation can crowd out intrinsic motivation over time. A user who opens Duolingo solely to protect a streak is one streak break away from churning permanently.

The motivation gap

A Duolingo survey of over 15 million US users identified five primary learning motivations: school (22%), brain training, travel (~11%), work (~11%), and family connections (~11%). These motivations respond to completely different product experiences.

A student learning Spanish for an exam wants proficiency benchmarks. A professional learning Mandarin for business travel wants conversational readiness. A grandparent learning Portuguese to speak with family wants emotional progress, not a daily attendance counter.

The Fogg Behavior Model (Fogg, 2009) holds that behavior requires motivation, ability, and a prompt. The streak provides a powerful prompt and sufficient ability, but does nothing to address motivation for users whose personal goal is not aligned with showing up every day.

A user with a 200 day streak who cannot order coffee in French has high engagement and low outcome attainment. That user is a churn risk the moment the streak breaks. The streak tells users they showed up. It does not tell them they are getting closer to what they came for.

The Solution

Learner Identity

I evaluated three options before landing on a recommendation.

Option A

Deepen the streak mechanic further

Add milestone rewards, social visibility, and protection mechanics. Low risk but diminishing returns are already documented. Addresses the symptom, not the root cause.

Option B

Build a full adaptive learning system

Use machine learning to infer motivation from behavior automatically. Powerful in theory but noisy for new users, and 18 to 24 months from meaningful impact.

Option C — Recommended

Learner Identity

Capture motivation with a single question at a natural checkpoint, then personalize the retention experience. Directly addresses the root cause, executable within existing infrastructure, testable within one quarter.

How it works

At the Day 7 streak milestone, Duolingo asks users one question: what does success look like for you? Five choices: pass a test or reach a certification, hold a real conversation, connect with family or a partner, stay mentally sharp, travel.

That response drives three downstream changes:

The progress indicator shifts from a streak counter to a goal-relevant metric. Exam-focused users see a CEFR level tracker. Conversation-oriented users see conversations completed. Connection-oriented users see a personalized milestone tied to their language and context.

Push notifications reference the stated goal instead of streak protection. "You are three levels from B1 Spanish" is more motivating for an exam learner than "Don't break your streak."

Feature recommendations surface content aligned with the motivation. Conversation-oriented users see an AI roleplay preview with a paid tier prompt. Exam-oriented users see practice tests and certification content.

Casual brain trainers receive no changes. Streaks, leaderboards, XP, and gamification stay exactly as they are.
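The routing described above can be sketched as a simple lookup from the stated goal to an experience configuration. This is a minimal illustration, not Duolingo's implementation: every identifier here (Goal, Experience, experience_for, and the string labels) is hypothetical. The text does not say what travel-motivated users or users who skip the question should see, so the sketch assumes they keep the default streak experience.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Goal(Enum):
    """The five answers offered at the Day 7 prompt."""
    EXAM = "pass a test or reach a certification"
    CONVERSATION = "hold a real conversation"
    CONNECTION = "connect with family or a partner"
    BRAIN_TRAINING = "stay mentally sharp"
    TRAVEL = "travel"


@dataclass(frozen=True)
class Experience:
    progress_indicator: str
    notification_style: str
    recommendations: Tuple[str, ...]


# The existing experience: streak counter and streak-protection notifications.
STREAK_DEFAULT = Experience("streak_counter", "streak_protection", ())

EXPERIENCES = {
    Goal.EXAM: Experience(
        "cefr_level_tracker", "goal_reference",
        ("practice_tests", "certification_content"),
    ),
    Goal.CONVERSATION: Experience(
        "conversations_completed", "goal_reference",
        ("ai_roleplay_preview",),
    ),
    Goal.CONNECTION: Experience("personalized_milestone", "goal_reference", ()),
    # Casual brain trainers receive no changes.
    Goal.BRAIN_TRAINING: STREAK_DEFAULT,
}


def experience_for(goal: Optional[Goal]) -> Experience:
    """Return the experience for a stated goal.

    Assumption: goals without a defined variant (travel) and users who
    skip the question (None) fall back to the default streak experience.
    """
    return EXPERIENCES.get(goal, STREAK_DEFAULT)
```

Keeping the mapping in one table means a new variant can be added or removed without touching the notification or progress-indicator code paths, which also makes the experiment arms easier to instrument.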

Measurement

How I would test it

Primary

30-day retention rate by motivation segment, targeting a 5 to 8 percentage point lift for goal-oriented, conversation-oriented, and connection-oriented segments relative to control.

Secondary

Free-to-paid conversion rate for goal-oriented and conversation-oriented users. Time to a 7-day streak for new users.

Guardrail

Overall CURR and average session length. Any degradation triggers an immediate hold on rollout.

Test design

Holdout experiment with four arms (control plus one per non-casual segment), enrolling new users in their first week, running for a minimum of 90 days, with pre-registered action plans for win and loss outcomes.
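To check whether each arm can detect the targeted lift, a rough sample-size estimate helps. The sketch below uses the standard normal-approximation formula for comparing two proportions. The function name users_per_arm is hypothetical, and the 40% baseline retention in the usage example is a placeholder, not a Duolingo figure; the real baseline would come from the Phase 1 warehouse pull.

```python
import math
from statistics import NormalDist


def users_per_arm(baseline: float, lift: float,
                  alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per arm for a two-sided two-proportion test.

    baseline: control-arm 30-day retention rate, as a fraction (assumed input)
    lift:     absolute lift to detect, e.g. 0.05 for 5 percentage points
    """
    p1, p2 = baseline, baseline + lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / lift ** 2)


# Placeholder baseline of 40%: compare the low and high ends of the target range.
n_low_lift = users_per_arm(0.40, 0.05)
n_high_lift = users_per_arm(0.40, 0.08)
```

Detecting the 5-point end of the range needs more users than the 8-point end, so the 5-point figure should set the enrollment target. Because three treatment arms are each compared against control, a stricter alpha (for example, dividing it by three) is one conservative option.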

Implementation phases

1

Validate the assumption

Conduct 30 qualitative user interviews. Pull warehouse data to test whether motivation at signup predicts 30-day retention. Validate before building anything.

Owner: Product analyst and user research | Timeline: 6 weeks

2

Build

Design the Day 7 motivation prompt. Build three progress indicator variants. Update notification copy logic. Instrument with event-level logging.

Owner: Product manager, engineering lead, design lead | Timeline: 10 weeks

3

Test

Run the holdout experiment. Report nightly on guardrail metrics. Pre-register action plans before launch.

Owner: Product analyst, data engineering | Timeline: 12 weeks

4

Roll out

If the primary metric moves 5 or more percentage points for at least two segments with no guardrail degradation, roll out to 100% of new users.

Owner: Product manager
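The rollout criterion can be written down as a pre-registered decision rule before the test launches. This is a minimal sketch with hypothetical names (rollout_decision and the segment labels). It works on point estimates only and assumes significance testing and guardrail evaluation happen upstream.

```python
from typing import Dict


def rollout_decision(lifts_pp: Dict[str, float], guardrails_ok: bool,
                     min_lift_pp: float = 5.0, min_segments: int = 2) -> str:
    """Apply the Phase 4 rule to observed 30-day retention lifts.

    lifts_pp:      lift versus control per segment, in percentage points
    guardrails_ok: False if overall CURR or session length degraded
    """
    if not guardrails_ok:
        # Any guardrail degradation holds the rollout, whatever the lift.
        return "hold"
    winners = [seg for seg, lift in lifts_pp.items() if lift >= min_lift_pp]
    return "roll_out" if len(winners) >= min_segments else "do_not_roll_out"


# Illustrative numbers only, not results.
example = rollout_decision(
    {"goal": 6.1, "conversation": 5.2, "connection": 2.0},
    guardrails_ok=True,
)
```

Writing the rule as code before launch removes room for reinterpreting the threshold after the results are in.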

What I Learned

Three takeaways

01

The most important metric is not always the most visible one. CURR was the right metric for four years and then became a ceiling. Knowing when a metric has done its job is as important as knowing how to move it.

02

Behavioral theory is a shortcut to better product hypotheses. Connecting the streak to prospect theory and loss aversion helped explain not just what was happening but why, and pointed directly toward the fix. Frameworks are not academic decoration. They are diagnostic tools.

03

The gap between engagement and outcome is where most products lose users. A streak measures attendance. It does not measure progress. Any product that cannot answer "did this user get what they came for?" is building on a fragile foundation.

References

Sources

Bulaev, S. (2025). Duolingo's AI first shift: Growth, jobs, and the future of learning. Bulaev. bulaev.net
Deci, E. L., Koestner, R., & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125(6), 627–668. doi.org
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Plenum Press.
Duolingo. (2023). Special report: Understanding English learners around the world. blog.duolingo.com
Duolingo. (2024). Q4/FY 2024 shareholder letter. investors.duolingo.com
Duolingo. (2025a). Q3 2025 shareholder letter. investors.duolingo.com
Duolingo. (2025b). Q4 and full year 2025 earnings release. investors.duolingo.com
Duolingo. (2025c). The Duolingo streak uses habit research to keep you motivated. blog.duolingo.com
First Round Review. (n.d.). The tenets of A/B testing from Duolingo's master growth hacker. review.firstround.com
Fogg, B. J. (2009). A behavior model for persuasive design. Proceedings of the 4th International Conference on Persuasive Technology (Article 40). ACM. doi.org
Gustafson, E. (2023). Meaningful metrics: How data sharpened the focus of product teams. Duolingo. blog.duolingo.com
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–291. doi.org
Locke, E. A., & Latham, G. P. (1990). A theory of goal setting and task performance. Prentice Hall.
Mazal, J. (2023). How Duolingo reignited user growth. Lenny's Newsletter. lennysnewsletter.com
Shuttleworth, J. (2024). Behind the product: Duolingo streaks. Lenny's Newsletter. lennysnewsletter.com
