The PM's Guide to Recommendation Systems

At some point in every PM’s career at a platform company, someone from engineering will walk into a meeting and say “we’re going to add recommendations” and everyone will nod enthusiastically because recommendations are obviously good and then spend the next six months producing results that are technically a recommendation system and empirically kind of useless.

I’ve been in that meeting. I’ve run that meeting. Here’s what I wish I’d known.

What a Recommendation System Is (For the Non-Engineer PM)

A recommendation system takes in what it knows about a user and what it knows about items, and predicts which items the user is most likely to engage with.

That sounds simple. The complexity hides in three words: “what it knows.”

There are two main flavors of recommendation:

Collaborative filtering: “People who did what you did also did this.” The system finds users similar to you and recommends what they liked. Netflix’s “Because you watched X” is the canonical example. It works remarkably well at scale, and completely fails when you’re new (the cold start problem — no history, no signal, no helpful recommendation).

Content-based filtering: “Here’s something similar to what you’ve already engaged with.” The system analyzes the attributes of items you’ve interacted with — genre, author, topic, difficulty, style — and finds other items with matching attributes. Works for new users (you can ask preferences explicitly), fails when a user’s tastes change or when you want to expand their horizon.

Most modern systems are hybrid, blending both. And most platforms add a third layer: business logic overrides — rules that the algorithm doesn’t know about but your business requires (boost new creator content, suppress low-quality items, ensure category diversity).

That third layer is usually where PMs live.

What Recommendations Are Actually For

This sounds obvious, but it’s worth saying: recommendations are not for users, they’re for your business goals.

Or more precisely: they’re for the intersection of user value and business value. When those align — when recommending the right thing at the right time is both useful to the user and good for your retention metrics — recommendations are magic. When they diverge, you have a design problem.

At Skillshare, we thought carefully about what “good recommendation” meant in the context of a subscription platform. In a marketplace with per-unit revenue, more engagement always equals more revenue. But in subscription, engagement is the output metric, not the proxy. You want deep engagement (completing courses, building skills) more than you want shallow engagement (click 10 things, watch 3 minutes of each).

This sounds obvious in principle. In practice, your recommendation system will optimize for whatever signal you give it — and clicks are easy to measure, while skill acquisition is not. The most dangerous outcome is a recommendation system that’s good at producing clicks and slowly destroys retention because users feel they’re not making progress.

The Engagement Trap

I call this the engagement trap: optimizing recommendations for engagement signals (clicks, views, session length) while your actual goal is a quality outcome (learning, skill acquisition, entertainment satisfaction, task completion).

It’s not unique to education. Netflix ran into this for years — watch time as the optimization target incentivized content that was compulsive rather than satisfying. Passive binges, not remembered favorites.

The solution is to think carefully about your outcome metric and then ask: “What user behavior, if it increases, actually predicts that outcome improving?”

For a language learning app: not opens per week, but “days in a row with deliberate practice sessions.” For a creative platform: not courses started, but “projects created after completing a course.” For a marketplace: not content browsed, but “purchase or contact within a session.”

Your recommendation system should optimize for these behavioral signals. Engineering will tell you it’s harder to collect. They’re right. Do it anyway.

The Cold Start Problem and What to Do About It

Every recommendation system struggles with new users. No history = no signal = the algorithm defaults to what’s globally popular, which is often not relevant.

Solutions:

Onboarding questionnaires: Ask new users about goals and interests. Use explicit preference signals when behavioral signals don’t exist yet. Keep this lightweight — 3-5 questions max, or users will abandon.
Trending content by segment: If you don’t know a user’s preferences, at least know their context. A new user coming from a “beginner Python” search query doesn’t need your most popular content — they need your best beginner Python content.
Quick wins first: In the first session, optimize for completion. Show short, high-rated content that a new user is likely to finish. Completion gives you a signal you can use for everything else.

At Skillshare, we spent more time on first-session experience than on the core recommendation engine — because nothing the algorithm does later matters if users don’t come back after the first 15 minutes.

Content That Recommendations Can’t Replace

The instinct when building recommendations is to believe they’ll solve discovery. They won’t.

Recommendations work when the user’s intent is partially formed — they know roughly what they want, the system refines and expands. They struggle when:

The user’s need is highly specific (“I need a course on industrial SCORM packaging for a client deliverable next Tuesday”)
The user wants to browse and get surprised (“I don’t know what I’m in the mood for, show me something interesting”)
The user wants to follow a person, not an algorithm (“I trust this creator, show me their work”)

The first case is search’s job. The second is editorial curation’s job. The third is subscription or follow mechanics’ job.

Good recommendation systems complement all three. They don’t replace any of them.

How to Evaluate If Your Recommendation System Is Working

The metrics that matter, roughly in order of difficulty to collect:

Click-through rate on recommended items — easy, and necessary, but not sufficient
Engagement depth on recommended items (completion rate, time spent on recommended vs. self-discovered content)
Repeat engagement from recommended content (does discovery lead to a follow, a second purchase, a new genre explored?)
Long-term retention difference between users who engaged heavily with recommendations vs. those who didn’t
Serendipity signal — did users discover something outside their normal profile? This matters more than most platforms acknowledge, because discovery of the unexpected is what makes a platform feel alive.

Most product teams instrument 1, sometimes 2, rarely the rest. All five matter if you want to know whether your recommendation system is actually working.

The Question That Simplifies Everything

When I’m reviewing a recommendation feature with an engineering team, I start with: “If a knowledgeable human editor made this recommendation, what would they know that the algorithm doesn’t?”

The answer to that question is your roadmap for improving the system. Usually it’s: they’d know the user’s current goal (the algorithm knows history but not intent). Or: they’d know that this creator just published better work (the algorithm treats all content from a creator equally). Or: they’d know that this topic has seasonal relevance the user cares about.

A great recommendation system tries to know what a great human editor would know.

You don’t have to be an ML engineer to build one. You just have to think carefully about what good looks like, and relentlessly measure the gap between that and what you have.

Recommendation system design was a core part of my work at Skillshare on the Creative Feed and marketplace discovery. See the Skillshare Creative Feed case study and Marketplace case study for the full story.