17/6/2019 | Machine Learning

Everything You Need to Know Before Building a Recommendation System

Tim Clayton

AI, Ecommerce, Data Science, Retail

In the first article of this series about Recommendation Systems, we discussed how they work and gave some simple tools that anyone can use to get started. The best place to start is to check out that article, but if you need a quick refresher, here are the basics:

Recommendation Systems collect data on user behavior and predict which products or services will interest each user.
When two users have similar histories and preferences, the system recommends to User B the items that User A liked.
The basic job of the system is to fill in the gaps, deciding whether a product that a user has not yet seen or interacted with will be of interest to them.

If you want the simplest possible example, click ‘Your Discover Weekly’ on Spotify and you will be given a list of songs that you have never played on the platform but which the recommendation system believes you will like based on your listening history.

In the next two articles, we are going to go deeper, starting with some of the most common questions that come up when businesses think about building a Recommendation System.

Are Recommendation Systems good for any business?

The first question you need to ask when creating a Recommendation System (RS) is whether your users are generating enough data to make it worthwhile. It costs time and money to implement a good system. If the system is not going to pay off for your business, your resources would be better spent elsewhere.

On Spotify, YouTube, Facebook, Netflix or Amazon, regular users are looking for ever-improving personalized experiences. As they use the system more, the recommendations should become more specific and more accurate.

If people only visit your store once or twice, you can’t offer that experience; but that doesn’t mean an RS is useless. Rather than having a personalized system, you can collate the behavior of all your users to make more general recommendations. For example, if a new user clicks on a specific coat in your store, you can recommend a scarf that was also purchased by previous customers who bought the coat.
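The coat-and-scarf idea can be sketched as a simple co-purchase count. This is a minimal illustration, not a production design: the function names and the order data are hypothetical, and it assumes each order is available as a plain list of product names.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchase_counts(orders):
    """Count how often each ordered pair of products shares a basket."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # set() so a product bought twice in one order is counted once
        for item, other in permutations(set(order), 2):
            co_counts[item][other] += 1
    return co_counts


def also_bought(co_counts, item, top_n=3):
    """Return the products most often bought together with `item`."""
    return [other for other, _ in co_counts[item].most_common(top_n)]


# Hypothetical order history: each inner list is one customer's basket.
orders = [
    ["coat", "scarf"],
    ["coat", "scarf", "gloves"],
    ["coat", "gloves"],
    ["coat", "scarf"],
    ["boots"],
]

co_counts = build_co_purchase_counts(orders)
print(also_bought(co_counts, "coat", top_n=1))  # ['scarf']
```

Because the counts are pooled across all customers, this works even for a first-time visitor with no history of their own.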

In this scenario, almost all businesses can benefit from a Recommendation System; the only time one is unlikely to have any business value is if you have a limited product range and very low site traffic. If you offer only a few high-value products, which you sell infrequently, then the effort behind an RS may not be worth it.

How many data points do we need to start?

How long is a piece of string? Many people accept a minimum threshold of 1,000 respondents before survey results can be published. With Recommendation Systems, as with opinion polls, the rule of thumb is “the more data the better”, but there is no fixed minimum.

If we have sold only 8 packs of diapers in our online store, but everyone who bought diapers also purchased diaper disposal bags, we have a very limited data set but a 100% co-purchase rate. It’s fair to say that our store should recommend disposal bags to every new user who views diapers. Of course, the trend may change as we gather more data over time, but the initial assumption is solid. We don’t need 1,000 people buying both products before we call it an upselling opportunity.

A good Recommendation System is one that can spot trends as soon as they emerge and use future responses to confirm or disprove the assumption. More data does not always mean clearer insight, either: if a million people like Game of Thrones on our ratings site and 50% of those people also like True Detective, we can’t draw any strong conclusion, because an even split tells us little about whether the next Game of Thrones fan will like True Detective.
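One way to put numbers on the two examples above is to compute the share of buyers of one item who also chose the other (often called confidence) and compare it with how popular the second item is overall (the ratio is often called lift). The article does not name these measures, so treat this as an illustrative sketch. The site-wide totals used for the Game of Thrones case are invented for the example.

```python
def confidence(n_both, n_first):
    """Share of people who chose the first item and also chose the second."""
    return n_both / n_first


def lift(n_both, n_first, n_second, n_total):
    """Confidence divided by the second item's overall popularity.

    A value near 1.0 means the first item tells us almost nothing
    about the second; values well above 1.0 suggest a real link.
    """
    base_rate = n_second / n_total
    return confidence(n_both, n_first) / base_rate


# Diapers and disposal bags: 8 diaper buyers, all 8 bought bags.
print(confidence(8, 8))  # 1.0

# Game of Thrones and True Detective: 1,000,000 fans, half like both.
print(confidence(500_000, 1_000_000))  # 0.5

# Hypothetical totals (not from the article): 2,000,000 users overall,
# 1,000,000 of whom like True Detective, so its base rate is also 50%.
print(lift(500_000, 1_000_000, 1_000_000, 2_000_000))  # 1.0
```

Under those assumed totals the lift is 1.0: liking Game of Thrones does not change the odds of liking True Detective, however large the sample.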

When is one user a good fit for another?

Everyone is unique. My former best friend and I agreed totally on music and films. We were 100% matched until he told me that he doesn’t like Pearl Jam… We never spoke again.

An RS will never be able to match User A to User B with perfect results, as their tastes will diverge at some point. If two users both rate 9 bands as good but differ in their opinion of the tenth, they are still a very strong match. If User A rates the eleventh band as good, we should still assume User B will like them. But at what point do users stop being a good match? 90% compatibility? 80… 60… 40?
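The ten-band example can be expressed as a small compatibility score: the share of items both users have rated on which they agree. This is a deliberately simple sketch with hypothetical names and data; real systems typically use richer similarity measures.

```python
def compatibility(ratings_a, ratings_b):
    """Fraction of commonly rated items on which two users agree."""
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return 0.0  # no overlap, so nothing to compare
    agreements = sum(1 for item in shared if ratings_a[item] == ratings_b[item])
    return agreements / len(shared)


# Hypothetical ratings: True means "good", False means "not good".
user_a = {f"band_{i}": True for i in range(1, 11)}  # likes bands 1 to 10
user_a["band_11"] = True                            # and an eleventh band
user_b = {f"band_{i}": True for i in range(1, 10)}  # likes bands 1 to 9
user_b["band_10"] = False                           # disagrees on the tenth

print(compatibility(user_a, user_b))  # 0.9
```

User B has not rated the eleventh band, so it does not affect the score; it is exactly the kind of gap the system would fill using User A's opinion.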

The truth is, recommendation engines don’t set a threshold when looking for compatibility between people; they look for the best possible match. If we have a user, Alice, who has a 95% compatibility with Bob, 80% with Carol, and 30% with Dave, the system will start by looking at Bob when deciding if Alice will like a certain band. If Bob has not given a response to that band’s music, the system will move on to Carol, and then to Dave. The more users we have in the data set, the more likely it is that a person with a high compatibility will give a response that the system can leverage.
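The Alice, Bob, Carol and Dave walk-through can be sketched as a search through neighbours in order of compatibility. The compatibility scores come from the text; the function name and the band ratings are hypothetical and only there to make the example run.

```python
def predict_from_best_match(target, item, similarities, ratings):
    """Return (rating, neighbour, score) from the most compatible
    neighbour who has rated `item`, or None if nobody has."""
    neighbours = sorted(
        similarities[target].items(), key=lambda pair: pair[1], reverse=True
    )
    for neighbour, score in neighbours:
        if item in ratings.get(neighbour, {}):
            return ratings[neighbour][item], neighbour, score
    return None


# Compatibility scores from the example above.
similarities = {"Alice": {"Bob": 0.95, "Carol": 0.80, "Dave": 0.30}}

# Hypothetical ratings: True means "likes the band".
ratings = {
    "Bob": {"Pearl Jam": True},
    "Carol": {"Pearl Jam": True, "Nirvana": False},
    "Dave": {"Nirvana": True, "Soundgarden": True},
}

print(predict_from_best_match("Alice", "Nirvana", similarities, ratings))
# (False, 'Carol', 0.8): Bob has no rating, so Carol decides

print(predict_from_best_match("Alice", "Soundgarden", similarities, ratings))
# (True, 'Dave', 0.3): only the weakest match has an opinion
```

Returning the score alongside the prediction lets the caller see how much weight the answer deserves; a prediction that rests on a 30% match is a much weaker signal than one from a 95% match.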

Are user matches or product matches better?

We are discussing two basic types of recommendation here. The first is user-to-user, in which we see that User A is similar to User B and make suggestions to User B based on what their ‘digital twin’ bought or browsed. The second is product bundling, in which we recommend products that are often chosen together, such as our scarf and coat or the diapers and the disposal bags.

The best Recommendation Systems are able to use both types of recommendation to extract the most from the data. But there are rules of thumb that are worth considering. For example, in an online store, while your customers are browsing products, you might want to focus on user-to-user recommendations and show them products that similar customers liked. Once they are in the checkout and have the items in their cart, you might want to push product bundles. If someone has just the diapers in their cart, for instance, a good system would remind them to add disposal bags before they pay the bill.

When should I implement a cold start engine?

A cold start, as described in the previous article, is when a new user comes to a site and we know nothing about them. So, how can we start making recommendations to give them the best possible experience from their first visit?

Users have almost infinite choice and you have only a short time to grab their attention. Give people a personalized experience first time out.

Sites like Netflix ask new users to select a few movies they are interested in before entering the site for the first time. If the new user chooses ten romcoms, Netflix will probably suggest Maid In Manhattan and The Big Sick.

Whether you need a cold start engine depends on your product range. If you are a men’s hiking apparel store, you have a niche of products and your users are likely to be men who like day trips in the mountains. You don’t need to refine the content the user sees.

However, if your clothing store sells to women and men of all ages, across a range of fashions and functionalities, you might want to work out who is viewing your page so you can present them with items they might buy. In this case, your cold start engine might be a dozen pictures of men and women in different clothing. By asking your new user to click on the picture that matches their style, you can find out in one click that your new, unique site visitor is a teenage girl who likes a sporty style.

Of course, you can quickly find out the same information as the user starts browsing, but one click might be the difference between a satisfied customer and a bored browser, so it is worth it if you have a wide customer demographic and a large product range.

How do I know if it is working?

The bottom line is always the bottom line. If your Recommendation System is working, you will see a growth in sales. However, if we assume that a 1% growth in sales is a significant change, how do we know that the success is down to our Recommendation System and not our marketing, or product trends, or the failures of our competitors? If we are investing in something, we need to know that it works and is worth pouring more resources into.

The answer is A/B testing, which is also a key, standard feature of our upcoming Saleor Cloud solution. We create two versions of our store, one that uses our Recommendation System and one that does not, and show different customers different versions of the storefront. We can then observe which version is more effective, both in general and for sales of specific products.
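To judge whether a difference between the two storefronts is more than noise, one common approach is a two-proportion z-test on conversion rates. The sketch below uses only the Python standard library; the visitor and purchase numbers are hypothetical, and nothing here describes how Saleor Cloud implements its own A/B testing.

```python
from math import sqrt
from statistics import NormalDist


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test.

    conv_* are purchase counts and n_* are visitor counts for the
    control (a) and the variant with recommendations (b).
    Returns (rate_a, rate_b, z, p_value).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both versions perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Hypothetical results: 10,000 visitors per version.
rate_a, rate_b, z, p_value = ab_test(conv_a=300, n_a=10_000, conv_b=360, n_b=10_000)
print(f"control {rate_a:.1%}, with recommendations {rate_b:.1%}")
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these made-up numbers the p-value falls below the conventional 0.05 threshold, so the uplift would usually be treated as unlikely to be chance. Because both versions run at the same time, marketing campaigns, product trends and competitor activity affect both groups equally.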

Essentially, we use hard data to check if our data-driven recommendations are working. Is there really any other way?!

If you are thinking about building a Recommendation System for your e-commerce, feel free to get in touch and speak to us about your project.
