Machine Learning for Recommendation Systems



Introduction

Leading companies in the e-commerce market rely on recommendation systems as one of their main artificial intelligence tools, with approximately 30% of their revenue attributed to the recommendation system.

Recent shifts in the consumer market, especially after the pandemic, have driven massive adoption of e-commerce. The segment has grown rapidly, with new online businesses opening and physical stores moving online. In this setting, the recommendation system plays an increasingly important role in driving sales and sustaining the business model of most companies.

Given this importance and the positive impact on revenue, it is worth investing resources in this type of tool, including cloud services with premium technical support.


Objective

The objective of this article is to develop a machine learning tool that will form part of a larger recommendation system project, in which other infrastructure projects will be integrated with this machine learning project.

This article describes the main characteristics and types of recommendation systems, and then walks step by step through a real case of building one, presenting the algorithm used together with its accuracy and properties.

For this project we will use Azure Machine Learning on the Azure cloud, creating all the compute cluster resources needed for data collection, processing, and deployment.

We will simulate a real deployment scenario. However, because this is an academic article with no budget for a true production setup, we will use a free account with an initial credit of 250 dollars for testing purposes. As a result, the environment is limited, and the available resources have low capacity and performance compared with production.

The recommendation algorithm that we will use was developed by Microsoft Research and trains a Bayesian recommender using the Matchbox algorithm.

The Matchbox model reads a data set of user-item-rating triples and, optionally, user and item features.

We can use the trained model to make recommendations and to find related users or items.


Recommendations can be of two types:

Content-based: Uses features of users and items.

Collaborative filtering: Uses user and item identifiers and learns about a user from the items rated by that user and by other users.


Matchbox is a hybrid system, as it combines collaborative filtering with content-based filtering. When a user is new, the model makes predictions based on the user's profile. After the user has accumulated some interactions, it can make more personalized recommendations based on their ratings and not just their features. The model thus transitions from initial predictions based on user characteristics to more personalized recommendations based on collaborative filtering.
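The transition from profile-based to rating-based predictions can be illustrated with a small sketch. This is not how Matchbox works internally (Matchbox learns this behaviour inside a Bayesian model); the blending rule, the function names, and the constant k below are assumptions chosen only to show the idea that the weight of a user's own ratings grows with the number of interactions.

```python
def blend_weight(n_ratings: int, k: float = 5.0) -> float:
    """Share of the prediction taken from the user's own ratings.

    Hypothetical rule: 0 for a brand-new user, approaching 1 as the
    number of ratings grows. k controls how fast the shift happens.
    """
    return n_ratings / (n_ratings + k)


def hybrid_prediction(profile_score: float, rating_score: float,
                      n_ratings: int, k: float = 5.0) -> float:
    """Blend a content-based score with a collaborative-filtering score."""
    w = blend_weight(n_ratings, k)
    return (1.0 - w) * profile_score + w * rating_score


# A new user relies entirely on the profile-based score.
new_user = hybrid_prediction(profile_score=3.0, rating_score=5.0, n_ratings=0)

# After many ratings, the collaborative score dominates.
active_user = hybrid_prediction(profile_score=3.0, rating_score=5.0, n_ratings=45)
```

With these assumed values, the new user receives the profile-based score unchanged, while the active user's prediction sits much closer to the rating-based score.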

Further information on the Matchbox model can be found at the official link below.


https://www.microsoft.com/en-us/research/wp-content/uploads/2009/01/www09.pdf

The recommendation model requires a data set of user-item-rating triples with the following columns:


• The first column must contain user identifiers.

• The second column must contain item identifiers.

• The third column contains the rating for the user-item pair. Rating values must be numeric or categorical.
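As a rough illustration of the three required columns, the sketch below checks that each row has a user identifier, an item identifier, and a rating. The identifiers and ratings are made-up sample values, and the Rating type and parse_triples helper are hypothetical names introduced for this example; they are not part of Azure Machine Learning.

```python
from typing import NamedTuple, Union


class Rating(NamedTuple):
    user_id: str                    # first column: user identifier
    item_id: str                    # second column: item identifier
    rating: Union[int, float, str]  # third column: numeric or categorical


def parse_triples(rows):
    """Validate raw rows and return them as Rating triples."""
    triples = []
    for row in rows:
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        user_id, item_id, rating = row
        if user_id in ("", None) or item_id in ("", None):
            raise ValueError(f"missing user or item identifier: {row!r}")
        triples.append(Rating(str(user_id), str(item_id), rating))
    return triples


# Made-up sample data: two users rating two restaurants.
sample_rows = [
    ("U1001", "R135085", 2),
    ("U1001", "R132825", 1),
    ("U1002", "R135085", 0),
]
triples = parse_triples(sample_rows)
```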


We could also use the optional content-based data on user and item features. However, for processing efficiency, and because this is an academic test environment with limited resources, we will build our project on the collaborative filtering model. Recommendations will be driven by users' ratings of restaurant items: once a user rates a restaurant, we will present personalized recommendations based on that user's preferences, in the context of the preference history of the other users in our database.

It is worth noting that during the training process only the data set of user-item-rating triples is split; the user and item features are not split, because there is no need.
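A minimal sketch of this idea follows, assuming a plain random split with a 75% training share. Both the split ratio and the split_triples helper are assumptions for illustration; the splitting component used in Azure Machine Learning may apply a different strategy. Only the rating triples are divided, while any user or item feature tables would be passed to training whole.

```python
import random


def split_triples(triples, train_fraction=0.75, seed=42):
    """Randomly split rating triples into training and test sets."""
    shuffled = list(triples)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up (user, item, rating) triples.
triples = [
    ("U1", "R1", 2), ("U1", "R2", 1), ("U2", "R1", 0), ("U2", "R3", 2),
    ("U3", "R2", 1), ("U3", "R3", 0), ("U4", "R1", 2), ("U4", "R2", 1),
]
train, test = split_triples(triples)
```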

Keep in mind that the number of user and item features selected, as well as the number of iterations of the recommendation algorithm (that is, how many times the algorithm processes the input data), influences both the accuracy and the processing time of the model.

The Matchbox model has a restriction that makes continuous training impossible, so good practice is to retrain periodically with new data to improve the model, for example quarterly or as the business requires.

We will now begin developing our recommendation system, presenting the process step by step as a tutorial on the Azure cloud, with screenshots to help the reader follow the whole process.


Create an Azure Machine Learning workspace

To use Azure Machine Learning, you create a workspace in your Azure subscription. You can then use this workspace to manage data, compute resources, code, models, and other artifacts related to your machine learning workloads.

If you don't already have one, follow these steps to create a workspace:

  1. Sign in to the Azure portal using the Microsoft credentials associated with your Azure subscription.

  2. Select +Create a resource, search for Machine Learning, and create a new Machine Learning resource with the following settings:

  • Workspace Name: A unique name of your choice

  • Subscription: Your Azure subscription

  • Resource group: Create a new resource group with a unique name

  • Location: Choose any available location


  3. Wait for your workspace to be created (it can take a few minutes). Then go to it in the portal.

  4. On the Overview page for your workspace, launch Azure Machine Learning studio (or open a new browser tab and navigate to https://ml.azure.com ), and sign in to Azure Machine Learning studio using your Microsoft account. If prompted, select your Azure directory and subscription, and your Azure Machine Learning workspace.

  5. In Azure Machine Learning studio, toggle the ☰ icon at the top left to view the various pages in the interface. You can use these pages to manage the resources in your workspace.

You can manage your workspace using the Azure portal, but for data scientists and Machine Learning operations engineers, Azure Machine Learning studio provides a more focused user interface for managing workspace resources.
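The portal steps above could also be scripted. The sketch below is one possible way to do so, assuming the azure-ai-ml and azure-identity Python packages are installed, that you are already signed in (for example through the Azure CLI), and that the resource group already exists. The create_workspace wrapper and all argument values are placeholders introduced for this example, not part of the original tutorial.

```python
def create_workspace(subscription_id, resource_group, workspace_name, location):
    """Create an Azure Machine Learning workspace and return it.

    Assumes the resource group already exists and that credentials are
    available to DefaultAzureCredential.
    """
    # Imported inside the function so the file can be loaded
    # even where the Azure packages are not installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Workspace
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
    )
    workspace = Workspace(name=workspace_name, location=location)
    # begin_create starts a long-running operation; result() waits for it,
    # which mirrors the "wait a few minutes" step in the portal.
    return ml_client.workspaces.begin_create(workspace).result()


# Example call with placeholder values (not executed here):
# create_workspace("<subscription-id>", "<resource-group>",
#                  "<workspace-name>", "<location>")
```

The portal remains the simpler route for a one-off test account; a script like this is mainly useful when the same workspace has to be recreated several times.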


Since we already have an account, each time we log in, authentication is completed with a text message sent to a cell phone.

This extra step increases security and, because usage is billed on a pay-as-you-go basis, gives the user's account greater protection.


Fig. 1 Signing in to the created account