Non-Personalized and Stereotyped

Stereotyped

A non-personalized rating is a rating that people give to a product and that the system treats as valid for everyone. In general, every user receives the same recommendation.
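As a minimal sketch of this idea, the snippet below averages the ratings each item received and returns the same top list for every user. The data, the function names `mean_item_ratings` and `recommend_for_anyone`, and the choice of a plain mean are assumptions made for illustration, not something the notes prescribe.

```python
from collections import defaultdict

# Hypothetical (user, item, rating) triples, invented for illustration.
ratings = [
    ("alice", "book", 5),
    ("bob", "book", 3),
    ("alice", "film", 2),
    ("carol", "film", 4),
    ("carol", "game", 4),
]


def mean_item_ratings(ratings):
    """Average the ratings each item received, ignoring who gave them."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _user, item, rating in ratings:
        totals[item] += rating
        counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def recommend_for_anyone(ratings, n=2):
    """Return the same top-n list for every user (ties broken by name)."""
    means = mean_item_ratings(ratings)
    return sorted(means, key=lambda item: (-means[item], item))[:n]


print(recommend_for_anyone(ratings))
```

Because the user never enters the computation, every visitor gets the identical list.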

Preference Model

A recommender system tries to find the preferences of users, or the characteristics of products, and makes recommendations based on them.

Preferences can be divided into two kinds:

Explicit: Rating, Review and Vote

With these actions, users tell the system directly how much they like an item.

Implicit: Click, Purchase and Follow

These actions do not state a preference directly, but they still reveal something useful about it.

For ratings, we may have:

Continuous scales, which can be used to compare items with each other
Pairwise: like and dislike, upvote and downvote
Temporary: the system only remembers the rating for a period

Ratings can also be collected at different times:

Consumption: during or immediately after experiencing the item.
Memory: some time after the experience.
Expectation: the item has not been experienced yet. This type is usually for high-cost products such as houses or cars, which people rarely buy.

Implicit preferences come from signals such as clicks on a website or how long a user spends reading a book. Compared with explicit methods, they may not give a direct signal of whether the user likes an item, but they produce a large volume of data that can be used to infer user preferences.

Implicit ratings are generally easier to collect than explicit ratings, because they require no extra effort from the user.

Prediction and Recommendation

The two are closely related. Prediction estimates how much a user will like an item in the future, while recommendation selects the most relevant items and presents them as a list.
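The difference can be sketched in a few lines. The scores below are hypothetical predicted ratings for one user, and `predict` and `recommend` are illustrative names rather than any library's API.

```python
# Hypothetical predicted scores for one user (higher means "likes more").
predicted = {"item_a": 4.6, "item_b": 2.1, "item_c": 3.9, "item_d": 4.2}


def predict(scores, item):
    """Prediction: return the estimated rating for a single item."""
    return scores[item]


def recommend(scores, n):
    """Recommendation: return the n items with the highest estimates."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


print(predict(predicted, "item_c"))
print(recommend(predicted, 2))
```

Prediction exposes the number itself, while recommendation only exposes the ordering.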

Prediction

Pro: it quantifies how much the user is expected to like the item
Con: it sometimes gives a wrong result, after which the user may no longer trust it

Recommendation

Pro: provides good choices as a default
Con: with top-n, the list may include poor items when the relevant candidates available for the top n are weak

People also naturally resist being told what they like without being given a reason. A recommendation that only says "these are items you will probably like because other users think they are good" is called a “softer sell”.

Content Based recommendation

Model = User rating $\times$ Item Attributes

Here the user model is expressed over the content dimensions, which are formed from the ratings and the item attributes.

The model applies the user's ratings to items through their attributes.
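One way to read "User rating $\times$ Item Attributes" is as a rating-weighted sum of attribute vectors. The sketch below assumes that reading; the attribute dimensions, the movie names, and `build_user_profile` are all hypothetical, and no normalization or mean-centering is applied.

```python
# Hypothetical item attributes over three content dimensions:
# [action, comedy, romance]
item_attributes = {
    "movie_1": [1, 0, 0],
    "movie_2": [0, 1, 1],
    "movie_3": [1, 1, 0],
}

# Hypothetical ratings from a single user.
user_ratings = {"movie_1": 5, "movie_3": 3}


def build_user_profile(user_ratings, item_attributes):
    """Sum each rated item's attribute vector, weighted by its rating."""
    dims = len(next(iter(item_attributes.values())))
    profile = [0.0] * dims
    for item, rating in user_ratings.items():
        for k, value in enumerate(item_attributes[item]):
            profile[k] += rating * value
    return profile


profile = build_user_profile(user_ratings, item_attributes)
print(profile)
```

The resulting vector lives in the same space as the item attributes, so it can be compared with any item, including ones the user has not rated.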

Knowledge Based recommendation

Item attributes form a model of the item space.

Trust Based recommendation

Here we try to find other users whose tastes are similar to mine and recommend their preferences to me.

Content Based

Normally, a content-based system uses item features to recommend other items similar to what the user likes.

Using Dot Product as a Similarity Measure

Given the user embedding $x$ and the item embedding $y$, we can use the dot product $x \cdot y$ as a similarity measure; a larger value means higher similarity.
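A small sketch of ranking items by dot product follows. The embedding values and item names are made up for illustration, and `dot` is a hand-written helper so the example needs no external library.

```python
def dot(x, y):
    """Dot product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(x, y))


# Hypothetical two-dimensional embeddings.
user_embedding = [0.9, -0.2]
item_embeddings = {
    "item_a": [1.0, 0.0],
    "item_b": [-0.5, 0.8],
    "item_c": [0.6, -0.6],
}

# Score every item against the user, then sort from most to least similar.
scores = {name: dot(user_embedding, y) for name, y in item_embeddings.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Unlike cosine similarity, the dot product also grows with the length of the vectors, so items with large-norm embeddings tend to score higher.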

Advantages

The model doesn’t need any data about other users, since the recommendations are specific to this user. This makes it easier to scale to a large number of users.
The model can capture the specific interests of a user, and can recommend niche items that very few other users are interested in.

Disadvantages

Since the feature representation of the items is hand-engineered to some extent, this technique requires a lot of domain knowledge. Therefore, the model can only be as good as the hand-engineered features.
The model can only make recommendations based on existing interests of the user. In other words, the model has limited ability to expand on the users’ existing interests.

Collaborative Filtering

Nearest Neighborhood

Collaborative filtering uses similarities between users and items to provide recommendations. Collaborative filtering models can recommend an item to user A based on the interests of a similar user B. Furthermore, the embeddings can be learned automatically, without relying on hand-engineering of features.

Matrix Factorization

Matrix factorization is a simple embedding model. Given the feedback matrix $A \in \mathbb{R}^{m \times n}$, where $m$ is the number of users (or queries) and $n$ is the number of items, the model learns:

- A user embedding matrix $U \in \mathbb{R}^{m \times d}$, where row $i$ is the embedding for user $i$.
- An item embedding matrix $V \in \mathbb{R}^{n \times d}$, where row $j$ is the embedding for item $j$.

Illustration of matrix factorization using the recurring movie example.

In the illustration, each embedding value ranges from -1 (negative) to 1 (positive), and each entry of the matrix on the right is the dot product of the corresponding user and item embeddings.
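To make the factorization concrete, here is a minimal sketch that learns a user matrix `U` and an item matrix `V` whose row-wise dot products approximate a small feedback matrix. It uses plain stochastic gradient descent on the observed entries, which is a simple stand-in chosen for brevity and not necessarily the training method these notes have in mind. The matrix `A`, the hyperparameters, and the function names are assumptions for illustration.

```python
import random

# Hypothetical feedback matrix (3 users x 4 items); None marks "unobserved".
A = [
    [1, None, 1, 0],
    [0, 1, None, 1],
    [1, 0, 1, None],
]


def factorize(A, d=2, steps=2000, lr=0.05, seed=0):
    """Learn U (m x d) and V (n x d) so that U[i] . V[j] approximates A[i][j]."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    U = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(m)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n)]
    observed = [(i, j) for i in range(m) for j in range(n) if A[i][j] is not None]
    for _ in range(steps):
        for i, j in observed:
            pred = sum(U[i][k] * V[j][k] for k in range(d))
            err = A[i][j] - pred
            for k in range(d):
                u, v = U[i][k], V[j][k]
                # Gradient step on the squared error of this single entry.
                U[i][k] += lr * err * v
                V[j][k] += lr * err * u
    return U, V


def squared_error(A, U, V):
    """Sum of squared errors over the observed entries of A."""
    total = 0.0
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            if a is not None:
                pred = sum(u * v for u, v in zip(U[i], V[j]))
                total += (a - pred) ** 2
    return total


U, V = factorize(A)
print(squared_error(A, U, V))
```

After training, the dot product of `U[i]` and `V[j]` can also be read off for the unobserved cells, which is how the model produces scores for items a user has not interacted with.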

User-based

Define a similarity function that measures the distance between each pair of users.
Predict the rating for an item that the user has not rated before.

User-based filtering may require a lot of online computation because it depends on user similarities, but it can give the user new or surprising items as recommendations.
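The two user-based steps can be sketched as follows. Cosine similarity over co-rated items and a similarity-weighted average are assumed here as one common choice; the notes do not fix a particular similarity function. The ratings and function names are hypothetical.

```python
import math

# Hypothetical ratings (user -> {item: rating}), invented for illustration.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}


def user_similarity(r1, r2):
    """Cosine similarity over the items both users rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    num = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(r1[i] ** 2 for i in common))
    norm2 = math.sqrt(sum(r2[i] ** 2 for i in common))
    den = norm1 * norm2
    return num / den if den else 0.0


def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = user_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None


print(predict_user_based(ratings, "alice", "d"))
```

Here alice is more similar to bob than to carol, so the predicted rating for item "d" is pulled toward bob's rating of 4 rather than carol's rating of 2.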

Item-based

Define a similarity function that measures the distance between each pair of items.
Based on the user's ratings of items, recommend similar items to that user.
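A sketch of the item-based variant, assuming cosine similarity between items (computed over the users who rated both) and a similarity-weighted average of the user's own ratings. The data and the names `item_similarity` and `score_item_based` are hypothetical.

```python
import math

# Hypothetical ratings (user -> {item: rating}), invented for illustration.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}


def item_similarity(ratings, item_i, item_j):
    """Cosine similarity between two items over users who rated both."""
    pairs = [
        (r[item_i], r[item_j])
        for r in ratings.values()
        if item_i in r and item_j in r
    ]
    if not pairs:
        return 0.0
    num = sum(x * y for x, y in pairs)
    norm_i = math.sqrt(sum(x * x for x, _ in pairs))
    norm_j = math.sqrt(sum(y * y for _, y in pairs))
    den = norm_i * norm_j
    return num / den if den else 0.0


def score_item_based(ratings, user, candidate):
    """Weight the user's own ratings by each rated item's similarity to the candidate."""
    num = den = 0.0
    for item, rating in ratings[user].items():
        sim = item_similarity(ratings, candidate, item)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


print(score_item_based(ratings, "alice", "d"))
```

Item-to-item similarities depend only on the rating data, not on who is asking, so in practice they are often precomputed offline and reused for every user.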

Advantages

*No domain knowledge necessary*

We don’t need domain knowledge because the embeddings are automatically learned.

*Serendipity*

The model can help users discover new interests. In isolation, the ML system may not know the user is interested in a given item, but the model might still recommend it because similar users are interested in that item.

*Great starting point*

To some extent, the system needs only the feedback matrix to train a matrix factorization model. In particular, the system doesn’t need contextual features. In practice, this can be used as one of multiple candidate generators.

Disadvantages

*Cannot handle fresh items*

The prediction of the model for a given (user, item) pair is the dot product of the corresponding embeddings. So, if an item is not seen during training, the system can’t create an embedding for it and can’t query the model with this item. This issue is often called the cold-start problem. However, the following techniques can address the cold-start problem to some extent:

Projection in WALS. Given a new item $i_0$ not seen in training, if the system has a few interactions with users, then the system can easily compute an embedding $v_{i_0}$ for this item without having to retrain the whole model. The system simply has to solve the following equation or the weighted version:

$$\min_{v_{i_0} \in \mathbb{R}^d} \|A_{i_0} - U v_{i_0}\|$$

The preceding equation corresponds to one iteration in WALS: the user embeddings are kept fixed, and the system solves for the embedding of item $i_0$. The same can be done for a new user.
Heuristics to generate embeddings of fresh items. If the system does not have interactions, the system can approximate its embedding by averaging the embeddings of items from the same category, from the same uploader (in YouTube), and so on.
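Both cold-start techniques can be sketched in a few lines. The projection below solves the unweighted least-squares problem through the normal equations $(U^\top U)\,v = U^\top a$ with the user embeddings held fixed, and it assumes the new item's feedback column `a` has an entry for every user. The averaging helper implements the category heuristic. The helper names and the toy numbers are assumptions for illustration.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; more interactions are needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def project_new_item(U, a):
    """Find v minimizing ||a - U v|| with the user embeddings U kept fixed."""
    d = len(U[0])
    gram = [[sum(u[p] * u[q] for u in U) for q in range(d)] for p in range(d)]
    rhs = [sum(u[p] * a_i for u, a_i in zip(U, a)) for p in range(d)]
    return solve_linear(gram, rhs)


def average_embedding(embeddings):
    """Heuristic for items with no interactions: average related embeddings."""
    d = len(embeddings[0])
    return [sum(e[k] for e in embeddings) / len(embeddings) for k in range(d)]


# Hypothetical user embeddings (3 users, d = 2) and feedback for a new item.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_new = [0.5, -0.25, 0.25]
print(project_new_item(U, a_new))
print(average_embedding([[1.0, 0.0], [0.0, 1.0]]))
```

Only a small $d \times d$ system is solved, which is why this is much cheaper than retraining the whole model.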

*Hard to include side features for query/item*

Side features are any features beyond the query or item ID. For movie recommendations, the side features might include country or age. Including available side features improves the quality of the model. Although it may not be easy to include side features in WALS, a generalization of WALS makes this possible.

To generalize WALS, augment the input matrix with features by defining a block matrix $\bar{A}$, where:

Block (0, 0) is the original feedback matrix A.
Block (0, 1) is a multi-hot encoding of the user features.
Block (1, 0) is a multi-hot encoding of the item features.
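The block layout can be sketched as below. Two details are assumptions rather than something stated in these notes: the item features are transposed so that block (1, 0) shares its columns with the items, and block (1, 1) is filled with zeros. The function name and the toy matrices are hypothetical.

```python
def build_augmented_matrix(A, user_features, item_features):
    """Assemble the block matrix from feedback and multi-hot features.

    A:             m x n feedback matrix
    user_features: m x p multi-hot rows, one per user
    item_features: n x q multi-hot rows, one per item
    Returns an (m + q) x (n + p) matrix.
    """
    m, n = len(A), len(A[0])
    p = len(user_features[0])
    q = len(item_features[0])
    # Block row 0: [A | user features]
    top = [A[i] + user_features[i] for i in range(m)]
    # Block row 1: [item features transposed | zeros] (zero block is an assumption)
    bottom = [
        [item_features[j][f] for j in range(n)] + [0] * p
        for f in range(q)
    ]
    return top + bottom


A = [[1, 0], [0, 1]]
user_features = [[1, 0, 1], [0, 1, 0]]
item_features = [[1], [0]]
A_bar = build_augmented_matrix(A, user_features, item_features)
for row in A_bar:
    print(row)
```

Factorizing the augmented matrix then yields embeddings for the features alongside the embeddings for users and items.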

Model based recommendation