Table of contents
First, the role of the BPR algorithm
2. Explicit feedback and implicit feedback
First, the role of the BPR algorithm
Sort all the products corresponding to each user according to their preferences. A simpler idea is that the priority of the items that the user has interacted with must be higher than the priority of the items that have been interacted with. This is the core content of the BPR algorithm.
2. Explicit feedback and implicit feedback
Explicit feedback: The user explicitly likes and dislikes items, such as the user's rating.
Implicit feedback: The user has browsed the product, and did not clearly indicate whether he likes or dislikes it. This type can only be regarded as positive feedback, that is, the product he likes. The user's interactive behaviors on products, such as purchase and browsing, are all implicit feedbacks.
1 Features of explicit and implicit feedback
implicit feedback | explicit feedback | |
---|---|---|
Accuracy | end | high |
plentiful | high | end |
retrieve data | easy | difficulty |
data noise | harder to identify | easier to identify |
context sensitive | Yes | Yes |
Ability to express user preferences | Only positive samples | Include positive and negative samples |
Evaluation Comparison Criteria | Relative comparison | absolute comparison |
1. Explicit feedback
Table 2-2 Evaluation method of explicit feedback model
Model decision recommended | Model decision not recommended | |
Recommended (retrieved) in the test set | True positives(TP positive class is judged as positive class) (what you found is also wanted) | False positives (FP negative class is judged as positive class) (found but useless) |
Not recommended in the test set (not retrieved) | False negatives (FN positive class is judged as negative class) (not found, but the actual desired | True negatives (TN negative class is judged as negative class) (it is useless if it is not found) |
Let's assume a specific scenario as an example.
Suppose a class has 80 boys and 20 girls, for a total of 100 students. The goal is to find all girls.
Someone picked out 50 people, 20 of them were girls, and also picked 30 boys as girls by mistake.
You as an evaluator need to evaluate his work.
We need to define four classifications of TP, FN, FP, and TN first.
According to the previous example, we need to find all girls from the people in a class. If this task is regarded as a classifier, then girls are what we need, but boys are not, so we call girls "positive" and boys for the "negative class".
Relevant, positive class | irrelevant (NonRelevant), negative class | |
Retrieved | TP=20 true positives(TP positive class is judged as positive class, in the example is the correct judgement "this is a girl") | FP=30 false positives(FP negative class is judged as positive class, "preserving false", in the example, it is clearly a boy but judged as a girl) |
Not Retrieved | FN=0 false negatives(FN positive class is judged as negative class, "go to true", in the example, it is clearly a girl, but this buddy judges it as a boy - this is the mistake made by Liang Shanbo) | TN=50 true negatives(TN negative class is judged as negative class, that is, a boy is judged as a boy) |
Through this table, we can easily get the values of these categories in the example: TP=20,FP=30,FN=0,TN=50.
We can evaluate the quality of the model classification by calculating the accuracy of the model classification, and the recall rate by calculating the value of F-Measure.
The following combines the values of these categories in the example: TP=20,FP=30,FN=0,TN=50 to introduce the concept and calculation method of precision rate, recall rate, and F-measure.
2. Implicit feedback
Implicit feedback data has many disadvantages, such as unclear and noisy data, but because of its widespread existence, we can only use it sometimes, so we still need to study it in detail, through reasonable noise reduction and data pruning of implicit feedback. Improve the feasibility of item recommendations.
Explicit feedback data can show the user's preference value for a certain item, such as the scoring mechanism, the difference between 8 points and 10 points, while the implicit feedback data cannot measure the preference value. It can only be considered that the more users browse the same content, the more It is possible to like this content, that is, the greater the confidence.
How Implicit Feedback Data Is Processed
In the case of using implicit feedback, we will find that the observed data are all positive (observed because the user has interacted with the item), and the data that is not observed (that is, the user has not yet produced a behavior) item), divided into two cases, one is that the user does not have a week for the item (negative class), the other is a missing value (that is, the item that the user may act on in the future), while in the traditional personalized recommendation Usually, the user u's personalization score for item i is calculated and then sorted according to the personalization score. The way it processes the data is to treat all observed implicit feedbacks as the positive class and the rest as the negative class.
In the case where the negative class is filled with zeros, our optimization goal becomes that the data observed during prediction is predicted to be 1, and the rest are 0, so the problem is that we want the model to predict the missing value in the future, They are considered to be negative data during training. Therefore, if the model is trained well enough, the most common result is that the final predicted values of these unobserved samples are all 0.
The comparison with the BPR algorithm is based on the implicit feedback data, and the item s are ranked by the maximum posterior probability obtained by Bayesian analysis of the problem, and then the recommendation is generated.
3. BPR algorithm
1. Concept
BPR (Bayesian Personalized Ranking), the Chinese name is Bayesian Personalized Ranking, is a commonly used recommendation algorithm in current recommendation systems. Different from other methods based on user rating matrix, BPR mainly uses users' implicit feedback (such as clicks, favorites, adding to shopping carts, etc.) sort, and then generate recommendations.
2. Relevant definitions
U represents the set of all user s; I represents the set of all item s;
S stands for implicit feedback from all users
As shown in the figure below, as long as the user has acted on an item (such as clicking, favorite, adding to the shopping cart, etc.), it is marked as +, and all + samples constitute S. Those are the observed data (that is, the data for which the user did not act) marked as ?
3. Modeling ideas
In the BPR algorithm, we mark the item corresponding to any user u. If user u acts on i when there are items i and j at the same time, then we get a triple <u,i,j> , which means that for user u, i is ranked higher than j. However, if a user has acted on two items at the same time, or has not acted at the same time, it cannot constitute a preference pair. If we have m groups of such feedback for user u, then we can get m groups of training samples corresponding to user u. It can be seen that BPR adopts the pairwise method.
Then note that for each triplet sample <u,i,j>, i must be an item that has acted, and j must be an item that has not acted, so only the data of + decomposed on the right side of the figure below is included , excluding - data.
Fourth, BPR optimization
5. Algorithm process
The following is a brief summary of the algorithm training process of BPR:
Input: training set D triplet, gradient step α, regularization parameter λ, factorization matrix dimension k.
Output: model parameters, matrices W,H
1. Randomly initialize the matrices W,H
2. Iteratively update the model parameters:
3. If W,H converge, the algorithm ends, output W,H, otherwise go back to step 2.
When we get W and H, we can calculate the ranking score of any product corresponding to each user u:Finally, some commodities with the highest ranking score are selected for output.
6. End
BPR is a sorting algorithm based on matrix decomposition, but compared with algorithms such as funkSVD, it is not a global scoring optimization, but a sorting optimization based on each user's own product preferences. Therefore, the idea of iterative optimization is completely different. At the same time, the requirements for the training set are also different. funkSVD only needs the two-tuples of user items corresponding to the rating data as the training set, while BPR requires the user's preference sorting triples for the training set.
Seven, code implementation
main function code
# !/usr/bin/env python # @Time:2021/4/6 19:21 # @Author: Huayang # @File:Basical BPR.py # @Software:PyCharm import random from collections import defaultdict import numpy as np from sklearn.metrics import roc_auc_score import scores ''' Function description:BPR class (contains the various parameters required) Parameters: none Returns: none ''' class BPR: #User number user_count = 943 #number of items item_count = 1682 #k topics, k number latent_factors = 20 #step size α lr = 0.01 #parameter λ reg = 0.01 #training times train_count = 10000 #Training set train_data_path = 'train.txt' #test set test_data_path = 'test.txt' #U-I size size_u_i = user_count * item_count # Randomly set U, V matrices (i.e. Wuk and Hik in the formula) matrices U = np.random.rand(user_count, latent_factors) * 0.01 #size doesn't matter V = np.random.rand(item_count, latent_factors) * 0.01 biasV = np.random.rand(item_count) * 0.01 #Generate an all-zero matrix of the size of the number of users * the number of items test_data = np.zeros((user_count, item_count)) print("test_data_type",type(test_data)) #Generate a one-dimensional all-zero matrix test = np.zeros(size_u_i) #Regenerates a one-dimensional all-zero matrix predict_ = np.zeros(size_u_i) #Get U-I data correspondence ''' Function description: Get through the file path U-I data Paramaters: Enter the file path to read path Returns: output a dictionary user_ratings,contains users-item key-value pair ''' def load_data(self, path): user_ratings = defaultdict(set) with open(path, 'r') as f: for line in f.readlines(): u, i = line.split(" ") u = int(u) i = int(i) user_ratings[u].add(i) return user_ratings ''' Function description: Get the test set data through the file path Paramaters: Test set file path path Returns: output a numpy.ndarray document( n dimensional array) test_data,Among them, the data containing feedback information is set to 1 ''' #Get the score matrix for the test set def load_test_data(self, path): file = open(path, 'r') for line in file: line = line.split(' ') user = int(line[0]) item = int(line[1]) self.test_data[user - 1][item - 1] = 1 ''' Function description: Process the data dictionary of the training set, and update the decomposed two matrices by randomly selecting, (user, interaction, for interaction) triplet Parameters: Input dictionary of training set user items to process Returns: Update the two decomposed matrices and the bias matrix separately ''' def train(self, user_ratings_train): for user in range(self.user_count): # Get a random user u = random.randint(1, self.user_count) #find a user # The training set and test set are not all the same, for example, train has 948, and test has a maximum of 943 if u not in user_ratings_train.keys(): continue # Randomly select 1 Item from the user's U-I i = random.sample(user_ratings_train[u], 1)[0] #Found an item, was rated # Randomly select an item that user u has not rated j = random.randint(1, self.item_count) while j in user_ratings_train[u]: j = random.randint(1, self.item_count) #Found an item, not rated #Form a triple (uesr,item_have_score,item_no_score) # Values in python start from 0 u = u - 1 i = i - 1 j = j - 1 #BPR r_ui = np.dot(self.U[u], self.V[i].T) + self.biasV[i] r_uj = np.dot(self.U[u], self.V[j].T) + self.biasV[j] r_uij = r_ui - r_uj loss_func = -1.0 / (1 + np.exp(r_uij)) # update 2 matrices self.U[u] += -self.lr * (loss_func * (self.V[i] - self.V[j]) + self.reg * self.U[u]) self.V[i] += -self.lr * (loss_func * self.U[u] + self.reg * self.V[i]) self.V[j] += -self.lr * (loss_func * (-self.U[u]) + self.reg * self.V[j]) # Update bias term self.biasV[i] += -self.lr * (loss_func + self.reg * self.biasV[i]) self.biasV[j] += -self.lr * (-loss_func + self.reg * self.biasV[j]) ''' Function description: Obtain the prediction matrix by inputting the decomposed user item matrix predict Parameters: User item matrix after input Returns: Output the multiplied prediction matrix, which is the scoring matrix we want ''' def predict(self, user, item): predict = np.mat(user) * np.mat(item.T) return predict #main function def main(self): #Get {1:{2,5,1,2}....} data of U-I user_ratings_train = self.load_data(self.train_data_path) #Get the score matrix for the test set self.load_test_data(self.test_data_path) #Flatten the test_data matrix for u in range(self.user_count): for item in range(self.item_count): if int(self.test_data[u][item]) == 1: self.test[u * self.item_count + item] = 1 else: self.test[u * self.item_count + item] = 0 #train for i in range(self.train_count): self.train(user_ratings_train) #10,000 training runs are completed predict_matrix = self.predict(self.U, self.V) #Inner product of matrices that will be trained # predict self.predict_ = predict_matrix.getA().reshape(-1) #.getA() converts its own matrix variable to a variable of type ndarray print("predict_new",self.predict_) self.predict_ = pre_handel(user_ratings_train, self.predict_, self.item_count) auc_score = roc_auc_score(self.test, self.predict_) print('AUC:', auc_score) # Top-K evaluation scores.topK_scores(self.test, self.predict_, 5, self.user_count, self.item_count) ''' Function description: Correct the result, that is, the user items that the user has already interacted with are eliminated, and only the interaction data that did not generate user items are retained. Paramaters: Enter the user item dictionary set, as well as the one-dimensional prediction matrix, the number of items Returns: Output a one-dimensional prediction matrix of the revised prediction scores ''' def pre_handel(set, predict, item_count): # Ensure the recommendation cannot be positive items in the training set. for u in set.keys(): for j in set[u]: predict[(u - 1) * item_count + j - 1] = 0 return predict if __name__ == '__main__': #call the main function of the class bpr = BPR() bpr.main()
Code for calculating model metrics
# -*- coding: utf-8 -*- """scores.ipynb Automatically generated by Colaboratory. Original file is located at https://colab.research.google.com/drive/17qoo1U4Iw58GRDDIyCaB2GmbRUg1gTPd """ import heapq import numpy as np import math #Calculate the item top_K score def topK_scores(test, predict, topk, user_count, item_count): PrecisionSum = np.zeros(topk+1) RecallSum = np.zeros(topk+1) F1Sum = np.zeros(topk+1) NDCGSum = np.zeros(topk+1) OneCallSum = np.zeros(topk+1) DCGbest = np.zeros(topk+1) MRRSum = 0 MAPSum = 0 total_test_data_count = 0 for k in range(1, topk+1): DCGbest[k] = DCGbest[k - 1] DCGbest[k] += 1.0 / math.log(k + 1) for i in range(user_count): user_test = [] user_predict = [] test_data_size = 0 for j in range(item_count): if test[i * item_count + j] == 1.0: test_data_size += 1 user_test.append(test[i * item_count + j]) user_predict.append(predict[i * item_count + j]) if test_data_size == 0: continue else: total_test_data_count += 1 predict_max_num_index_list = map(user_predict.index, heapq.nlargest(topk, user_predict)) predict_max_num_index_list = list(predict_max_num_index_list) hit_sum = 0 DCG = np.zeros(topk + 1) DCGbest2 = np.zeros(topk + 1) for k in range(1, topk + 1): DCG[k] = DCG[k - 1] item_id = predict_max_num_index_list[k - 1] if user_test[item_id] == 1: hit_sum += 1 DCG[k] += 1 / math.log(k + 1) # precision, recall, F1, 1-call prec = float(hit_sum / k) rec = float(hit_sum / test_data_size) f1 = 0.0 if prec + rec > 0: f1 = 2 * prec * rec / (prec + rec) PrecisionSum[k] += float(prec) RecallSum[k] += float(rec) F1Sum[k] += float(f1) if test_data_size >= k: DCGbest2[k] = DCGbest[k] else: DCGbest2[k] = DCGbest2[k-1] NDCGSum[k] += DCG[k] / DCGbest2[k] if hit_sum > 0: OneCallSum[k] += 1 else: OneCallSum[k] += 0 # MRR p = 1 for mrr_iter in predict_max_num_index_list: if user_test[mrr_iter] == 1: break p += 1 MRRSum += 1 / float(p) # MAP p = 1 AP = 0.0 hit_before = 0 for mrr_iter in predict_max_num_index_list: if user_test[mrr_iter] == 1: AP += 1 / float(p) * (hit_before + 1) hit_before += 1 p += 1 MAPSum += AP / test_data_size print('MAP:', MAPSum / total_test_data_count) print('MRR:', MRRSum / total_test_data_count) print('Prec@5:', PrecisionSum[4] / total_test_data_count) print('Rec@5:', RecallSum[4] / total_test_data_count) print('F1@5:', F1Sum[4] / total_test_data_count) print('NDCG@5:', NDCGSum[4] / total_test_data_count) print('1-call@5:', OneCallSum[4] / total_test_data_count) return
Original link: https://blog.csdn.net/weixin_46099084/article/details/109011670