User-based/Item-based Implementation
User-based: compute the user-user similarity matrix using cosine similarity,
then use that matrix to predict ratings as a similarity-weighted sum of the most similar users' rating vectors.
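In symbols, the two steps can be written roughly as below. The notation is introduced here for illustration only: $I_u$ is the set of items rated by user $u$, $r_{ui}$ is the rating of user $u$ for item $i$, and $N_k(u)$ is the set of the $k$ users most similar to $u$. The normalisation by the total similarity of the neighbours follows the code in this section.

```latex
\[
\mathrm{sim}(u,v) =
\frac{\sum_{i \in I_u \cap I_v} r_{ui}\, r_{vi}}
     {\sqrt{\sum_{i \in I_u} r_{ui}^2}\;\sqrt{\sum_{i \in I_v} r_{vi}^2}}
\]
\[
\hat{r}_{ui} =
\frac{\sum_{v \in N_k(u) :\, i \in I_v} \mathrm{sim}(u,v)\, r_{vi}}
     {\sum_{v \in N_k(u)} \mathrm{sim}(u,v)}
\]
```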
import numpy as np
import pandas as pd


def userCF(users, items):
    """Return the user-user cosine similarity matrix.

    users: {user: {item: rating}}; items: dict keyed by item (only the keys are used).
    """
    user_ids = list(users.keys())
    num_user = len(user_ids)
    sim_matrix_user = pd.DataFrame(np.zeros((num_user, num_user)), index=user_ids, columns=user_ids)
    for i in range(num_user):
        for j in range(i, num_user):
            ui, uj = user_ids[i], user_ids[j]
            dot_prod = 0.0
            norm_ui, norm_uj = 0.0, 0.0
            for item in items.keys():
                # cosine similarity; a missing rating counts as 0
                if item in users[ui]:
                    norm_ui += users[ui][item] ** 2
                if item in users[uj]:
                    norm_uj += users[uj][item] ** 2
                if item in users[ui] and item in users[uj]:
                    dot_prod += users[ui][item] * users[uj][item]
            denom = np.sqrt(norm_ui) * np.sqrt(norm_uj)
            # guard against users without any rating (zero norm)
            similarity = dot_prod / denom if denom > 0 else 0.0
            # .loc avoids pandas chained assignment
            sim_matrix_user.loc[ui, uj] = similarity
            sim_matrix_user.loc[uj, ui] = similarity
    return sim_matrix_user
def user_Recommend(user, sim_matrix_user, users, items, k):
    # select the top-k most similar users, excluding the target user itself
    similar_users = sim_matrix_user[user].drop(user).sort_values(ascending=False)
    topk_users = similar_users.iloc[:k]
    # item x user rating table; missing ratings are NaN and are skipped by sum()
    user_rating = pd.DataFrame(users)
    # weight each neighbour's rating column by its similarity to the target user
    # (replaces DataFrame.append, which was removed in pandas 2.0)
    weighted = user_rating[topk_users.index].mul(topk_users, axis=1)
    # predicted rating per item: weighted sum divided by the total similarity
    rating_df = (weighted.sum(axis=1) / topk_users.sum()).sort_values(ascending=False)
    return topk_users, rating_df
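For comparison, here is a minimal vectorized sketch of the same idea. The toy `users` data, the `recommend` helper, and the step that filters out items the user has already rated are assumptions added for illustration; they are not part of the code above.

```python
import numpy as np
import pandas as pd

# Toy ratings in the {user: {item: rating}} layout (made-up values).
users = {
    "A": {"i1": 5, "i2": 3},
    "B": {"i1": 4, "i2": 2, "i3": 5},
    "C": {"i2": 4, "i3": 1},
}

# Rows are users, columns are items; an unrated item counts as 0.
R = pd.DataFrame(users).T.fillna(0.0)

# Cosine similarity for all user pairs at once.
norms = np.linalg.norm(R.values, axis=1)
sim = pd.DataFrame(R.values @ R.values.T / np.outer(norms, norms),
                   index=R.index, columns=R.index)


def recommend(user, k):
    # k most similar users, excluding the user itself
    topk = sim[user].drop(user).nlargest(k)
    # similarity-weighted sum of the neighbours' ratings, normalised by total similarity
    scores = R.loc[topk.index].mul(topk, axis=0).sum(axis=0) / topk.sum()
    # keep only items the user has not rated yet (an added assumption)
    return scores[R.loc[user] == 0].sort_values(ascending=False)


print(recommend("A", k=2))
```

The matrix form trades the triple loop for one matrix product, at the cost of holding a dense user x item matrix in memory.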
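Another implementation:
Compute user-user similarity
Consider a user who has interacted with a very large number of items, perhaps because they trade in these goods or for some similar reason, and not because the items are actually similar to one another. To limit the influence of such users, we can likewise introduce Inverse User Frequency to weight the item similarity. The similarity used here is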
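The exact formula is not shown in these notes. A common formulation, assumed here, weights each co-occurrence contributed by user u by 1 / log(1 + N(u)), where N(u) is the number of items u interacted with, and then normalises by item popularity. The `user_items` data and the function name are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical implicit-feedback data: user -> set of items interacted with.
user_items = {
    "u1": {"a", "b", "c", "d"},  # heavy user who touches almost everything
    "u2": {"a", "b"},
    "u3": {"b", "c"},
}


def item_similarity_iuf(user_items):
    co = defaultdict(lambda: defaultdict(float))  # weighted co-occurrence counts
    count = defaultdict(int)                      # number of users per item
    for items in user_items.values():
        if not items:
            continue  # a user without interactions contributes nothing
        # assumed IUF weight: the more items a user touched, the less each pair counts
        weight = 1.0 / math.log(1 + len(items))
        for i in items:
            count[i] += 1
            for j in items:
                if i != j:
                    co[i][j] += weight
    # cosine-style normalisation by the popularity of both items
    return {i: {j: w / math.sqrt(count[i] * count[j]) for j, w in row.items()}
            for i, row in co.items()}


sim = item_similarity_iuf(user_items)
print(sim["a"])
```

With this weighting, a pair of items that co-occurs only in the heavy user's history scores lower than it would under plain co-occurrence counting.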

Compute the top-K ranking
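A minimal sketch of the ranking step, assuming a user-user similarity has already been computed and is stored as a nested dict. All names and values below (`user_sim`, `recommend_topk`, `k_neighbours`, `top_n`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs: interactions and a precomputed user-user similarity.
user_items = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"b", "c", "d"}}
user_sim = {
    "u1": {"u2": 0.5, "u3": 0.4},
    "u2": {"u1": 0.5, "u3": 0.3},
    "u3": {"u1": 0.4, "u2": 0.3},
}


def recommend_topk(user, user_items, user_sim, k_neighbours, top_n):
    seen = user_items[user]
    scores = defaultdict(float)
    # the k most similar users
    neighbours = sorted(user_sim[user].items(), key=lambda kv: kv[1],
                        reverse=True)[:k_neighbours]
    for other, s in neighbours:
        for item in user_items[other]:
            if item not in seen:
                # each neighbour votes for its items with its similarity
                scores[item] += s
    # highest accumulated score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


print(recommend_topk("u1", user_items, user_sim, k_neighbours=2, top_n=2))
```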
Item-based: compute the item-item similarity matrix using cosine similarity,
then use that matrix to compute a weighted sum over item vectors, which gives each user's predicted rating for an item.
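A minimal sketch of the item-based variant, under the same assumptions as the user-based code: ratings stored as `{user: {item: rating}}` and a missing rating treated as 0. The toy data and the `predict` helper are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy ratings in the {user: {item: rating}} layout (made-up values).
users = {
    "A": {"i1": 5, "i2": 3},
    "B": {"i1": 4, "i2": 2, "i3": 5},
    "C": {"i2": 4, "i3": 1},
}

# Rows are items, columns are users; an unrated item counts as 0.
R = pd.DataFrame(users).fillna(0.0)

# Cosine similarity between item rating vectors.
norms = np.linalg.norm(R.values, axis=1)
item_sim = pd.DataFrame(R.values @ R.values.T / np.outer(norms, norms),
                        index=R.index, columns=R.index)


def predict(user, item, k):
    # items the user has already rated, excluding the target item
    mask = (R[user].values > 0) & (R.index.values != item)
    rated = R.index[mask]
    # the k rated items most similar to the target item
    topk = item_sim.loc[item, rated].nlargest(k)
    # similarity-weighted average of the user's own ratings on those items
    return float((topk * R.loc[topk.index, user]).sum() / topk.sum())


print(predict("A", "i3", k=2))
```

Unlike the user-based version, the weights here come from item-item similarity and the ratings being averaged are the target user's own.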
Another implementation:
Compute item-item similarity
Compute the item ranking and recall the top K items
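The two steps above can be sketched for implicit feedback as follows. This is an assumed co-occurrence formulation, count(i, j) / sqrt(N(i) * N(j)), and not necessarily the one these notes had in mind; `user_items`, `item_similarity` and `recall_topk` are hypothetical names.

```python
import math
from collections import defaultdict

# Hypothetical implicit-feedback data: user -> set of items interacted with.
user_items = {"u1": {"a", "b", "c"}, "u2": {"a", "b"}, "u3": {"b", "d"}}


def item_similarity(user_items):
    co = defaultdict(lambda: defaultdict(int))  # co-occurrence counts
    count = defaultdict(int)                    # number of users per item
    for items in user_items.values():
        for i in items:
            count[i] += 1
            for j in items:
                if i != j:
                    co[i][j] += 1
    # normalise by the popularity of both items
    return {i: {j: c / math.sqrt(count[i] * count[j]) for j, c in row.items()}
            for i, row in co.items()}


def recall_topk(user, user_items, sim, k, top_n):
    seen = user_items[user]
    scores = defaultdict(float)
    for i in seen:
        # the k items most similar to each item the user interacted with
        nearest = sorted(sim.get(i, {}).items(), key=lambda kv: kv[1],
                         reverse=True)[:k]
        for j, s in nearest:
            if j not in seen:
                scores[j] += s
    # recall the top_n unseen items by accumulated similarity
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


sim = item_similarity(user_items)
print(recall_topk("u2", user_items, sim, k=3, top_n=2))
```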
Reference
https://blog.csdn.net/sinat_22594309/article/details/86420207