MMSSL

Embedding question about text_feat.npy and image_feat.npy

Open WhuanY opened this issue 9 months ago • 0 comments

Thanks for your wonderful contribution of embeddings for the Netflix item data.

In Python, when I load your Netflix data, text_feat.npy and image_feat.npy both load as NumPy ndarrays. To be more exact:

import numpy as np

text_feat = np.load('text_feat.npy')
image_feat = np.load('image_feat.npy')

print(text_feat.shape)   # -> (17366, 768)
print(image_feat.shape)  # -> (17366, 512)

May I ask how the rows of text_feat and image_feat are ordered? Is it by item ID:

item 1, [embedding 1]; item 2, [embedding 2]; ...

or in the order the items appear in item_attribute.csv:

item 9733, [embedding 9733]; item 14147, [embedding 14147]; ...

Thanks! I am building an embedding-based item-to-item (i2i) similarity recommender.
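
A minimal sketch of the kind of embedding-based i2i lookup mentioned here, assuming cosine similarity (the issue does not name a metric). The helper name `top_k_similar` and the small random matrix standing in for `np.load('text_feat.npy')` are hypothetical. The function returns row indices only; translating a row index into an item ID depends on the row ordering asked about in this issue.

```python
import numpy as np

def top_k_similar(feat, row, k=5):
    """Return indices of the k rows most cosine-similar to `row` (hypothetical helper)."""
    # L2-normalize every row so a dot product equals cosine similarity.
    norms = np.linalg.norm(feat, axis=1, keepdims=True)
    unit = feat / np.clip(norms, 1e-12, None)
    sims = unit @ unit[row]
    sims[row] = -np.inf  # exclude the query row itself
    return np.argsort(-sims)[:k]

# Stand-in for the real feature matrix (assumption: any 2-D float array works).
rng = np.random.default_rng(0)
feat = rng.normal(size=(6, 4))
feat[3] = 2.0 * feat[0]  # row 3 points in the same direction as row 0

neighbors = top_k_similar(feat, row=0, k=2)
print(neighbors)  # row indices, not item IDs
```

With the real data, `feat` would be `text_feat` or `image_feat`, and each returned index would still have to be mapped to an item ID using whichever ordering the files follow.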

WhuanY · May 27 '24 01:05