CollaborativeVAE

Preprocess Steps

Open rahimentezari opened this issue 7 years ago • 2 comments

I am using your code on my own dataset. Each article is now represented as a bag-of-words histogram vector. What does the next step mean: "normalized over the maximum occurrences of each word in all articles"?

rahimentezari Jan 27 '18 11:01

Say there are two documents in the dataset, with BOW representations [[0, 10, 1, 5], [2, 3, 6, 10]]. Normalize over the maximum occurrences of each word across all documents (here the per-word maxima are [2, 10, 6, 10]), and the two vectors become [[0/2, 10/10, 1/6, 5/10], [2/2, 3/10, 6/6, 10/10]]. This just scales each value into [0, 1].
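
For reference, a minimal sketch of that normalization, assuming the BOW counts are in a NumPy array with one row per document and one column per word. The function name `normalize_bow` and the handling of all-zero columns are assumptions for illustration, not code taken from the repository:

```python
import numpy as np

def normalize_bow(bow):
    """Divide each word's count by that word's maximum count over all documents."""
    bow = np.asarray(bow, dtype=float)
    # Per-word (column-wise) maximum across documents.
    col_max = bow.max(axis=0)
    # Assumption: a word that never occurs keeps value 0 instead of dividing by zero.
    col_max[col_max == 0] = 1.0
    return bow / col_max

bow = [[0, 10, 1, 5],
       [2, 3, 6, 10]]
normalized = normalize_bow(bow)
print(normalized)
```

With the two example documents, every entry ends up in [0, 1], and each word has the value 1 in the document where it occurs most often.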

eelxpeng Jan 28 '18 09:01

Thanks for your swift answer. So the rows of mul_nor.m are the items, and the columns correspond to the BOW with values in [0, 1]? And what is items.dat?

rahimentezari Jan 28 '18 12:01