
deep matrix factorization question

jmschrei opened this issue on Feb 23, 2017 · 1 comment

The code currently looks like this:

import mxnet as mx

# note: max_user and max_item are globals defined elsewhere in the notebook
def get_one_layer_mlp(hidden, k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user latent features
    user = mx.symbol.Embedding(data = user, input_dim = max_user, output_dim = k)
    user = mx.symbol.Activation(data = user, act_type="relu")
    user = mx.symbol.FullyConnected(data = user, num_hidden = hidden)
    # item latent features
    item = mx.symbol.Embedding(data = item, input_dim = max_item, output_dim = k)
    item = mx.symbol.Activation(data = item, act_type="relu")
    item = mx.symbol.FullyConnected(data = item, num_hidden = hidden)
    # predict by the inner product
    pred = user * item
    pred = mx.symbol.sum_axis(data = pred, axis = 1)
    pred = mx.symbol.Flatten(data = pred)
    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data = pred, label = score)
    return pred
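
For reference, a symbol like this is typically bound and trained with the Module API, roughly as in the sketch below (hypothetical: the sizes and the synthetic data are made up, and the notebook's actual training setup may differ):

import numpy as np
import mxnet as mx

max_user, max_item = 1000, 500   # assumed globals used by get_one_layer_mlp

# toy (user, item, score) triples
n = 256
data = {'user': np.random.randint(0, max_user, n),
        'item': np.random.randint(0, max_item, n)}
label = {'score': np.random.uniform(1, 5, n)}
train_iter = mx.io.NDArrayIter(data=data, label=label, batch_size=64)

net = get_one_layer_mlp(hidden=64, k=64)
mod = mx.mod.Module(symbol=net, data_names=['user', 'item'], label_names=['score'])
mod.fit(train_iter, num_epoch=2, optimizer='adam', eval_metric='rmse')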

My understanding is that the embedding layer on its own should be able to learn anything that an embedding with a single dense layer on top of it could learn, since each row of the embedding table is a free parameter. I had thought a deep matrix factorization would look something more like this:

def get_one_layer_mlp(hidden, k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user latent features
    user = mx.symbol.Embedding(data = user, input_dim = max_user, output_dim = k)

    # item latent features
    item = mx.symbol.Embedding(data = item, input_dim = max_item, output_dim = k)

    # predict with an MLP on the concatenated latent features
    pred = mx.symbol.Concat(user, item, dim=1)
    pred = mx.symbol.FullyConnected(data = pred, num_hidden = hidden)
    pred = mx.symbol.Activation(data = pred, act_type="relu")
    pred = mx.symbol.FullyConnected(data = pred, num_hidden = 1)

    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data = pred, label = score)
    return pred

Basically, the hidden layers should operate on a concatenation of the two latent vectors, rather than sitting separately on top of each embedding.
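
To make the expressivity point concrete, here is a minimal numpy sketch (names and sizes are made up for illustration) showing that, once trained, an embedding followed by ReLU and a dense layer collapses into a single equivalent embedding table, because every row of the table is a free parameter anyway:

import numpy as np

rng = np.random.default_rng(0)
max_user, k, hidden = 10, 4, 3

E = rng.normal(size=(max_user, k))   # embedding table
W = rng.normal(size=(k, hidden))     # dense layer weights
b = rng.normal(size=hidden)          # dense layer bias

# embedding -> relu -> dense, as in the per-user branch above
def deep_lookup(u):
    return np.maximum(E[u], 0) @ W + b

# fold the relu and dense layer into a new embedding table
E_folded = np.maximum(E, 0) @ W + b

u = np.array([1, 5, 7])
assert np.allclose(deep_lookup(u), E_folded[u])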

jmschrei · Feb 23 '17 21:02

The second one also makes sense, but you can think of the first one as a special case of the second: the former applies a FullyConnected layer to user and item separately, which amounts to a FullyConnected layer on the concatenation with a block-structured weight matrix.
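
To spell that out, here is a toy numpy sketch (shapes made up, biases omitted): applying separate FullyConnected layers to the user and item vectors gives the same result as one FullyConnected layer on their concatenation whose weight matrix is block-diagonal, so the second architecture generalizes that part of the first:

import numpy as np

rng = np.random.default_rng(0)
k, hidden = 4, 3
u = rng.normal(size=k)   # user latent vector
i = rng.normal(size=k)   # item latent vector

W_u = rng.normal(size=(hidden, k))
W_i = rng.normal(size=(hidden, k))

# separate fully connected layers, one per branch
separate = np.concatenate([W_u @ u, W_i @ i])

# one fully connected layer on the concatenation, block-diagonal weights
W_block = np.block([[W_u, np.zeros((hidden, k))],
                    [np.zeros((hidden, k)), W_i]])
joint = W_block @ np.concatenate([u, i])

assert np.allclose(separate, joint)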

mli · Mar 03 '17 19:03