
problem

Open zkLyons opened this issue 11 months ago • 1 comment

class NeuMF(nn.Module):
    def __init__(self, args, num_users, num_items):
        ...

    def init_weight(self):
        ...

    def forward(self, user_indices, item_indices):
        user_embedding_mlp = self.embedding_user_mlp(user_indices)
        item_embedding_mlp = self.embedding_item_mlp(item_indices)

        user_embedding_mf = self.embedding_user_mf(user_indices)
        item_embedding_mf = self.embedding_item_mf(item_indices)

        mlp_vector = torch.cat([user_embedding_mlp, item_embedding_mlp], dim=-1)  # the concat latent vector
        mf_vector = torch.mul(user_embedding_mf, item_embedding_mf)

        for idx, _ in enumerate(range(len(self.fc_layers))):
            mlp_vector = self.fc_layers[idx](mlp_vector)

        vector = torch.cat([mlp_vector, mf_vector], dim=-1)
        logits = self.affine_output(vector)
        rating = self.logistic(logits)
        return rating.squeeze()

There is a small problem in the code. In models.py, in the NeuMF model, the vector is fed through the MLP layers but never passed through a ReLU. This line should probably be added: mlp_vector = torch.nn.ReLU()(mlp_vector)
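For example, the loop in forward could be patched like this (a minimal sketch, assuming a ReLU should follow every linear layer, as in the original NCF paper):

    for idx, _ in enumerate(range(len(self.fc_layers))):
        mlp_vector = self.fc_layers[idx](mlp_vector)
        mlp_vector = torch.nn.ReLU()(mlp_vector)  # the previously missing non-linearity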

zkLyons • Jan 05 '25 07:01

Hello, @zkLyons I apologize for the previous confusion. I realize now that I was mistaken about the way to implement the MLP part of the NeuMF model. I incorrectly focused on the individual application of layers using nn.ModuleList, which led to a misunderstanding.

The key point I missed is the proper way to handle the sequence of operations within the Multilayer Perceptron (MLP). In NeuMF, it's crucial that the linear transformations (nn.Linear) are immediately followed by the activation functions (nn.ReLU). This ensures the intended non-linear behavior of the MLP.

nn.ModuleList, while useful for managing a list of modules, doesn't enforce this immediate sequential application. It simply stores the modules, and you have to manually apply them in the forward method, which can lead to incorrect behavior, as I explained before with the analogy of separate operations.
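To make that concrete, here is roughly what the nn.ModuleList version demands of forward (a sketch, not the repo's exact code); forgetting the manual activation, as in the current models.py, silently collapses the MLP into a stack of purely linear layers:

    # __init__: nn.ModuleList only stores the layers
    self.fc_layers = nn.ModuleList(
        nn.Linear(in_size, out_size)
        for in_size, out_size in zip(layers[:-1], layers[1:])
    )

    # forward: the activation must be applied by hand after each linear layer
    for layer in self.fc_layers:
        mlp_vector = torch.relu(layer(mlp_vector))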

The correct way to implement this is with nn.Sequential, which applies the operations in order, each linear transformation immediately followed by its corresponding activation function. This creates a proper chain of operations, which is essential for the MLP to learn complex non-linear relationships.

self.fc_layers = nn.Sequential()  # not nn.ModuleList
for in_size, out_size in zip(layers[:-1], layers[1:]):
    self.fc_layers.append(nn.Linear(in_size, out_size))
    self.fc_layers.append(nn.ReLU())
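With fc_layers built this way, the loop in forward reduces to a single call (a minimal sketch):

    # the Sequential runs every Linear + ReLU pair in order
    mlp_vector = self.fc_layers(mlp_vector)

Note that nn.Sequential.append is only available in recent PyTorch (1.13+); on older versions, collecting the modules in a plain list and passing them as nn.Sequential(*modules) gives the same result.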

pyy0715 • Jan 05 '25 11:01