
Can I adapt the code to multi-label classification?

Open xypan1232 opened this issue 5 years ago • 12 comments

Hi,

I have a multi-label classification problem, where one node can have multiple labels. Do I need to change the code for multi-class classification? Thanks.

xypan1232 avatar Sep 23 '18 16:09 xypan1232

Replacing the softmax cross-entropy loss with a sigmoid cross-entropy loss should do the job :)
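The difference between the two losses can be sketched without any framework (an editorial illustration, not code from the repo): the sigmoid cross-entropy treats each label as an independent binary decision, so a node can activate several labels at once. The logits and targets below are made-up numbers.

```python
import math

def sigmoid(x):
    # Logistic function: maps a raw logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_cross_entropy(logits, targets):
    """Multi-label loss: one independent binary cross-entropy per label,
    averaged over labels. targets are 0/1 indicators, not one-hot."""
    total = 0.0
    for x, z in zip(logits, targets):
        p = sigmoid(x)
        total += -(z * math.log(p) + (1 - z) * math.log(1 - p))
    return total / len(logits)

# A node with two active labels out of four -- impossible with softmax,
# which forces the target to be a single class:
logits = [2.0, -1.0, 0.5, -3.0]
targets = [1.0, 0.0, 1.0, 0.0]
loss = sigmoid_cross_entropy(logits, targets)
```

Each label contributes its own binary term, so the loss decreases whenever any individual label's probability moves toward its target, independently of the others.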


tkipf avatar Sep 23 '18 16:09 tkipf

Thanks.

I changed the GCN model to return F.sigmoid(x) and replaced the loss function F.nll_loss with loss_train = F.binary_cross_entropy(output[idx_train], labels[idx_train]). Is my change correct?

xypan1232 avatar Sep 23 '18 16:09 xypan1232

Sounds correct!
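One caveat worth noting (an editorial aside, not from the thread): composing a sigmoid with binary_cross_entropy can overflow for large logits, which is why frameworks offer a fused variant (e.g. PyTorch's F.binary_cross_entropy_with_logits) based on the identity max(x, 0) - x*z + log(1 + exp(-|x|)). A dependency-free sketch of that stable form:

```python
import math

def bce_with_logits(x, z):
    """Numerically stable binary cross-entropy on a raw logit x and target z,
    algebraically equal to -(z*log(sigmoid(x)) + (1-z)*log(1-sigmoid(x)))."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))

def naive_bce(x, z):
    # The straightforward sigmoid-then-log version, which hits log(0)
    # once sigmoid(x) rounds to exactly 0.0 or 1.0 in floating point.
    p = 1.0 / (1.0 + math.exp(-x))
    return -(z * math.log(p) + (1 - z) * math.log(1 - p))

# Both agree on moderate logits; only the stable form survives x = 100,
# where the naive version would evaluate log(1 - 1.0).
moderate = abs(bce_with_logits(2.0, 1.0) - naive_bce(2.0, 1.0))
extreme = bce_with_logits(100.0, 0.0)
```

In PyTorch terms this corresponds to returning raw logits from the model and using F.binary_cross_entropy_with_logits, instead of returning F.sigmoid(x) and using F.binary_cross_entropy.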


tkipf avatar Sep 23 '18 16:09 tkipf

Hello, have you solved the multi-label problem?

Chengmeng94 avatar May 09 '19 20:05 Chengmeng94

I need to use GCN with TensorFlow. Have you tried the multi-label problem with TensorFlow? The changes I made are as follows:

loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=preds, labels=labels) (in metrics.py)
correct_prediction = tf.equal(preds, labels) (in metrics.py)
return tf.nn.sigmoid(self.outputs) (in models.py)

So is it correct?

Chengmeng94 avatar May 10 '19 07:05 Chengmeng94

@Chengmeng94 It seems correct. Make sure to adapt the accuracy function as well to handle a sigmoid output instead of a softmax output. Currently it just takes the max, as you would for a softmax output, so it only evaluates a single label. For sigmoid outputs this is typically done with some threshold.
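The thresholded accuracy described above can be sketched in plain Python (the 0.5 cutoff is an assumption; in practice it can be tuned per label on a validation set):

```python
def multilabel_accuracy(probs, labels, threshold=0.5):
    """Element-wise accuracy for multi-label outputs: threshold each
    sigmoid probability independently, then count how many of the
    N*L individual label decisions match the ground truth."""
    correct = 0
    total = 0
    for row_p, row_y in zip(probs, labels):
        for p, y in zip(row_p, row_y):
            correct += int((p > threshold) == bool(y))
            total += 1
    return correct / total

# Two nodes, three labels each; one decision (0.7 vs 0) is wrong,
# so 5 of the 6 label decisions are correct.
probs = [[0.9, 0.2, 0.7], [0.4, 0.8, 0.1]]
labels = [[1, 0, 0], [0, 1, 0]]
```

Note this counts every label slot equally, so with sparse labels it can look deceptively high; subset accuracy or per-label F1 are common stricter alternatives.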

Baukebrenninkmeijer avatar Jul 31 '19 19:07 Baukebrenninkmeijer

@anjanaskumar27 My response was about a multi-label situation, rather than a multi-class situation. I think the original code already works for multi-class situations.

Baukebrenninkmeijer avatar Mar 30 '20 08:03 Baukebrenninkmeijer

@Baukebrenninkmeijer sorry, my bad. I meant to ask about the procedure for multi-label. The following are my changes: loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=preds, labels=labels) and correct_prediction = tf.equal(preds, labels) in metrics.py, and tf.nn.sigmoid(self.outputs) to predict in models.py, as suggested in https://github.com/tkipf/gcn/issues/119. Can you tell me what is wrong here? Or what needs to be updated in the accuracy?

anjanaskumar27 avatar Mar 30 '20 18:03 anjanaskumar27

@anjanaskumar27 As far as I can see, this should be correct, except maybe for the metric. I'm not an expert on metrics for multi-label classification, but I expect plain accuracy (which is what you get with the tf.equal) is not the most appropriate, although it still gives you some indication.

Baukebrenninkmeijer avatar Mar 31 '20 19:03 Baukebrenninkmeijer

@Chengmeng94 @tkipf @Baukebrenninkmeijer, to allow multi-label multi-class classification:

In addition to @Chengmeng94's changes (loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=preds, labels=labels) in metrics.py, and tf.nn.sigmoid(self.outputs) to predict in models.py),

I added:

logits_prob = tf.nn.sigmoid(preds, name='log_prob')
predictions = tf.cast(logits_prob > 0.5, tf.float32, name='predictions')
correct_prediction = tf.equal(predictions, labels)

Is this correct?

anjanaskumar27 avatar Apr 01 '20 21:04 anjanaskumar27

@anjanaskumar27 TensorFlow is not my strong suit, but a quick Google search suggested the following: use tf.round() to push each prediction to either 0 or 1, and then calculate the metrics on that. This appears to make sense. Let me know if it works out. Please check whether correct_prediction is an NxL matrix, where N is the number of samples and L the number of labels. So in short, I think you should do:

correct_prediction = tf.equal(tf.round(tf.nn.sigmoid(self.outputs)), labels)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
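An editorial note on the snippet above: rounding the sigmoid output is the same decision rule as the earlier > 0.5 threshold, and (since sigmoid(0) = 0.5) it is also equivalent to thresholding the raw logits at zero. A minimal framework-free check of that equivalence:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# round(sigmoid(x)) flips from 0 to 1 exactly where the logit crosses 0,
# so the rounded prediction can equally be computed on the logits directly
# (x = 0 itself is excluded: Python's round(0.5) is a tie-breaking edge case).
for x in [-3.0, -0.4, 0.4, 3.0]:
    assert round(sigmoid(x)) == (1 if x > 0 else 0)
```

This is why either formulation (thresholding probabilities at 0.5 or rounding the sigmoid) should give the same correct_prediction matrix.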

Baukebrenninkmeijer avatar Apr 01 '20 22:04 Baukebrenninkmeijer

@Baukebrenninkmeijer, thank you for the suggestion. I am new to TensorFlow as well, so I find it a little tricky to modify things. For me, the size of correct_prediction is 2x1; I may have to fix this? Additionally, regarding the code in utils.py, line #77, idx_val = range(len(y), len(y)+500): do you have any idea why they add 500 here? My graph, with only 70 nodes in total, seems to fail because of this. Thank you for the help!

anjanaskumar27 avatar Apr 02 '20 00:04 anjanaskumar27