Why does "iface_preds" contain NaN when training dmasif_

Open BingzeWu opened this issue 1 year ago • 7 comments

[Screenshot 2023-10-23 13 39 04] It seems that the training is not stable.
I followed the "benchmark_scripts" to retrain dMaSIF_site_3layer_9A. But when calculating the ROC-AUC, it raised "Problem with computing roc-auc", and I found that "iface_preds" contains NaN. Has anyone had a similar problem?
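For readers hitting the same message, a quick way to confirm the symptom is to count non-finite predictions before the metric is computed. This is only a minimal sketch: it assumes `iface_preds` and the matching labels can be converted to 1-D NumPy arrays, and the helper name `finite_pairs` is hypothetical, not part of dMaSIF.

```python
import numpy as np

def finite_pairs(preds, labels):
    """Drop entries whose prediction is NaN or +/-inf.

    Hypothetical diagnostic helper, not part of dMaSIF.
    Returns the filtered predictions, the filtered labels,
    and the number of entries that were dropped.
    """
    preds = np.asarray(preds, dtype=float).ravel()
    labels = np.asarray(labels).ravel()
    keep = np.isfinite(preds)            # False for NaN and +/-inf
    n_bad = int((~keep).sum())
    return preds[keep], labels[keep], n_bad

# Toy data standing in for one protein's predictions and labels.
iface_preds = np.array([0.9, np.nan, 0.2, 0.7])
iface_labels = np.array([1, 0, 0, 1])

clean_preds, clean_labels, n_bad = finite_pairs(iface_preds, iface_labels)
print(f"{n_bad} non-finite prediction(s) dropped")
```

Dropping the bad entries only makes the symptom visible; it does not explain where the NaN values come from.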

BingzeWu avatar Oct 23 '23 05:10 BingzeWu

Same problem, did you solve this? @BingzeWu

YAndrewL avatar Nov 08 '23 15:11 YAndrewL

What do you mean by mini-batch? I've trained this with a batch size of 64, but the model only considers single-batch training, and NaN values still appear after several steps.

Bingze Wu wrote on Fri, Nov 10, 2023 at 14:14:

No, I found that the problem may come from the data preprocessing step. When I trained the model on a mini-batch, the training was successful. I then found that in the dMaSIF convolution step there is some problem with the "nuv" data, but I don't know how to fix the bug.

-- Yufan Liu, Ph.D. student in computer science,

Computational Bioscience Research Center (CBRC),

King Abdullah University of Science and Technology (KAUST)

yandrewl.github.io

YAndrewL avatar Nov 13 '23 11:11 YAndrewL

Sorry, I meant that I trained the model on a sub-dataset (randomly chosen, about 300 data points). When training on the complete dataset, NaN values still appeared. If you use the trained model to compute the problematic data point (where the ROC-AUC problem was raised), you will find that NaN values appear in specific geometric convolution layers. The internal computation seems to give the wrong value, but I don't know how to fix it. The convolution relies on different geometric operations, which I am unfamiliar with.
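The layer-by-layer check described above can be sketched with PyTorch forward hooks. This is an illustration under assumptions, not dMaSIF code: `find_nan_layers` and the toy `LogOfNegative` module are hypothetical names, and the sketch only inspects modules that return a single tensor.

```python
import torch
import torch.nn as nn

def find_nan_layers(model, *inputs):
    """Run one forward pass and list submodules whose output is not finite.

    Hypothetical diagnostic helper; names are returned in execution order.
    """
    bad, handles = [], []

    def make_hook(name):
        def hook(module, args, output):
            # Only plain tensor outputs are checked in this sketch.
            if torch.is_tensor(output) and not torch.isfinite(output).all():
                bad.append(name)
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module (its name is the empty string)
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        for handle in handles:
            handle.remove()  # always detach the hooks again
    return bad

class LogOfNegative(nn.Module):
    """Toy layer that always yields NaN (log of a negative number)."""
    def forward(self, x):
        return torch.log(-x.abs() - 1.0)

toy = nn.Sequential(nn.Linear(3, 3), LogOfNegative(), nn.ReLU())
bad = find_nan_layers(toy, torch.ones(2, 3))
print("first non-finite output in submodule:", bad[0])
```

The first name in the list is the earliest submodule that produced a non-finite output on that input, which narrows down where to look. For NaN values that first appear in gradients rather than activations, `torch.autograd.set_detect_anomaly(True)` may also help.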

BingzeWu avatar Nov 14 '23 08:11 BingzeWu

I also found that it has issues with input data, batch size, and hyperparameters. I struggled for a week to get it running without NaN, but failed. Maybe this network only works on their own PPI data, which has passed through precision regulation. I have given up on understanding and debugging it.

Xinheng-He avatar Nov 28 '23 07:11 Xinheng-He

@YAndrewL Same problem, did you solve this?

orange2350 avatar Dec 06 '23 02:12 orange2350

Hi Zhiyi, not yet. You may find the NaN values in the input features and mask them with the average or some constant to start the training. Unfortunately, I did not get the training results described in the paper.
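The masking workaround mentioned above could look roughly like the following. It is a sketch under assumptions: the input features are taken to be a 2-D array of shape (points, channels), and `mask_nan_features` is a hypothetical helper, not a dMaSIF function.

```python
import numpy as np

def mask_nan_features(features, fill="mean", constant=0.0):
    """Replace NaN entries column by column.

    fill="mean" uses the mean of the non-NaN values in each column
    (falling back to `constant` for all-NaN columns); any other value
    of `fill` uses `constant` everywhere. Hypothetical helper.
    """
    x = np.array(features, dtype=float)          # work on a copy
    nan_mask = np.isnan(x)
    if fill == "mean":
        counts = (~nan_mask).sum(axis=0)         # valid entries per column
        sums = np.where(nan_mask, 0.0, x).sum(axis=0)
        col_fill = np.where(counts > 0, sums / np.maximum(counts, 1), constant)
    else:
        col_fill = np.full(x.shape[1], constant)
    x[nan_mask] = np.broadcast_to(col_fill, x.shape)[nan_mask]
    return x

# Toy feature matrix with two missing values.
raw = [[1.0, np.nan],
       [3.0, 4.0],
       [np.nan, 8.0]]
by_mean = mask_nan_features(raw, fill="mean")
by_const = mask_nan_features(raw, fill="constant", constant=0.0)
print(by_mean)
print(by_const)
```

As the comment above notes, this only lets training start; it does not guarantee results matching the paper.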

YAndrewL avatar Dec 06 '23 07:12 YAndrewL

The results I got from reproducing dMaSIF are not the same as those reported in the paper either. Thanks.

orange2350 avatar Dec 06 '23 08:12 orange2350