
Subject: No Improvement in Validation Accuracy

Open vas2201 opened this issue 1 year ago • 1 comments

Hi

I'm reaching out to request your help with an issue we're encountering in our current project: the validation accuracy (acc_val) of our model is not showing any improvement.

We are using a model with the following configuration:

Number of residues: 98
Edge types: 4
Dimensions: 6
Timesteps: 50
Prediction steps: 1
Batch size: 1
Learning rate: 0.0005
Encoder/Decoder: MLP for encoder and RNN for decoder
Dynamic graph, factor graph, and prior usage are enabled

Issue Description: Over the first 30 epochs of training (500 in total), the nll_train, kl_train, mse_train, and acc_train metrics change significantly, but the validation accuracy (acc_val) remains at 0.0000000000. This suggests that the model is not being validated correctly against the test dataset.

Epoch: 0000 nll_train: 93185.7750843593 kl_train: 195.2919983183 mse_train: 0.0316958412 acc_train: 0.0915887710 nll_val: 104181.9461495536 kl_val: -0.2983094061 mse_val: 0.0354360349 acc_val: 0.0000000000 time: 866.2107s
Best model so far, saving...
Epoch: 0001 nll_train: 40856.0143786839 kl_train: 204.9207121985 mse_train: 0.0138966030 acc_train: 0.0893383126 nll_val: 75632.8662109375 kl_val: -0.1133307992 mse_val: 0.0257254635 acc_val: 0.0000000000 time: 713.4318s
Best model so far, saving...
Epoch: 0002 nll_train: 28308.7293363299 kl_train: 198.1665186201 mse_train: 0.0096288194 acc_train: 0.2112199814 nll_val: 44862.8242187500 kl_val: -0.1718699533 mse_val: 0.0152594636 acc_val: 0.0000000000 time: 443.2562s
Best model so far, saving...
Epoch: 0003 nll_train: 23041.0194764818 kl_train: 187.3990933555 mse_train: 0.0078370812 acc_train: 0.3721070903 nll_val: 60220.3454066685 kl_val: -0.1325645934 mse_val: 0.0204831097 acc_val: 0.0000000000 time: 740.4529s
Epoch: 0004 nll_train: 20646.6544189453 kl_train: 177.2104825974 mse_train: 0.0070226713 acc_train: 0.4466821707 nll_val: 68130.9513762338 kl_val: -0.4891080290 mse_val: 0.0231737920 acc_val: 0.0000000000 time: 500.6341s
Epoch: 0005 nll_train: 21552.2414591653 kl_train: 155.8838998250 mse_train: 0.0073306941 acc_train: 0.5703503051 nll_val: 27796.2101353237 kl_val: -0.1425452123 mse_val: 0.0094544931 acc_val: 0.0000000000 time: 733.5250s
Best model so far, saving...
Epoch: 0006 nll_train: 14991.4931028911 kl_train: 163.2396456855 mse_train: 0.0050991474 acc_train: 0.5408952241 nll_val: 59636.6536690848 kl_val: -0.1362025940 mse_val: 0.0202845746 acc_val: 0.0000000000 time: 597.9917s
Epoch: 0007 nll_train: 14380.3693662371 kl_train: 152.6876725469 mse_train: 0.0048912819 acc_train: 0.5902738120 nll_val: 14925.1580543518 kl_val: -0.1527265439 mse_val: 0.0050765842 acc_val: 0.0000000000 time: 405.4773s
Best model so far, saving...
Epoch: 0008 nll_train: 10603.3555524009 kl_train: 154.0793498584 mse_train: 0.0036065834 acc_train: 0.5898699318 nll_val: 26743.8529139927 kl_val: -24.2923367723 mse_val: 0.0090965485 acc_val: 0.0000000000 time: 981.2972s
Epoch: 0009 nll_train: 10464.2849783216 kl_train: 152.4066004072 mse_train: 0.0035592805 acc_train: 0.6285898380 nll_val: 81366.5099051339 kl_val: -0.1153421539 mse_val: 0.0276756824 acc_val: 0.0000000000 time: 386.0460s
Epoch: 0010 nll_train: 12756.5482781274 kl_train: 149.8410109111 mse_train: 0.0043389617 acc_train: 0.5913539569 nll_val: 24450.8945312500 kl_val: -0.1167528553 mse_val: 0.0083166306 acc_val: 0.0000000000 time: 388.5709s
Epoch: 0011 nll_train: 10343.5727191653 kl_train: 145.4063068117 mse_train: 0.0035182219 acc_train: 0.5680622765 nll_val: 46710.4846540179 kl_val: -0.1308913601 mse_val: 0.0158879196 acc_val: 0.0000000000 time: 937.4590s
Epoch: 0012 nll_train: 12361.6497343608 kl_train: 145.2983415467 mse_train: 0.0042046425 acc_train: 0.6046425566 nll_val: 47477.3231724330 kl_val: -0.1025856223 mse_val: 0.0161487487 acc_val: 0.0000000000 time: 485.8568s
Epoch: 0013 nll_train: 11850.1992909568 kl_train: 146.6304704802 mse_train: 0.0040306801 acc_train: 0.5974891798 nll_val: 25451.9597516741 kl_val: -0.0844267165 mse_val: 0.0086571290 acc_val: 0.0000000000 time: 437.3189s
Epoch: 0014 nll_train: 12070.4967947006 kl_train: 142.0481354850 mse_train: 0.0041056111 acc_train: 0.6370844730 nll_val: 85820.4837472098 kl_val: -0.0702253677 mse_val: 0.0291906401 acc_val: 0.0000000000 time: 774.2346s
Epoch: 0015 nll_train: 14697.3387419837 kl_train: 125.9110430990 mse_train: 0.0049990949 acc_train: 0.6360926182 nll_val: 60153.0705915179 kl_val: -0.1054519279 mse_val: 0.0204602274 acc_val: 0.0000000000 time: 511.0689s
Epoch: 0016 nll_train: 19818.1795234680 kl_train: 124.6392605645 mse_train: 0.0067408772 acc_train: 0.6660285985 nll_val: 75138.3114536830 kl_val: -0.0228514423 mse_val: 0.0255572480 acc_val: 0.0000000000 time: 463.1343s
Epoch: 0017 nll_train: 11727.2011769159 kl_train: 121.3601091589 mse_train: 0.0039888440 acc_train: 0.6728569926 nll_val: 66181.2564871652 kl_val: -0.0667972792 mse_val: 0.0225106296 acc_val: 0.0000000000 time: 903.3679s
Epoch: 0018 nll_train: 9472.3040493556 kl_train: 113.7943201065 mse_train: 0.0032218720 acc_train: 0.6828431667 nll_val: 25592.9206891741 kl_val: -0.0853543440 mse_val: 0.0087050749 acc_val: 0.0000000000 time: 478.9071s
Epoch: 0019 nll_train: 9525.3769065312 kl_train: 109.1455590725 mse_train: 0.0032399239 acc_train: 0.6851894292 nll_val: 28269.2817731585 kl_val: -32.3087573009 mse_val: 0.0096154019 acc_val: 0.0000000000 time: 586.1777s
Epoch: 0020 nll_train: 8881.7724550792 kl_train: 107.0338555234 mse_train: 0.0030210109 acc_train: 0.6891981756 nll_val: 41667.0500837054 kl_val: -0.0795855493 mse_val: 0.0141724662 acc_val: 0.0000000000 time: 791.1620s
Epoch: 0021 nll_train: 10563.2213559832 kl_train: 105.0309611389 mse_train: 0.0035929323 acc_train: 0.6962144210 nll_val: 20989.0607212612 kl_val: -0.0574653355 mse_val: 0.0071391361 acc_val: 0.0000000000 time: 430.2279s
Epoch: 0022 nll_train: 7643.8489691871 kl_train: 101.9360435690 mse_train: 0.0025999486 acc_train: 0.7003396351 nll_val: 23780.3366699219 kl_val: -4.3749750680 mse_val: 0.0080885497 acc_val: 0.0000000000 time: 421.6247s
Epoch: 0023 nll_train: 6881.1352819715 kl_train: 101.7752324002 mse_train: 0.0023405222 acc_train: 0.6987203571 nll_val: 20917.7909109933 kl_val: -0.0962230675 mse_val: 0.0071148943 acc_val: 0.0000000000 time: 1056.8995s
Epoch: 0024 nll_train: 6526.3531938280 kl_train: 103.0320355892 mse_train: 0.0022198479 acc_train: 0.6942814313 nll_val: 22670.8121861049 kl_val: -0.0894306706 mse_val: 0.0077111607 acc_val: 0.0000000000 time: 533.5991s
Epoch: 0025 nll_train: 6397.3086107799 kl_train: 105.1447147812 mse_train: 0.0021759552 acc_train: 0.6876258604 nll_val: 13848.4286411830 kl_val: -0.0812793788 mse_val: 0.0047103497 acc_val: 0.0000000000 time: 884.6829s
Best model so far, saving...
Epoch: 0026 nll_train: 6180.4524361747 kl_train: 104.9708604472 mse_train: 0.0021021946 acc_train: 0.6880015629 nll_val: 32681.5507812500 kl_val: -0.1826013448 mse_val: 0.0111161736 acc_val: 0.0000000000 time: 479.3236s
Epoch: 0027 nll_train: 5294.6343832016 kl_train: 104.7728396314 mse_train: 0.0018008960 acc_train: 0.6865419585 nll_val: 13213.0326450893 kl_val: -0.0821504131 mse_val: 0.0044942286 acc_val: 0.0000000000 time: 508.0394s
Best model so far, saving...
Epoch: 0028 nll_train: 5129.1191129684 kl_train: 105.5634074892 mse_train: 0.0017445983 acc_train: 0.6834799826 nll_val: 21072.2561907087 kl_val: -0.1658951991 mse_val: 0.0071674339 acc_val: 0.0000000000 time: 918.2695s
Epoch: 0029 nll_train: 4824.5187172209 kl_train: 106.2803972449 mse_train: 0.0016409927 acc_train: 0.6795125635 nll_val: 12688.9289376395 kl_val: -0.0989101932 mse_val: 0.0043159621 acc_val: 0.0000000000 time: 542.2904s
Best model so far, saving...

vas2201:~/workflow2024/NRI-MD-main$ python3 npy_data_pattren.py

File: data/edges.npy Shape: (98, 98, 98) Size: 941192 Dtype: float64

File: data/edges_test.npy Shape: (98, 98, 98) Size: 941192 Dtype: float64

File: data/edges_valid.npy Shape: (98, 98, 98) Size: 941192 Dtype: float64

File: data/features.npy Shape: (98, 50, 6, 98) Size: 2881200 Dtype: float64

File: data/features_test.npy Shape: (98, 50, 6, 98) Size: 2881200 Dtype: float64

File: data/features_valid.npy Shape: (98, 50, 6, 98) Size: 2881200 Dtype: float64
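
The shapes above match across the train, validation, and test files, so one further check is whether the edge labels themselves are comparable. The sketch below is not taken from the NRI-MD code. It assumes edge accuracy is the fraction of edges whose highest-scoring edge type equals the stored label, and it uses made-up scores and labels. The helper names `label_counts` and `edge_accuracy` are hypothetical.

```python
from collections import Counter


def label_counts(labels):
    """Count how often each distinct value occurs in a flat iterable of labels."""
    return Counter(float(v) for v in labels)


def edge_accuracy(scores, labels):
    """Fraction of edges whose highest-scoring type equals the label.

    scores: one list of per-edge-type scores for each edge.
    labels: one ground-truth edge type per edge.
    This definition is an assumption, not the confirmed NRI-MD metric.
    """
    predicted = [max(range(len(s)), key=s.__getitem__) for s in scores]
    hits = sum(1 for p, t in zip(predicted, labels) if p == int(t))
    return hits / len(labels)


# Made-up example with 4 edge types (not data from this issue).
scores = [[0.1, 0.7, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]]
train_like = [1.0, 0.0, 2.0]  # labels that overlap with the predictions
valid_like = [3.0, 3.0, 3.0]  # labels the predictions never produce

print(label_counts(train_like))
print(label_counts(valid_like))
print(edge_accuracy(scores, train_like))  # 1.0
print(edge_accuracy(scores, valid_like))  # 0.0
```

To apply the same count to the real files, the flattened array (for example `numpy.load("data/edges_valid.npy").ravel()`) can be passed to `label_counts`. If the validation labels turn out to hold values that the training labels never contain, an accuracy of exactly zero would be expected under this definition.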

vas2201 avatar Feb 13 '24 21:02 vas2201

Also, I am running the ca_1.pdb test file and getting negative KL validation values (kl_val) for every epoch. I observed that you had a similar issue. Shouldn't KL divergence values be non-negative?
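
A true KL divergence is never negative, so a negative reported value means the logged quantity is not the full KL term. One possible explanation, stated here as an assumption because the training code is not shown in this thread, is that the KL against a uniform prior over K edge types is logged without its constant log K. What remains is the sum of p log p, which is always at most zero. The sketch below uses a made-up distribution, and the function names are hypothetical.

```python
import math


def neg_entropy(probs, eps=1e-16):
    """Sum of p * log(p): the KL to a uniform prior with the log(K) constant dropped."""
    return sum(p * math.log(p + eps) for p in probs)


def kl_to_uniform(probs, eps=1e-16):
    """KL(p || uniform over K types) = sum p*log(p) + log(K), which is >= 0."""
    return neg_entropy(probs, eps) + math.log(len(probs))


# Made-up edge-type distribution over 4 types (not data from this issue).
p = [0.7, 0.1, 0.1, 0.1]
print(neg_entropy(p))    # negative: the constant is missing
print(kl_to_uniform(p))  # non-negative once log(4) is added back
```

If this is what happens, the sign alone is not a problem for optimization, since a constant offset does not change the gradients. It would not by itself explain the occasional large values such as -24.29 or -32.31 in the log above.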

chagas98 avatar Apr 05 '24 17:04 chagas98