
Could you provide the 96-step prediction results for all of the PEMS datasets?

Open bigdata0 opened this issue 1 year ago • 7 comments

Training on a single 4090 GPU with exactly the same training parameters as in the provided script, I did not get the results reported in the paper on PEMS08; the gap is large regardless of whether use_norm is modified. And this is not limited to PEMS08: the other PEMS datasets show the same problem. Short-horizon predictions roughly match the paper, but the 48- and 96-step results are far off.

bigdata0 avatar Jun 22 '24 11:06 bigdata0
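
For context on the use_norm flag discussed throughout this thread: in iTransformer it toggles a series-wise normalization of the lookback window that is undone on the model output (in the style of the Non-stationary Transformer). Below is a minimal PyTorch sketch of the idea, not the repository's exact code; `model` here stands for any mapping from the normalized window to the prediction:

```python
import torch

def forecast(model, x_enc, use_norm=True):
    """x_enc: [batch, seq_len, n_variates]; returns [batch, pred_len, n_variates]."""
    if use_norm:
        # Normalize each variate by its own statistics over the lookback window
        means = x_enc.mean(dim=1, keepdim=True).detach()
        x_enc = x_enc - means
        stdev = torch.sqrt(x_enc.var(dim=1, keepdim=True, unbiased=False) + 1e-5)
        x_enc = x_enc / stdev

    dec_out = model(x_enc)  # inverted-attention encoder + projection head

    if use_norm:
        # Restore the original scale of each variate on the prediction
        dec_out = dec_out * stdev + means
    return dec_out
```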

I have run into a similar problem, though it seems to appear only on the PEMS07 and PEMS08 datasets, and my input_len is longer than the one used in the paper.

FrankHo-Hwc avatar Jul 23 '24 03:07 FrankHo-Hwc

Yes. I have been building on the paper's model, and I found that the reported results can be beaten at most horizons on most datasets; on some of them, training without norm (the same setting as the source code) also beats the paper's results. Only the 48- and 96-step horizons on PEMS08 perform poorly, and when I reran the original setup I could not reproduce the reported numbers either, so I would like to ask the author. @FrankHo-Hwc

bigdata0 avatar Jul 23 '24 03:07 bigdata0


Yes. I saw in an earlier issue that tuning the learning rate and the use_norm option can improve the results. At the 96-step horizon the results do get better, but they are still short of the paper's, so a response from the author is still needed.

FrankHo-Hwc avatar Jul 24 '24 03:07 FrankHo-Hwc

@bigdata0 What settings did you use to get the results reported in their paper for each of the PEMS datasets? I ran their script and adjusted use_norm, but still found the values to be quite different, even on PEMS03 and PEMS04.

JerayuT avatar Sep 18 '24 07:09 JerayuT

@JerayuT

  1. Check whether your iTransformer.py comes from the Time-Series-Library or from this repository. In the version integrated into the Time-Series-Library, use_norm is enabled by default and cannot be changed from the command line; in this repository's version, use_norm can be set on the command line.
  2. For the same dataset, try setting use_norm to 0 for one prediction horizon while leaving it at the default for another. For example, when using the previous 96 steps to predict 12 steps, do not set use_norm to 0, but when predicting 96 steps, do set it to 0 (see the sketch below this comment). With this approach I can reproduce most of the results in the paper, but there are still some results that cannot be reproduced.

bigdata0 avatar Sep 18 '24 09:09 bigdata0
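
For concreteness, the per-horizon recipe above comes down to two invocations of the same script, differing only in pred_len and use_norm. A hypothetical sketch in the format of the command posted later in this thread (PEMS08 has 170 sensors, hence enc_in/dec_in/c_out of 170; the remaining hyperparameters are illustrative, not confirmed settings):

```bash
# Short horizon (96 -> 12): keep the default use_norm behavior
python -u run.py --is_training 1 --root_path ./dataset/PEMS/ --data_path PEMS08.npz \
  --model_id PEMS08_96_12 --model iTransformer --data PEMS --features M \
  --seq_len 96 --pred_len 12 --e_layers 4 --enc_in 170 --dec_in 170 --c_out 170 \
  --des 'Exp' --d_model 512 --d_ff 512 --learning_rate 0.001 --itr 1

# Long horizon (96 -> 96): disable normalization
python -u run.py --is_training 1 --root_path ./dataset/PEMS/ --data_path PEMS08.npz \
  --model_id PEMS08_96_96 --model iTransformer --data PEMS --features M \
  --seq_len 96 --pred_len 96 --e_layers 4 --enc_in 170 --dec_in 170 --c_out 170 \
  --des 'Exp' --d_model 512 --d_ff 512 --learning_rate 0.001 --itr 1 --use_norm 0
```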

@bigdata0 I'm currently using this version of the repository. I was also wondering whether you adjusted the learning rate or anything else. When I ran the PEMS03 dataset with use_norm set to 0 for prediction horizons {12, 24, 48, 96}, I got improved results over my previous runs, but for horizons 48 and 96 the values are still quite far from the results in the paper.

JerayuT avatar Sep 18 '24 17:09 JerayuT

@JerayuT Okay, you can refer to my configuration parameters.

```
python -u run.py --is_training 1 --root_path ./dataset/PEMS/ --data_path PEMS03.npz --model_id PEMS03_96_96 --model iTransformer --data PEMS --features M --seq_len 96 --pred_len 96 --e_layers 4 --enc_in 358 --dec_in 358 --c_out 358 --des 'Exp' --d_model 512 --d_ff 512 --learning_rate 0.001 --itr 1 --use_norm 0
```

```
Args in experiment:
Namespace(is_training=1, model_id='PEMS03_96_96', model='iTransformer', data='PEMS', root_path='./dataset/PEMS/', data_path='PEMS03.npz', features='M', target='OT', freq='h', checkpoints='./checkpoints/', seq_len=96, label_len=48, pred_len=96, enc_in=358, dec_in=358, c_out=358, d_model=512, n_heads=8, e_layers=4, d_layers=1, d_ff=512, moving_avg=25, factor=1, distil=True, dropout=0.1, embed='timeF', activation='gelu', output_attention=False, do_predict=False, num_workers=10, itr=1, train_epochs=10, batch_size=32, patience=3, learning_rate=0.001, des='Exp', loss='MSE', lradj='type1', use_amp=False, use_gpu=True, gpu=0, use_multi_gpu=False, devices='0,1,2,3', exp_name='MTSF', channel_independence=False, inverse=False, class_strategy='projection', target_root_path='./data/electricity/', target_data_path='electricity.csv', efficient_training=False, use_norm=0, partial_start_index=0)
Use GPU: cuda:0
```

```
start training : PEMS03_96_96_iTransformer_PEMS_ftM_sl96_ll48_pl96_dm512_nh8_el4_dl1_df512_fc1_ebtimeF_dtTrue_Exp_projection_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 15533
val 5051
test 5051
iters: 100, epoch: 1 | loss: 0.2663582  speed: 0.0661s/iter; left time: 313.9474s
iters: 200, epoch: 1 | loss: 0.2135554  speed: 0.0608s/iter; left time: 282.5925s
iters: 300, epoch: 1 | loss: 0.2013143  speed: 0.0572s/iter; left time: 260.3726s
iters: 400, epoch: 1 | loss: 0.1978480  speed: 0.0608s/iter; left time: 270.5316s
Epoch: 1 cost time: 30.010493993759155
Epoch: 1, Steps: 485 | Train Loss: 0.2256930 Vali Loss: 0.1850095 Test Loss: 0.2440917
Validation loss decreased (inf --> 0.185009). Saving model ...
Updating learning rate to 0.001
iters: 100, epoch: 2 | loss: 0.1753062  speed: 1.6842s/iter; left time: 7184.8671s
iters: 200, epoch: 2 | loss: 0.1824031  speed: 0.0648s/iter; left time: 269.8704s
iters: 300, epoch: 2 | loss: 0.1627843  speed: 0.0622s/iter; left time: 253.0022s
iters: 400, epoch: 2 | loss: 0.1522650  speed: 0.0618s/iter; left time: 244.9026s
Epoch: 2 cost time: 31.733307361602783
Epoch: 2, Steps: 485 | Train Loss: 0.1630596 Vali Loss: 0.1565045 Test Loss: 0.2239741
Validation loss decreased (0.185009 --> 0.156505). Saving model ...
Updating learning rate to 0.0005
iters: 100, epoch: 3 | loss: 0.1193284  speed: 1.7949s/iter; left time: 6786.6307s
iters: 200, epoch: 3 | loss: 0.1366769  speed: 0.0643s/iter; left time: 236.7496s
iters: 300, epoch: 3 | loss: 0.1275335  speed: 0.0637s/iter; left time: 228.0470s
iters: 400, epoch: 3 | loss: 0.1218285  speed: 0.0650s/iter; left time: 226.2647s
Epoch: 3 cost time: 32.51231002807617
Epoch: 3, Steps: 485 | Train Loss: 0.1272281 Vali Loss: 0.1349931 Test Loss: 0.2023265
Validation loss decreased (0.156505 --> 0.134993). Saving model ...
Updating learning rate to 0.00025
iters: 100, epoch: 4 | loss: 0.1189363  speed: 1.8123s/iter; left time: 5973.2009s
iters: 200, epoch: 4 | loss: 0.1217045  speed: 0.0607s/iter; left time: 194.1133s
iters: 300, epoch: 4 | loss: 0.1013873  speed: 0.0603s/iter; left time: 186.7840s
iters: 400, epoch: 4 | loss: 0.1246471  speed: 0.0626s/iter; left time: 187.6594s
Epoch: 4 cost time: 30.79427933692932
Epoch: 4, Steps: 485 | Train Loss: 0.1147907 Vali Loss: 0.1244477 Test Loss: 0.1876199
Validation loss decreased (0.134993 --> 0.124448). Saving model ...
Updating learning rate to 0.000125
iters: 100, epoch: 5 | loss: 0.1103279  speed: 1.8241s/iter; left time: 5127.6685s
iters: 200, epoch: 5 | loss: 0.1176323  speed: 0.0640s/iter; left time: 173.4984s
iters: 300, epoch: 5 | loss: 0.1146321  speed: 0.0663s/iter; left time: 173.0208s
iters: 400, epoch: 5 | loss: 0.1095776  speed: 0.0660s/iter; left time: 165.7033s
Epoch: 5 cost time: 32.98799157142639
Epoch: 5, Steps: 485 | Train Loss: 0.1096915 Vali Loss: 0.1192572 Test Loss: 0.1804807
Validation loss decreased (0.124448 --> 0.119257). Saving model ...
Updating learning rate to 6.25e-05
iters: 100, epoch: 6 | loss: 0.1013640  speed: 1.7467s/iter; left time: 4062.9149s
iters: 200, epoch: 6 | loss: 0.1459941  speed: 0.0680s/iter; left time: 151.3634s
iters: 300, epoch: 6 | loss: 0.0991267  speed: 0.0720s/iter; left time: 153.0836s
iters: 400, epoch: 6 | loss: 0.1169957  speed: 0.0729s/iter; left time: 147.6160s
Epoch: 6 cost time: 35.339478731155396
Epoch: 6, Steps: 485 | Train Loss: 0.1069848 Vali Loss: 0.1189177 Test Loss: 0.1792915
Validation loss decreased (0.119257 --> 0.118918). Saving model ...
Updating learning rate to 3.125e-05
iters: 100, epoch: 7 | loss: 0.0993732  speed: 1.7634s/iter; left time: 3246.3705s
iters: 200, epoch: 7 | loss: 0.1180835  speed: 0.0668s/iter; left time: 116.2247s
iters: 300, epoch: 7 | loss: 0.0905823  speed: 0.0642s/iter; left time: 105.3868s
iters: 400, epoch: 7 | loss: 0.1043306  speed: 0.0647s/iter; left time: 99.7038s
Epoch: 7 cost time: 32.913817405700684
Epoch: 7, Steps: 485 | Train Loss: 0.1055459 Vali Loss: 0.1175509 Test Loss: 0.1766107
Validation loss decreased (0.118918 --> 0.117551). Saving model ...
Updating learning rate to 1.5625e-05
iters: 100, epoch: 8 | loss: 0.1248059  speed: 1.7733s/iter; left time: 2404.6006s
iters: 200, epoch: 8 | loss: 0.1046225  speed: 0.0637s/iter; left time: 79.9484s
iters: 300, epoch: 8 | loss: 0.1021659  speed: 0.0630s/iter; left time: 72.8100s
iters: 400, epoch: 8 | loss: 0.1225355  speed: 0.0610s/iter; left time: 64.4164s
Epoch: 8 cost time: 31.724972248077393
Epoch: 8, Steps: 485 | Train Loss: 0.1047085 Vali Loss: 0.1169733 Test Loss: 0.1760689
Validation loss decreased (0.117551 --> 0.116973). Saving model ...
Updating learning rate to 7.8125e-06
iters: 100, epoch: 9 | loss: 0.1084197  speed: 1.7494s/iter; left time: 1523.6881s
iters: 200, epoch: 9 | loss: 0.1012328  speed: 0.0658s/iter; left time: 50.7213s
iters: 300, epoch: 9 | loss: 0.1137820  speed: 0.0622s/iter; left time: 41.7486s
iters: 400, epoch: 9 | loss: 0.1025800  speed: 0.0630s/iter; left time: 35.9713s
Epoch: 9 cost time: 32.4756760597229
Epoch: 9, Steps: 485 | Train Loss: 0.1042602 Vali Loss: 0.1166037 Test Loss: 0.1761539
Validation loss decreased (0.116973 --> 0.116604). Saving model ...
Updating learning rate to 3.90625e-06
iters: 100, epoch: 10 | loss: 0.1130980  speed: 1.7666s/iter; left time: 681.9241s
iters: 200, epoch: 10 | loss: 0.1034796  speed: 0.0580s/iter; left time: 16.5877s
iters: 300, epoch: 10 | loss: 0.0804347  speed: 0.0600s/iter; left time: 11.1661s
iters: 400, epoch: 10 | loss: 0.0938171  speed: 0.0621s/iter; left time: 5.3409s
Epoch: 10 cost time: 30.659968376159668
Epoch: 10, Steps: 485 | Train Loss: 0.1040058 Vali Loss: 0.1168290 Test Loss: 0.1764464
EarlyStopping counter: 1 out of 3
Updating learning rate to 1.953125e-06
testing : PEMS03_96_96_iTransformer_PEMS_ftM_sl96_ll48_pl96_dm512_nh8_el4_dl1_df512_fc1_ebtimeF_dtTrue_Exp_projection_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 5051
test shape: (5051, 1, 96, 358) (5051, 1, 96, 358)
test shape: (5051, 96, 358) (5051, 96, 358)
mse:0.17615370452404022, mae:0.2849079966545105
```

bigdata0 avatar Sep 19 '24 04:09 bigdata0
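
For anyone comparing these numbers against the paper: the final mse/mae line is computed over the saved test arrays of shape (5051, 96, 358) printed above. A minimal numpy sketch of the standard metric definitions, assuming `preds` and `trues` hold those prediction and ground-truth arrays:

```python
import numpy as np

def mse(preds: np.ndarray, trues: np.ndarray) -> float:
    # Mean squared error over all windows, horizons, and variates
    return float(np.mean((preds - trues) ** 2))

def mae(preds: np.ndarray, trues: np.ndarray) -> float:
    # Mean absolute error over the same axes
    return float(np.mean(np.abs(preds - trues)))

# preds, trues: shape (5051, 96, 358) for the PEMS03_96_96 run above
# print(f"mse:{mse(preds, trues)}, mae:{mae(preds, trues)}")
```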