social-lstm

Getting key error when training on external dataset

Open tristankpka opened this issue 4 years ago • 5 comments

I'm using the same data format as you. I get an error when running train (default params) on an external dataset that was previously formatted the way you suggest (dataset attached: 07_tracks.txt).

Creating pre-processed validation data from raw data
Now processing:  ./data/validation/highd/07_tracks.txt
Creating pre-processed training data from raw data
Now processing:  ./data/train/highd/07_tracks.txt
Loading train or test dataset:  ./data/train/trajectories_train.cpkl
Sequence size(frame) ------> 20
One batch size (frame)--->- 100
Training data from training dataset(name, # frame, #sequence)-->  07_tracks.txt : 40282 : 2014
Validation data from training dataset(name, # frame, #sequence)-->  07_tracks.txt : 0 : 0
Total number of training batches: 402
Total number of validation batches: 0
****************Training epoch beginning******************
0/12060 (epoch 0), train_loss = 18.527, time/batch = 0.945
1/12060 (epoch 0), train_loss = 6.513, time/batch = 0.877
Traceback (most recent call last):
  File "train.py", line 626, in <module>
    main()
  File "train.py", line 94, in main
    train(args)
  File "train.py", line 218, in train
    target_id_values = x_seq[0][lookup_seq[target_id], 0:2]
KeyError: 14

tristankpka, Jun 22 '20 16:06

The error was caused by a mismatch between the LSTM sequence length and the length of the sequences in the data. That is, if you're training an LSTM with a sequence length of X, your data files must contain only sequences that are X long. It would be good to include this in the data format description file.

tristankpka, Jun 23 '20 09:06

We are facing the same issue with our custom dataset. Can you please guide us through the steps you undertook to resolve this?

sammyjojo9, May 15 '21 19:05

@sammyjojo9 It's been a while since I used this implementation, but I remember that you should ensure your custom dataset contains lots of sequences that match the sequence length parameter you use during training. I got the error above by setting the sequence length parameter too high: when the train and validation sets were (randomly) built, there weren't enough long-enough sequences to populate the validation set (see the validation size of 0 in the log above).
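A rough way to check this before training is to count how many agents in the raw data have an unbroken run of frames at least as long as the sequence length. The sketch below is not the repository's preprocessing. It assumes the data has already been read into (frame_id, agent_id) pairs, and the names `count_usable_tracks` and `frame_step` are made up for illustration.

```python
from collections import defaultdict


def count_usable_tracks(rows, seq_length, frame_step=1):
    """Return (usable, total) agent counts.

    rows: iterable of (frame_id, agent_id) pairs (assumed layout, adapt to your file).
    An agent is "usable" if it appears in at least `seq_length` consecutive
    frames, where consecutive frame ids differ by `frame_step`.
    """
    frames_by_agent = defaultdict(set)
    for frame_id, agent_id in rows:
        frames_by_agent[agent_id].add(frame_id)

    usable = 0
    for frames in frames_by_agent.values():
        ordered = sorted(frames)
        longest = run = 1
        for prev, cur in zip(ordered, ordered[1:]):
            # Extend the current run if the frames are adjacent, else restart it.
            run = run + 1 if cur - prev == frame_step else 1
            longest = max(longest, run)
        if longest >= seq_length:
            usable += 1
    return usable, len(frames_by_agent)


if __name__ == "__main__":
    # Toy data: agent 1 has 20 consecutive frames, agent 2 only 5.
    rows = [(f, 1) for f in range(20)] + [(f, 2) for f in range(5)]
    usable, total = count_usable_tracks(rows, seq_length=20)
    print(f"{usable} of {total} agents have a run of at least 20 frames")
```

If the usable count is small relative to the total, lowering the sequence length or adding more data is probably needed before the validation split can be populated.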

tristankpka, May 17 '21 14:05

Thanks a lot! We still have an issue with the dataset: it uses pixel values, which need to be converted into real-world coordinates using a homography matrix. Could you shed some light on how to work out the matrix and its values?

sammyjojo9, May 17 '21 16:05

@sammyjojo9 The homography matrix should be estimated using DLT (Direct Linear Transform) methods, as explained in https://docs.opencv.org/master/d9/dab/tutorial_homography.html#lecture_16 Real-world coordinates can be recovered from the homography only for planar (2D) coordinates. A quick way to convert pixel coordinates to real-world coordinates is given in the link above.
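Once a 3x3 homography H is available (OpenCV's `cv2.findHomography` can estimate one from at least four pixel/world point correspondences), applying it is a small computation: multiply the homogeneous pixel vector (u, v, 1) by H and divide by the third component. The sketch below does only that step in plain Python. The function name `pixel_to_world` and the example matrix are hypothetical, not values from any dataset.

```python
def pixel_to_world(H, u, v):
    """Map pixel (u, v) to planar world coordinates using a 3x3 homography H.

    H is a 3x3 nested list (row-major). The result is only meaningful for
    points lying on the plane the homography was estimated for.
    """
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get Euclidean coordinates.
    return x / w, y / w


if __name__ == "__main__":
    # Made-up matrix for illustration: scale by 0.5 and shift by (1, 2).
    H_example = [
        [0.5, 0.0, 1.0],
        [0.0, 0.5, 2.0],
        [0.0, 0.0, 1.0],
    ]
    print(pixel_to_world(H_example, 2, 4))
```

The same function would be applied to every (x, y) pixel position in the track file before training, so that the model sees metric coordinates.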

tristankpka, May 26 '21 09:05