waymo-open-dataset
Folder structure of scenarios in Waymo Open Dataset
The scenario folder for the Waymo Motion Dataset looks like this:
For the training and validation sets, the website (https://waymo.com/open/data/motion/) says "These segments are further broken into 9 second windows (1 second of history and 8 seconds of future data) with varying overlap." What is meant by history and future data – and does this distinction matter for training?
Furthermore, what are the testing_interactive and validation_interactive folders? How are they different from the testing and validation folders?
Lastly, I notice there's a training_20s folder. Here, I assume each TFRecord file corresponds to a 20 second segment, as opposed to a 9 second segment for the TFRecords in training and validation. So how come training_20s, training, and validation each have 1000 TFRecords? I would expect training and validation to have a little more than double (20/9 times) the number of TFRecords, no?
Thanks for the help!
Hi. As for history and future data: models are intended to take 1 second of history as input and output 8 seconds of future predictions. As such, each training example is broken into 1 second of history, 1 current time step, and 8 seconds of future data.
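To make the split concrete, here is a minimal sketch of slicing one window into its three parts. It assumes a 10 Hz sampling rate (so 10 history steps, 1 current step, and 80 future steps, 91 in total); the rate is not stated above, so adjust SAMPLE_RATE_HZ if it differs. The function name split_window is hypothetical and is not part of the dataset's API.

```python
# Assumption: tracks are sampled at 10 Hz. This is not stated in the thread.
SAMPLE_RATE_HZ = 10
HISTORY_SECONDS = 1
FUTURE_SECONDS = 8


def split_window(track):
    """Split one window of per-step states into (history, current, future).

    `track` is any sequence with one entry per time step, ordered in time.
    """
    num_history = HISTORY_SECONDS * SAMPLE_RATE_HZ
    num_future = FUTURE_SECONDS * SAMPLE_RATE_HZ
    expected = num_history + 1 + num_future
    if len(track) != expected:
        raise ValueError(f"expected {expected} steps, got {len(track)}")
    history = track[:num_history]      # model input
    current = track[num_history]       # the single "now" step
    future = track[num_history + 1:]   # prediction target
    return history, current, future


# Stand-in for real states: the step index itself.
window = list(range(91))
history, current, future = split_window(window)
print(len(history), current, len(future))  # prints: 10 10 80
```

In training, the history and current step would typically form the model input and the future steps the supervision target.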
The interactive dataset splits are for use with the interaction challenge described here.
As for the number of files, each TFRecord file contains many examples (the TFRecord format provides for serial reading of examples from a single file). The data is broken into smaller file shards so it can be processed in parallel, so the number of files reflects the sharding rather than the number of segments. The training sets consist of 1000 file shards each, while the validation and test sets consist of 150 file shards each. Again, each file shard contains many examples; there are hundreds of thousands of examples in total.
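To see that the file count and the example count are independent, one can count the records inside each shard. The sketch below is a pure-Python counter that relies only on the TFRecord on-disk framing (an 8-byte little-endian length, a 4-byte length CRC, the payload, and a 4-byte payload CRC). It skips CRC verification and does not detect a truncated final record, so treat it as an illustration rather than a validator. The shard file names and the helper write_fake_shard are hypothetical and exist only so the example runs without the real data.

```python
import os
import struct
import tempfile


def count_tfrecord_examples(path):
    """Count records in one TFRecord shard without parsing the payloads."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and the payload CRC (4 bytes).
            f.seek(4 + length + 4, os.SEEK_CUR)
            count += 1
    return count


def write_fake_shard(path, payloads):
    """Hypothetical helper: write records with placeholder (invalid) CRCs."""
    with open(path, "wb") as f:
        for payload in payloads:
            f.write(struct.pack("<Q", len(payload)))
            f.write(b"\x00" * 4)
            f.write(payload)
            f.write(b"\x00" * 4)


with tempfile.TemporaryDirectory() as tmp:
    shard_paths = []
    for i in range(3):
        # Made-up shard name; the real files follow the dataset's own naming.
        path = os.path.join(tmp, f"fake.tfrecord-{i:05d}-of-00003")
        write_fake_shard(path, [b"x" * (i + 1)] * (i + 2))
        shard_paths.append(path)
    counts = [count_tfrecord_examples(p) for p in shard_paths]

total = sum(counts)
print(counts, total)  # prints: [2, 3, 4] 9
```

With TensorFlow installed, iterating over tf.data.TFRecordDataset for the same paths is the more usual way to read the shards, and it also verifies the checksums.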
Please let me know if you have further questions.