waymo-open-dataset
Tf record data not sorted by sequence number - E2E challenge
Hello! I was trying to access the data in each tf record for the e2e challenge. However, it is difficult to visualize each 20-second snapshot, since the data in each tf record doesn't seem to be sorted by sequence number.
Would this mean that, in order to get a single 20-second sequence (video), we would have to load all 315 tf records? I understand that this will eventually be needed for training, but I was wondering whether I have misunderstood the data sequencing in each tf record.
I am not able to get image frames in sequence. Every frame is different, but data.frame.timestamp_micros shows 0 for every frame. Can anyone help with the e2e data?
Some facts
- The dataset contains 1745 unique scenes. Each scene has a UUID in the name.
- Each frame has a name made of: "UUID-SeqNum"
- The SeqNum tells you the order of the frame in its UUID scene.
- Example:
  - UUID is: d6cdf6eb1b7d4a8be6dac71f34e6cdb7
  - SeqNum: 164
  - This scene has 200 frames.
- SeqNum doesn't always start from 0 or 1.
- The dataset is provided in 316 tfrecord files. But they contain all frames of all scenes completely randomly scrambled.
- I had to reorder them with custom scripts.
- First, I read each tfrecord sequentially and wrote its frames to files named "UUID.tfrecord" in an "unordered" folder.
- Then I went over each tfrecord in the "unordered" folder, sorted its frames by SeqNum, and resaved it in an "ordered" folder.
- data.frame.timestamp_micros is zero for all frames. You need to infer the order of images in a scene from the SeqNum in the frame name, as mentioned above.
- Not all scenes have an equal number of frames.
- I've sorted all the scenes in the waymo e2e dataset by frame counts here.
- About 15 scenes have frame counts < 190. Rest are between 190 and 230.
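The reordering described above can be sketched roughly as follows. This is a minimal sketch, not the actual scripts used: it works on frame-name strings only and assumes the names have already been read from frame.context.name. The helper names `parse_frame_name` and `group_and_sort` are hypothetical, and the second d6cd... frame name in the example is made up for illustration.

```python
from collections import defaultdict


def parse_frame_name(name: str) -> tuple[str, int]:
    """Split a "UUID-SeqNum" frame name into (uuid, seq_num)."""
    # Split on the last "-" so the SeqNum is always the trailing part.
    uuid, sep, seq = name.rpartition("-")
    if not sep or not uuid or not seq.isdigit():
        raise ValueError(f"unexpected frame name: {name!r}")
    return uuid, int(seq)


def group_and_sort(frame_names):
    """Bucket frame names by scene UUID, then sort each bucket by SeqNum."""
    scenes = defaultdict(list)
    for name in frame_names:
        uuid, seq = parse_frame_name(name)
        scenes[uuid].append((seq, name))
    # Sort on the integer SeqNum, not on the raw string.
    return {uuid: [n for _, n in sorted(frames)] for uuid, frames in scenes.items()}


# Example with scrambled names (the "-009" name for the d6cd... scene is hypothetical).
names = [
    "d6cdf6eb1b7d4a8be6dac71f34e6cdb7-164",
    "003b62820d0e9345eb025de35b046999-009",
    "d6cdf6eb1b7d4a8be6dac71f34e6cdb7-009",
]
ordered = group_and_sort(names)
```

In a real pipeline you would carry the serialized frame bytes alongside each name, and since all frames will not fit in memory, the bucketing step would write to per-scene files as described above rather than to a dict.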
I hope this answers all the questions within this issue.
Is the frame.context.name a unique identifier for each frame? I noticed some frames repeat when iterating over the tfrecorddataset. Could someone verify this?
Is the frame.context.name a unique identifier for each frame? I noticed some frames repeat when iterating over the tfrecorddataset.
@souravraha As I explained in the post above, that UUID is a unique identifier for each scene/segment (a run of ~200 frames). It is not unique for each frame.
@xmfcx I am talking about the entire frame.context.name ("uuid-seqnum"), not merely the uuid. Here is what I have found: only 415663 (or 54%) of 769849 total frame-names are unique. Am I doing something wrong, or has the addition of new training.tfrecords files (263 in addition to the earlier 315) introduced inconsistencies? Can someone please help out?
E.g. "003b62820d0e9345eb025de35b046999-009" is found in frame number 80686, as well as 448896!
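For anyone who wants to reproduce this kind of check, here is a minimal sketch. It assumes the full "uuid-seqnum" names have already been collected into a list while iterating over the records; `duplicate_report` is a hypothetical helper name, not part of any Waymo tooling.

```python
from collections import Counter


def duplicate_report(frame_names):
    """Return (total, unique, duplicated) for a list of frame names.

    total:      number of names seen, repeats included
    unique:     number of distinct names
    duplicated: dict mapping each repeated name to how often it occurs
    """
    counts = Counter(frame_names)
    total = sum(counts.values())
    unique = len(counts)
    duplicated = {name: n for name, n in counts.items() if n > 1}
    return total, unique, duplicated


# Tiny illustration; the "-010" name is made up.
total, unique, duplicated = duplicate_report([
    "003b62820d0e9345eb025de35b046999-009",
    "003b62820d0e9345eb025de35b046999-010",
    "003b62820d0e9345eb025de35b046999-009",
])
```

With the numbers quoted above, unique / total would be 415663 / 769849, which is about 0.54.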
@souravraha Back in my time (just before they released this additional data), it had 316 training tfrecords in total.
Now I checked it again, it has 3 sets of tfrecords:
- 276 test
- 263 training
- 93 validation
And the size of each training tfrecord is about 3.5GB.
🧓 In the older training dataset, they had 316 files each approx. 2.3GB.
This means the new training dataset is not an addition to the old one.
You should only be using the new one; the old training dataset is now obsolete.
This also explains why you had duplicates: you shouldn't be mixing the two together.
I did not download the new dataset, so these are assessments based on my observations.
@xmfcx That is plausible. My God, shouldn't they explicitly mention this? A lot of people are simply going to glob with "training" and end up using both sets!
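One way to guard against that is to list shards from a single, explicitly chosen directory and check the count. This is only a sketch: the directory layout, the "*.tfrecord*" pattern, and the helper name `list_shards` are assumptions, and the 263 figure is the training file count reported in this thread.

```python
from pathlib import Path


def list_shards(split_dir, pattern="*.tfrecord*", expected_count=None):
    """Return the shard files directly inside split_dir, sorted by path."""
    split_dir = Path(split_dir)
    # A pattern without "**" does not descend into subdirectories, so an
    # older copy stored in a nested folder is not picked up by accident.
    shards = sorted(p for p in split_dir.glob(pattern) if p.is_file())
    if expected_count is not None and len(shards) != expected_count:
        raise ValueError(
            f"expected {expected_count} shards in {split_dir}, found {len(shards)}"
        )
    return shards
```

For example, `list_shards("/path/to/new/training", expected_count=263)` (a hypothetical path) would fail loudly if files from the older 316-file release were sitting in the same folder.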
I concur with @xmfcx: the new training set should replace the old one.
Is there any particular reason they didn't sort the sequences? This seems completely inconvenient. For e2e driving, especially vision-only, you will need some sort of temporal history. Now everybody somehow needs to sort 1TB of data from a stream-only data container...
e2e_blacklist.txt e2e_seq_frames.txt e2e_seq_info.txt
Maybe these are helpful to others.
- e2e_seq_info: [seq_id, frame_count]
- e2e_seq_frames: [seq_id, list[frame_ids]] (sorted)
- e2e_blacklist: str (all these have some kind of gaps in their frame_ids)
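A gap check of the kind behind the blacklist could look roughly like this. It is a sketch under one assumption: a "gap" means two consecutive frame ids in a sequence differ by more than 1. The exact criterion used to build e2e_blacklist.txt is not stated here, and `find_gaps` is a hypothetical helper name.

```python
def find_gaps(frame_ids):
    """Return (before, after) pairs where consecutive frame ids skip a value."""
    # Deduplicate and sort first, since the raw records are scrambled.
    ordered = sorted(set(frame_ids))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


def has_gaps(frame_ids):
    """True if the sequence would be a blacklist candidate under this assumption."""
    return bool(find_gaps(frame_ids))
```

Since SeqNum doesn't always start from 0 or 1, the check deliberately ignores the starting value and only looks at differences between neighbours.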
