Tf record data not sorted by sequence number - E2E challenge

Open sjhalani7 opened this issue 7 months ago • 8 comments

Hello! I was trying to access the data in each tf record for the e2e challenge. However, it is difficult to visualize each 20-second snapshot because the data in each tf record doesn't seem to be sorted by sequence number.

Would this mean that, to get a single 20-second sequence (video), we would have to load all 315 tf records into the code? I understand that this will eventually be needed for training, but I was wondering whether I have misunderstood how the data is sequenced in each tf record.

sjhalani7 avatar Apr 09 '25 19:04 sjhalani7

I am not able to get the image frames in sequence. Every frame is different, but data.frame.timestamp_micros shows 0 for every frame. Can anyone help with the e2e data?

ParvezAlam123 avatar Apr 14 '25 09:04 ParvezAlam123

Some facts

  • The dataset contains 1745 unique scenes. Each scene has a UUID in the name.
  • Each frame has a name made of: "UUID-SeqNum"
    • The SeqNum tells you the order of the frame in its UUID scene.
    • Example:
      • UUID is: d6cdf6eb1b7d4a8be6dac71f34e6cdb7
      • SeqNum: 164
      • This scene has 200 frames.
    • SeqNum doesn't always start from 0 or 1.
  • The dataset is provided in 316 tfrecord files, but the frames of all scenes are scrambled randomly across them.
    • I had to reorder them with custom scripts.
    • First I read each tfrecord sequentially and wrote its frames to files named "UUID.tfrecord" in an "unordered" folder.
    • Then I went over each tfrecord in the "unordered" folder, sorted its frames by SeqNum, and resaved it in an "ordered" folder.
  • data.frame.timestamp_micros is zero for all frames. You need to infer the order of images in a scene from the SeqNum in the frame name, as described above.
  • Not all scenes have an equal number of frames.
    • I've sorted all the scenes in the waymo e2e dataset by frame counts here.
    • About 15 scenes have frame counts < 190. Rest are between 190 and 230.
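The reordering described above can be sketched roughly as follows. This is a minimal, library-free illustration and not the actual scripts used: it assumes each frame name has the form "UUID-SeqNum" and that you have already read (name, payload) pairs out of the tfrecords yourself. The helper names `split_frame_name` and `group_and_sort` are hypothetical.

```python
from collections import defaultdict


def split_frame_name(name):
    """Split a frame name of the form 'UUID-SeqNum' into (uuid, seq_num)."""
    # rpartition splits on the LAST '-', so the SeqNum is always the tail.
    uuid, _, seq = name.rpartition("-")
    return uuid, int(seq)


def group_and_sort(frames):
    """Group (name, payload) pairs by scene UUID and sort each scene by SeqNum.

    `payload` stands in for whatever you keep per frame (for example the raw
    serialized record); it is an assumption of this sketch.
    """
    scenes = defaultdict(list)
    for name, payload in frames:
        uuid, seq = split_frame_name(name)
        scenes[uuid].append((seq, payload))
    # Sort on SeqNum only, so payloads never need to be comparable.
    return {
        uuid: sorted(items, key=lambda item: item[0])
        for uuid, items in scenes.items()
    }


# Scrambled input, as it would come out of the shuffled tfrecords.
frames = [
    ("d6cdf6eb1b7d4a8be6dac71f34e6cdb7-164", b"frame-c"),
    ("003b62820d0e9345eb025de35b046999-009", b"frame-a"),
    ("d6cdf6eb1b7d4a8be6dac71f34e6cdb7-003", b"frame-b"),
]
ordered = group_and_sort(frames)
```

Holding every payload in memory will not scale to the full dataset, which is why the two-pass approach above (one file per UUID first, then sorting each file) is the more practical route.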

I hope this answers all the questions within this issue.

xmfcx avatar Apr 18 '25 14:04 xmfcx

Is the frame.context.name a unique identifier for each frame? I noticed some frames repeat when iterating over the tfrecorddataset. Could someone verify this?

souravraha avatar May 07 '25 06:05 souravraha

Is the frame.context.name a unique identifier for each frame? I noticed some frames repeat when iterating over the tfrecorddataset.

@souravraha As I explained in the post above, that UUID is a unique identifier for each scene/segment (a run of about 200 frames). It is not unique for each frame.

xmfcx avatar May 07 '25 07:05 xmfcx

@xmfcx I am talking about the entire frame.context.name ("uuid-seqnum"), not merely the uuid. Here is what I have found: only 415663 (or 54%) of 769849 total frame names are unique. Am I doing something wrong, or has the addition of the new training.tfrecords files (263 in addition to the earlier 315) introduced inconsistencies? Can someone please help out?

E.g. "003b62820d0e9345eb025de35b046999-009" is found in frame number 80686, as well as 448896!
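For anyone who wants to run the same check, here is a minimal sketch of how such a count could be done. It assumes you have already collected every frame.context.name string into a Python list; `duplicate_report` is a hypothetical helper name, not part of any Waymo tooling.

```python
from collections import Counter


def duplicate_report(frame_names):
    """Return (fraction of distinct names, {name: count} for repeated names)."""
    counts = Counter(frame_names)
    duplicates = {name: n for name, n in counts.items() if n > 1}
    # Distinct names divided by total names read; 0.0 for an empty input.
    unique_fraction = len(counts) / len(frame_names) if frame_names else 0.0
    return unique_fraction, duplicates


# Tiny made-up example: one name appears twice.
names = [
    "003b62820d0e9345eb025de35b046999-009",
    "d6cdf6eb1b7d4a8be6dac71f34e6cdb7-164",
    "003b62820d0e9345eb025de35b046999-009",
    "d6cdf6eb1b7d4a8be6dac71f34e6cdb7-165",
]
fraction, dupes = duplicate_report(names)
```

With the numbers quoted above, 415663 distinct names out of 769849 gives a fraction of about 0.54.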

souravraha avatar May 07 '25 08:05 souravraha

@souravraha Back in my time (just before they released this additional data), it had 316 training tfrecords in total.

Now I checked it again, it has 3 sets of tfrecords:

  • 276 test
  • 263 training
  • 93 validation

And the size of each training tfrecord is about 3.5GB.

🧓 In the older training dataset, they had 316 files each approx. 2.3GB.

This means the new training dataset is not an addition to the old one.

You should only be using the new one; the old training dataset is now obsolete.

This also explains why you had duplicates: you shouldn't be mixing those two together.

I did not download the new dataset but these are my assessments from my observations.

xmfcx avatar May 07 '25 20:05 xmfcx

@xmfcx That is plausible. My God, shouldn't they explicitly mention this? A lot of people are simply going to glob with "training" and end up using both sets!

souravraha avatar May 08 '25 03:05 souravraha

I concur with @xmfcx, the new training set should replace the old one

rdesc avatar May 08 '25 03:05 rdesc

Is there any particular reason they didn't sort the sequences? This seems completely inconvenient. For e2e driving, especially vision-only, you will need some sort of temporal history. Now everybody somehow needs to sort 1TB of data from a stream-only data container...

aeon0 avatar Sep 29 '25 04:09 aeon0

e2e_blacklist.txt e2e_seq_frames.txt e2e_seq_info.txt

Maybe these are helpful to others.

  • e2e_seq_info: [seq_id, frame_count]
  • e2e_seq_frames: [seq_id, list[frame_ids]] (sorted)
  • e2e_blacklist: str (all these have some kind of gaps in their frame_ids)
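If you want to rebuild a similar blacklist yourself, one possible check is sketched below. It assumes a "gap" means a missing SeqNum between two consecutive frame ids of a sequence, which is only a guess at the criterion used for the attached file. `find_gaps` is a hypothetical helper and does not read the attached text files.

```python
def find_gaps(seq_nums):
    """Return (previous, next) pairs where consecutive frame ids skip a value."""
    # Deduplicate and sort first, so the input order does not matter.
    ordered = sorted(set(seq_nums))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


# A sequence does not have to start at 0 or 1; only internal jumps count.
gaps = find_gaps([7, 5, 6, 10, 11])
no_gaps = find_gaps([3, 4, 5])
```

A sequence whose `find_gaps` result is non-empty would be a candidate for the blacklist under this assumption.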

aeon0 avatar Sep 29 '25 05:09 aeon0