
Issues on Test Data Release and Rater Feedback Labels for Waymo E2E Dataset

Open Zewei-Zhou opened this issue 7 months ago • 10 comments

Hi, Waymo Team,

We have a few questions about the Waymo E2E dataset:

- Test Data and Submission:

When does your team plan to release the test split of the Waymo E2E Driving dataset? And when will the submission platform open so that we can submit our test results?

- Rater Feedback Labels (Training Set):

How can we access the rater feedback labels for the training set? If not currently released, could you tell us when you plan to make them available?

Zewei-Zhou avatar Apr 18 '25 23:04 Zewei-Zhou

It doesn't seem like we'll be getting rater feedback labels for the train set, although I originally expected that we would too.

rdesc avatar Apr 20 '25 16:04 rdesc

@rdesc have you successfully imported the dataset? If so, what did you do?

Azealoo avatar Apr 20 '25 20:04 Azealoo

I have, yes. If you're asking about downloading the TFRecord files, I got them from here: https://waymo.com/open/challenges/2025/e2e-driving/

If you're asking how to load the dataset in code, I followed the suggestion here: https://github.com/waymo-research/waymo-open-dataset/issues/918#issuecomment-2788265949. I generated a pickle file for each of the 1745 scenarios, and now I use the pickle files for training rather than the TFRecords.
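
A minimal sketch of that kind of conversion, not the exact script from the linked comment. It assumes each frame has already been decoded into a plain dict, and the keys `scenario_id` and `timestamp` are hypothetical names chosen for illustration. Frames are appended to their scenario's file one at a time, so the whole scenario never has to sit in memory during conversion.

```python
import pickle
from pathlib import Path


def append_frame(out_dir, scenario_id, frame):
    """Append one frame to the pickle file of its scenario."""
    path = Path(out_dir) / f"{scenario_id}.pkl"
    with open(path, "ab") as f:
        pickle.dump(frame, f, protocol=pickle.HIGHEST_PROTOCOL)


def convert(frames, out_dir):
    """Split a stream of frame dicts into one pickle file per scenario.

    Note: running this twice on the same out_dir appends duplicates.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    seen = set()
    for frame in frames:
        # "scenario_id" is a hypothetical key, not a Waymo field name.
        append_frame(out_dir, frame["scenario_id"], frame)
        seen.add(frame["scenario_id"])
    return sorted(seen)


def load_scenario(out_dir, scenario_id, sort_key="timestamp"):
    """Load every frame of one scenario, ordered by `sort_key`."""
    frames = []
    with open(Path(out_dir) / f"{scenario_id}.pkl", "rb") as f:
        while True:
            try:
                frames.append(pickle.load(f))
            except EOFError:
                break  # reached the end of the appended pickles
    return sorted(frames, key=lambda fr: fr[sort_key])
```

Sorting on load means the frames of a scenario can arrive in any order during conversion.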

rdesc avatar Apr 21 '25 00:04 rdesc

+1 on this question:

When is your submission platform open?

rdesc avatar Apr 22 '25 23:04 rdesc

I have yes. If you're asking about downloading the tf files I got them from here https://waymo.com/open/challenges/2025/e2e-driving/

If you're asking how to load the dataset in code, I followed the suggestion here #918 (comment) where I generated a pickle file for each of the 1745 scenarios and now I just use the pickle files for training rather than the tf records.

@rdesc Is there a read-speed benefit to using pickles instead of TFRecordDataset? I am trying to wrap the TFRecordDataset in a PyTorch IterableDataset. Would I be better off converting to pickles first?

souravraha avatar Apr 28 '25 09:04 souravraha

To visualize the videos (e.g. https://www.youtube.com/watch?v=qF0qhMhonrA from https://github.com/waymo-research/waymo-open-dataset/issues/924#issuecomment-2815536215), having the pickle files (i.e. 1 pickle file per scenario) is pretty necessary. Otherwise I think it takes way too long to go through the whole TFRecordDataset to find all the frames of a particular scenario.
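
One possible way to avoid rescanning the whole file for each scenario is to build a byte-offset index once and then seek directly to the records of interest. The sketch below relies only on the TFRecord framing (a little-endian uint64 length, a 4-byte CRC, the payload, and another 4-byte CRC) and assumes the files are uncompressed. `scenario_of` is a hypothetical caller-supplied function that parses a raw record and returns its scenario identifier; CRCs are skipped rather than verified.

```python
import struct
from collections import defaultdict


def iter_record_offsets(path):
    """Yield (offset, length) of each record payload in a TFRecord file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # uint64 length + uint32 length-CRC
            if len(header) < 12:
                break
            (length,) = struct.unpack("<Q", header[:8])
            offset = f.tell()
            yield offset, length
            f.seek(offset + length + 4)  # skip payload and its CRC


def build_index(path, scenario_of):
    """Map scenario id -> list of (offset, length), in file order."""
    index = defaultdict(list)
    with open(path, "rb") as f:
        for offset, length in iter_record_offsets(path):
            f.seek(offset)
            index[scenario_of(f.read(length))].append((offset, length))
    return dict(index)


def read_records(path, entries):
    """Yield the raw payloads stored at the given (offset, length) pairs."""
    with open(path, "rb") as f:
        for offset, length in entries:
            f.seek(offset)
            yield f.read(length)
```

The index is small compared with the data, so it could be saved next to each file and reused across runs.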

In terms of training, I think the TFRecordDataset is still the right data type, since loading the pickles into memory is quite slow (roughly 1 GB per scenario). However, I'm still trying to figure out a good solution for using TFRecordDataset while keeping clusters of ordered frames.
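
One possible approach to the ordered-clusters problem, sketched under a strong assumption: frames of the same scenario arrive in temporal order, but may be interleaved with frames of other scenarios. The generator keeps a small buffer per scenario and emits a clip whenever a buffer reaches `clip_len`. The key name `scenario_id` is hypothetical, and frames are assumed to be decoded into dicts.

```python
from collections import defaultdict


def iter_clips(frames, clip_len, key="scenario_id"):
    """Group a frame stream into fixed-length clips of a single scenario.

    Leftover frames that never fill a clip are dropped.
    """
    buffers = defaultdict(list)
    for frame in frames:
        buf = buffers[frame[key]]
        buf.append(frame)
        if len(buf) == clip_len:
            yield list(buf)  # copy, so clearing does not affect the clip
            buf.clear()
```

Memory use is bounded by the number of scenarios seen so far times `clip_len` frames, which is far less than loading a full scenario at once.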

rdesc avatar Apr 28 '25 14:04 rdesc

@rdesc, how many images, for example, do you get per scene for the front camera? I am getting fewer images than reported. I described it here

dukevah avatar May 02 '25 07:05 dukevah

We have already released the validation and test sets. The rater feedback labels are provided in the validation set.

DerrickXuNu avatar May 04 '25 18:05 DerrickXuNu

@DerrickXuNu Cool! Thanks for your support.

Zewei-Zhou avatar May 04 '25 20:05 Zewei-Zhou

@DerrickXuNu I get 3 images for the front, 2 for the side, and 3 for the rear cameras, per frame. Per scene / scenario, I get an average of 200 frames, but some have more (230) and some have fewer (100).
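
For anyone comparing their own counts, a small sketch of how such per-scenario numbers could be tallied. It assumes you can produce one scenario identifier per loaded frame; how that identifier is extracted from a record is left to the caller.

```python
from collections import Counter
from statistics import mean


def frame_count_summary(scenario_ids):
    """Summarize frames per scenario from one scenario id per frame.

    Raises ValueError on an empty input, since min/max are undefined.
    """
    counts = Counter(scenario_ids)
    values = list(counts.values())
    return {
        "scenarios": len(values),
        "min": min(values),
        "mean": mean(values),
        "max": max(values),
    }
```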

rdesc avatar May 04 '25 20:05 rdesc