waymo-open-dataset Missing Map Features in Testing Scenario Proto Dataset

Hello,

While trying to process map features from the Scenario Proto dataset, I found that some test scenarios have no map features. Here is the list of these scenarios:

No.	Scenario ID
1	a869787ec83d5c8a
2	8738e8c0200056fe
3	9465fb15b456855a
4	4c99903f949e8100
5	ff6686b0e98d66ae
6	9036b3f956b09fc0
7	981f5c4f61505759
8	38e2986c8098692f
9	ab7571715a1d5193

The environment I am using:

python==3.10.13
conda==23.11.0
pip==23.3.2
numpy==1.21.5
pandas==1.5.3
tensorflow==2.11.0
torch==2.1.0+cu118
waymo-open-dataset-tf-2-11-0 1.6.1

The code to reproduce the issue:

import os
from pathlib import Path

import tensorflow as tf
from tqdm import tqdm
from waymo_open_dataset.protos import scenario_pb2

DATA_ROOT = str(Path("../data/womd/raw/scenario/").resolve())
TRAIN_FILES = os.path.join(DATA_ROOT, "training", "training.tfrecord*")
VALID_FILES = os.path.join(DATA_ROOT, "validation", "validation.tfrecord*")
TEST_FILES = os.path.join(DATA_ROOT, "testing", "testing.tfrecord*")

# Split being traversed; only used for the progress bar label.
split = "testing"

filenames = tf.io.matching_files(TEST_FILES)
dataset = tf.data.TFRecordDataset(filenames)
# No `total` is passed to tqdm, so the bar shows a running count only.
for data in tqdm(
    dataset.as_numpy_iterator(),
    desc=f"Traversing {split} data",
):
    # Each record is a serialized Scenario proto.
    scenario = scenario_pb2.Scenario.FromString(data)
    if len(scenario.map_features) == 0:
        print(f"Scenario {scenario.scenario_id} has no map features.")
        continue

The dataset version is 1.2.0, which I downloaded from here. Could you help confirm if the files under scenario/testing on the Google Cloud servers are correct? Thanks!

Jan 30 '24 20:01 juanwulu

Adding @scott-ettinger

If I'm not mistaken, this was flagged some time ago and it seems to be a small issue on our side (@ChocolateDave can you please confirm that it's just these 9 Scenarios?). We'll look into fixing that extraction error or just discarding those examples in the upcoming release.

Thanks for flagging!

Feb 01 '24 14:02 nicomon24

Thanks for the response, @nicomon24.

I confirm that these nine are the only scenarios with missing map features in the testing dataset. I double-checked by running the code on both dataset versions, 1.1.0 and 1.2.0.

But it seems there are also missing cases in the testing-interactive scenarios:

No.	Scenario ID	Also missing map features in `testing`
1	e1f412d402676e57
2	9e0ed12773f813eb
3	ff6686b0e98d66ae	✓
4	981f5c4f61505759	✓
5	664537fc3819c08a
6	b6f042d4029a0297
7	1ebe3d70cb05a381
8	ab7571715a1d5193	✓
9	5a99e6200deb4792
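
The overlap between the two lists can be checked with a set intersection. This is a minimal sketch that uses only the scenario IDs listed in this thread; the variable names are illustrative and not part of the dataset tooling.

```python
# Scenario IDs reported in this thread as having no map features.
testing_missing = {
    "a869787ec83d5c8a", "8738e8c0200056fe", "9465fb15b456855a",
    "4c99903f949e8100", "ff6686b0e98d66ae", "9036b3f956b09fc0",
    "981f5c4f61505759", "38e2986c8098692f", "ab7571715a1d5193",
}
interactive_missing = {
    "e1f412d402676e57", "9e0ed12773f813eb", "ff6686b0e98d66ae",
    "981f5c4f61505759", "664537fc3819c08a", "b6f042d4029a0297",
    "1ebe3d70cb05a381", "ab7571715a1d5193", "5a99e6200deb4792",
}

# IDs that appear in both lists (the rows marked with a check mark).
in_both = sorted(testing_missing & interactive_missing)
# IDs reported only for the testing-interactive split.
only_interactive = sorted(interactive_missing - testing_missing)
# All distinct affected IDs across the two splits.
all_missing = testing_missing | interactive_missing

print(in_both)
print(len(only_interactive), len(all_missing))
```

Three IDs are shared between the two lists, which leaves 15 distinct affected scenarios across both splits.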

Feb 01 '24 16:02 juanwulu

Thanks for flagging this.

Feb 01 '24 18:02 scott-ettinger

@nicomon24

Sorry, how should we deal with this when preparing a submission for the test set? Should we just skip those scenarios?

Apr 10 '24 17:04 pengzhenghao

@pengzhenghao For sim agents, yes: these are not in the test set, so you can safely skip them. For motion, I need to check with Scott.

Apr 11 '24 08:04 nicomon24
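
For the sim agents case, skipping the affected scenarios could look roughly like the sketch below. It assumes each item exposes `scenario_id` and `map_features` the way the `Scenario` proto in the reproduction code does; the helper name `split_by_map_features` and the stand-in objects are hypothetical, and whether the same skip is appropriate for motion submissions was not confirmed in this thread.

```python
from types import SimpleNamespace


def split_by_map_features(scenarios):
    """Separate scenarios that have map features from those that do not.

    Hypothetical helper: returns the usable scenarios and the IDs of the
    skipped ones so the skipped IDs can be logged or reviewed.
    """
    usable, skipped_ids = [], []
    for scenario in scenarios:
        if len(scenario.map_features) == 0:
            skipped_ids.append(scenario.scenario_id)
        else:
            usable.append(scenario)
    return usable, skipped_ids


# Stand-in objects for illustration only; real code would pass parsed
# Scenario protos instead.
demo = [
    SimpleNamespace(scenario_id="a869787ec83d5c8a", map_features=[]),
    SimpleNamespace(scenario_id="example_with_map", map_features=["lane"]),
]
usable, skipped_ids = split_by_map_features(demo)
print(f"Skipped {len(skipped_ids)} scenario(s): {skipped_ids}")
```

Detecting empty `map_features` at runtime, rather than hard-coding the ID list, keeps the check valid if a later dataset release fixes or drops these scenarios.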
