
Image Segmentation: Data Preprocessing Verification -- Checksum fails

nmcglo opened this issue 2 years ago • 4 comments

The Image Segmentation (PyTorch UNet3D) benchmark relies on the KITS19 dataset. I've followed the instructions from the KITS19 dataset repository for downloading the dataset, and I've been trying to run the data preprocessing script (https://github.com/mlcommons/training/blob/master/image_segmentation/pytorch/preprocess_dataset.py).

All of the cases preprocess just fine, but I get an error when the verify_dataset() function is called: at least one of the cases (case 00043 specifically) has an MD5 checksum that does not match the expected value in the MLCommons image segmentation repo (https://github.com/mlcommons/training/blob/master/image_segmentation/pytorch/checksum.json). I haven't exhaustively checked every file, but when I compute the MD5 hashes myself, a random sample of 10 or so all match the expected values, while the hash for case 00043 does not.

I have downloaded the dataset using both download scripts a total of 7 times and get exactly the same invalid checksum each time, so it isn't a corrupted download (at least not on my end).
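
For reference, this is roughly the check I'm running by hand; a minimal sketch, assuming checksum.json maps preprocessed filenames (e.g. case_00043_x.npy) to MD5 hex digests as in the linked file, and with "results_dir" standing in for my preprocessing output directory:

import hashlib
import json
import os

def md5_of(path, chunk_size=1 << 20):
    # Hash the file in 1 MiB chunks so large volumes don't need to fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

with open("checksum.json") as f:
    reference = json.load(f)

# Compare one preprocessed case against the reference checksum.
volume = "case_00043_x.npy"  # the case that fails for me
actual = md5_of(os.path.join("results_dir", volume))
print(volume, "matches" if actual == reference[volume]
      else f"mismatch: got {actual}, expected {reference[volume]}")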

nmcglo avatar Apr 19 '22 21:04 nmcglo

Hi @nmcglohon, is this still a problem for you? I'll take a look and try to repro early next week.

mmarcinkiewicz avatar Nov 04 '22 16:11 mmarcinkiewicz

I am able to repro. I'll reach out to the dataset owners to ask whether anything has changed.

mmarcinkiewicz avatar Nov 15 '22 08:11 mmarcinkiewicz

Thanks, apologies for the delay in response - I was away last month.

nmcglo avatar Dec 02 '22 18:12 nmcglo

I have the same problem: I get an error when the verify_dataset() function is called. Has this issue been resolved? Or can I skip the function?
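
(In case it helps: one unofficial way to skip it is to guard the call at the bottom of preprocess_dataset.py. The --skip_verify flag below is hypothetical and would have to be added via argparse yourself, and skipping means accepting the preprocessed data unverified.)

# Hypothetical guard in preprocess_dataset.py, not part of the official script:
# add a --skip_verify argparse flag, then wrap the existing verification call.
if not args.skip_verify:
    verify_dataset(args.results_dir)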

sepzjh avatar Sep 24 '23 13:09 sepzjh

Closing because we are dropping UNet3D

hiwotadese avatar Jul 25 '24 16:07 hiwotadese

I have run into this error as well during dataset verification:

Case 299. Skipped.
Mean value: -1.850000023841858, std: 0.9800000190734863, d: 256.0, h: 333.0, w: 333.0
  0%|▊                                                                                                                                                                                | 2/420 [00:00<01:08,  6.12it/s]
Traceback (most recent call last):
  File "preprocess_dataset.py", line 147, in <module>
    verify_dataset(args.results_dir)
  File "preprocess_dataset.py", line 132, in verify_dataset
    assert md5_hash == source[volume], f"Invalid hash for {volume}."
AssertionError: Invalid hash for case_00183_x.npy.

This time it's case_00183_x.npy.
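
Since the assert aborts on the first bad file, a small script like the following can list every mismatch in one pass; a sketch assuming the checksum.json layout from the repo, with "results_dir" as a placeholder for the --results_dir output directory:

import hashlib
import json
import os

results_dir = "results_dir"  # placeholder for the preprocessing output directory

with open("checksum.json") as f:
    reference = json.load(f)

# Hash every file listed in checksum.json and collect all mismatches,
# rather than aborting on the first failed assertion.
mismatches = []
for volume, expected in reference.items():
    h = hashlib.md5()
    with open(os.path.join(results_dir, volume), "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != expected:
        mismatches.append(volume)

print(f"{len(mismatches)} mismatching file(s):", mismatches)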

@hiwotadese Can I please ask why UNet3D is being dropped? Which part of the MLCommons WG works on training and inference?

wahabk avatar Jul 30 '24 10:07 wahabk

Multiple reasons were taken into consideration before dropping unet3d from the training benchmark suite. In case you are interested, the training WG meets weekly, and decisions about which benchmarks to keep and which to drop are discussed in that forum.

This table lists all the current benchmarks for Training v4.1.

Note that unet3d is still part of the inference benchmark suite, as listed in this table.

ShriyaPalsamudram avatar Jul 31 '24 14:07 ShriyaPalsamudram