
Training and inference on new data for lung nodule detection in CT

Open Thibescobar opened this issue 1 year ago • 6 comments

Hello, I followed the tutorial and the model zoo README to reproduce lung nodule detection on the LUNA16 dataset, but I find the bundle to be specific to this dataset.

Is there a way to train and infer on new data easily?

Thibescobar avatar Apr 03 '24 12:04 Thibescobar

Similar question here. Are you using 3D Slicer for inference and training? I want to know how to submit the labels to the server. In DeepEdit mode we can just submit the segmentation for training, but submitting the .JSON markups does not seem to work.

longchingkwok331 avatar Apr 06 '24 06:04 longchingkwok331

Have you figured this out?

junxiant avatar Jun 04 '24 10:06 junxiant

@Can-Zhao @yiheng-wang-nv I think you were both involved with putting this bundle together, would you have some insights here? Thanks!

ericspod avatar Jun 04 '24 14:06 ericspod

Hi @Can-Zhao , could you give some suggestions?

yiheng-wang-nv avatar Jul 16 '24 07:07 yiheng-wang-nv

I have tried training on new data, so maybe I can share some notes.

If we follow the detection tutorial, these changes are needed:

  1. Data needs to be in .nii.gz format, and the spacing has to be consistent across all data.
  2. In the config JSON file, modify the parameters as required. Reduce patch_size or batch_size if GPU memory is a constraint.
  3. The new dataset's labelling JSON file needs to follow the format of the original dataset JSON file.
  4. In luna16_prepare_env_files.py, change raw_data_base_dir and resampled_data_base_dir to the new data folder location, and change downloaded_datasplit_dir to point to the new dataset's labelling JSON file. Modify the for loop depending on whether you are using cross-validation folds, and modify the output names as required.
  5. In luna16_training.py, modify the classes parameter of COCOMetric to suit the new dataset's classes.
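
As a rough illustration of step 3, here is a minimal sketch that checks a new split file before training. It assumes the layout I recall from the tutorial's LUNA16 split files: top-level "training" and "validation" lists, where each entry has an "image" path, a "box" list of 6-value boxes, and a matching "label" list. Please compare against the original JSON file before relying on it. The function name and the example paths are made up for illustration.

```python
import json


def validate_datasplit(split):
    """Return a list of problems found; an empty list means the layout looks right.

    Assumed layout (verify against the original dataset JSON file):
    {"training": [{"image": str, "box": [[6 numbers], ...], "label": [int, ...]}, ...],
     "validation": [...]}
    """
    problems = []
    for section in ("training", "validation"):
        entries = split.get(section)
        if not isinstance(entries, list):
            problems.append(f"missing or non-list section: {section}")
            continue
        for i, entry in enumerate(entries):
            where = f"{section}[{i}]"
            if not isinstance(entry, dict):
                problems.append(f"{where}: entry must be a JSON object")
                continue
            if not isinstance(entry.get("image"), str):
                problems.append(f"{where}: 'image' must be a path string")
            boxes = entry.get("box", [])
            labels = entry.get("label", [])
            # Every box needs exactly one class label.
            if len(boxes) != len(labels):
                problems.append(f"{where}: {len(boxes)} boxes but {len(labels)} labels")
            for box in boxes:
                if len(box) != 6:
                    problems.append(f"{where}: box {box} does not have 6 values")
    return problems


# Hypothetical entries, only to show the shape being checked.
example = {
    "training": [
        {"image": "case_001/case_001.nii.gz",
         "box": [[10.0, -20.5, 30.0, 6.0, 6.0, 6.0]],
         "label": [0]},
    ],
    "validation": [
        {"image": "case_002/case_002.nii.gz", "box": [], "label": []},
    ],
}
print(validate_datasplit(example))
```

For a real file, load it with `json.load` and pass the resulting dict to `validate_datasplit`.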

Actually, I think there are a lot of things to change when training on a new dataset that is not LUNA16.

If you follow the model zoo's README instead, you need to modify all the files in the config folder.
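
For the spacing requirement in step 1, a quick check can be sketched with only the Python standard library. This is not part of the tutorial or the bundle. It assumes single-file NIfTI-1 images, where the 348-byte header stores the voxel spacing in the `pixdim` field at byte offset 76. The function names and the tolerance are my own choices.

```python
import gzip
import struct
from pathlib import Path


def read_nifti1_spacing(path):
    """Return the (x, y, z) voxel spacing from a NIfTI-1 header (.nii or .nii.gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError(f"{path}: file too short for a NIfTI-1 header")
    # sizeof_hdr (first 4 bytes) equals 348 when read in the file's byte order.
    if struct.unpack("<i", header[:4])[0] == 348:
        order = "<"
    elif struct.unpack(">i", header[:4])[0] == 348:
        order = ">"
    else:
        raise ValueError(f"{path}: not a NIfTI-1 header")
    # pixdim is 8 float32 values at offset 76; entries 1..3 are the x, y, z spacing.
    pixdim = struct.unpack(order + "8f", header[76:108])
    return pixdim[1:4]


def find_spacing_mismatches(paths, tol=1e-3):
    """Map each file whose spacing differs from the first file's to its spacing."""
    paths = list(paths)
    if not paths:
        return {}
    reference = read_nifti1_spacing(paths[0])
    mismatches = {}
    for p in paths[1:]:
        spacing = read_nifti1_spacing(p)
        if any(abs(a - b) > tol for a, b in zip(spacing, reference)):
            mismatches[str(p)] = spacing
    return mismatches
```

For example, `find_spacing_mismatches(sorted(Path("my_data").rglob("*.nii.gz")))` lists the files that would need resampling, where `my_data` is a placeholder for your own data folder.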

junxiant avatar Jul 18 '24 11:07 junxiant

That's very helpful! Thank you! I think we should add these to the README file. @yiheng-wang-nv What do you think?

Can-Zhao avatar Jul 18 '24 17:07 Can-Zhao