lightning-Covid19 data augmentation

Open Borda opened this issue 4 years ago • 14 comments

Add reasonable image augmentation

horizontal/vertical flip
rotation
zoom
etc

Mar 17 '20 10:03 Borda

cool, we have all this in kornia

Mar 17 '20 12:03 edgarriba

/cc @ducha-aiki @shijianjian @anguelos

Mar 17 '20 13:03 edgarriba

Horizontal flip looks like a suitable augmentation. I'm not completely sure whether vertical flip/rotation introduces interesting priors, as X-rays are usually similarly oriented.

Mar 17 '20 13:03 shpotes

Rotation limited to small angles, I guess, yes.

Mar 17 '20 13:03 edgarriba

> Horizontal flip looks like a suitable augmentation. I'm not completely sure whether vertical flip/rotation introduces interesting priors, as X-rays are usually similarly oriented.

Unless we assume that the object looks different vertically... but this could just be an extra training parameter, right?

Mar 17 '20 13:03 Borda

Where is the data in the first place? There is no link in the README.

Mar 17 '20 14:03 bluesky314

The data link is here: https://github.com/PyTorchLightning/lightning-Covid19/issues/2

Mar 17 '20 14:03 ducha-aiki

In Figure 3 of the Chester paper, we can see that for pneumonia specifically, augmentation might even hurt. If I am reading the plots right, the first column (undistorted test set) seems the most important. It seems that modest rotation, scale, and translation is the best augmentation: 15°, 10%, and 10% respectively.
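A minimal sketch of what "modest rotation, scale, and translation" could look like in code, using only the Python standard library. It is not taken from the thread or from kornia: the function names `sample_affine_params` and `affine_matrix` are hypothetical, and reading the 10% figures as a scale range of 0.9 to 1.1 and a shift of up to 10% of the image size is an assumption.

```python
import math
import random


def sample_affine_params(rng, max_deg=15.0, max_scale=0.10, max_shift=0.10):
    """Draw one random rotation (degrees), scale factor, and x/y shift.

    Shifts are expressed as a fraction of the image width/height.
    The default limits mirror the 15 deg / 10% / 10% mentioned above (assumed reading).
    """
    angle = rng.uniform(-max_deg, max_deg)
    scale = rng.uniform(1.0 - max_scale, 1.0 + max_scale)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    return angle, scale, tx, ty


def affine_matrix(angle, scale, tx, ty, width, height):
    """Build a 2x3 matrix: rotate and scale about the image centre, then shift."""
    cx, cy = width / 2.0, height / 2.0
    a = scale * math.cos(math.radians(angle))
    b = scale * math.sin(math.radians(angle))
    # For a point p and centre c: p' = s * R * (p - c) + c + t
    return [
        [a, -b, cx - a * cx + b * cy + tx * width],
        [b, a, cy - b * cx - a * cy + ty * height],
    ]


# Example: one random transform for a hypothetical 224x224 image.
rng = random.Random(0)
params = sample_affine_params(rng)
matrix = affine_matrix(*params, width=224, height=224)
```

The resulting matrix can be handed to whatever image-warping routine the project ends up using. Keeping the limits as parameters makes it easy to compare against no augmentation at all, which matters given the concern that augmentation may hurt for pneumonia.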

Mar 17 '20 15:03 anguelos

Generally, I think it should be alright as long as the label will not be changed by augmentation methods. For instance, ElasticTransform is probably a dangerous move. It would be best if we can invite a chest CT expert for more guidance.

If I understand this right, this project aims to distinguish Covid-19 from other pneumonia pathologies like SARS, etc. Thus, we also need more support in understanding the pathology, so that the preprocessing phase and augmentations emphasize the most correlated features. From a clinical perspective, I think it would also help to know how CT experts make their decisions.

Mar 17 '20 18:03 shijianjian

Just noticed that the image values come in a range of roughly ±1000.

Mar 27 '20 16:03 edgarriba

It is quite common for medical images, as they can also be stored as TIFF with some offset :]

Mar 27 '20 16:03 Borda

Gotcha. And do we want that for training? https://github.com/mlmed/torchxrayvision/blob/master/torchxrayvision/datasets.py#L47-L51

I think the dataset generator can be improved somehow

Mar 27 '20 17:03 edgarriba

I think that we should scale them anyway with the mean and std to roughly the (-1, 1) interval.
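A minimal sketch of mean/std scaling, using only the Python standard library, with a flat list standing in for an image. The helper name `standardize` and the sample values are hypothetical; in the project itself this would more likely be a tensor operation.

```python
import statistics


def standardize(pixels):
    """Shift values to zero mean and scale them to unit standard deviation."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)
    if std == 0:
        # A constant image carries no contrast; avoid dividing by zero.
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]


# Made-up intensities spanning roughly +-1000, as noted in the thread.
raw = [-1000.0, -500.0, 0.0, 500.0, 1000.0]
scaled = standardize(raw)
```

Note that standardization only brings most values close to (-1, 1); extreme pixels can still fall outside that interval, so a hard bound would need clipping or min/max scaling instead.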

Mar 27 '20 18:03 Borda

@Borda sure. Apparently the images in this dataset are in png, jpg and jpeg, so my guess is that there is no need to apply an initial conversion. Please also check my comment here: https://github.com/PyTorchLightning/lightning-Covid19/pull/18#discussion_r399652347

Not sure what would be best. My guess is that it would be best to analyze the whole image and create some kind of attention so as not to miss any part.

Mar 28 '20 11:03 edgarriba
