Do you have models trained and/or evaluated on Chest ImaGenome?

Open PabloMessina opened this issue 2 years ago • 6 comments

Hi, just a very quick question. Chest ImaGenome (https://physionet.org/content/chest-imagenome/1.0.0/) provides very fine-grained labels and bounding boxes for most images in MIMIC-CXR. Do you guys have models trained and/or evaluated using this dataset?

Best regards, Pablo

PabloMessina, Feb 17 '23 15:02

Currently no. But it looks like a very awesome dataset that we should have. Here is a notebook showing the current datasets with mask information: https://github.com/mlmed/torchxrayvision/blob/master/scripts/xray_masks.ipynb

ieee8023, Feb 17 '23 19:02

That's awesome. You already support several datasets with mask and bounding box annotations. The cool thing about Chest ImaGenome is that it comes with bounding boxes for 36 different anatomical locations, plus very fine-grained scene graphs describing frontal chest X-ray images, for 240K+ images of MIMIC-CXR, so it's a very large-scale dataset. You can read more about how they developed the dataset here: https://arxiv.org/pdf/2108.00316.pdf

Here are some papers that have already used Chest ImaGenome, to give you an idea of the things that can be done with it:

  • AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray: https://arxiv.org/pdf/2105.09937.pdf
  • Anatomy Aware Model for Longitudinal Relationships Change in CXRs: https://arxiv.org/pdf/2208.03873.pdf
  • Anatomy-Guided Weakly-Supervised Abnormality Localization in Chest X-rays: https://arxiv.org/pdf/2206.12704.pdf
  • Few-shot Structured Radiology Report Generation Using Natural Language Prompts: https://arxiv.org/pdf/2203.15723.pdf

PabloMessina, Feb 17 '23 23:02

Thanks for that info! Do you already know how to work with the data? Could you help me get started on a dataset class that loads the masks the way the existing datasets do, so we can prepare a PR?

ieee8023, Feb 18 '23 00:02

I've been playing around with the dataset for a while, but I load and post-process the data in my own ad hoc way. I can point you to specific sections of my code if that helps, though. For example:

  • Scripts to read and compute post-processed versions of the annotations: https://github.com/PabloMessina/MedVQA/tree/master/medvqa/scripts/chest_imagenome
  • A bunch of utility functions to work with Chest ImaGenome: https://github.com/PabloMessina/MedVQA/tree/master/medvqa/datasets/chest_imagenome
  • Here I combine data from MIMIC-CXR with data from Chest ImaGenome to create a dataset and a dataloader (plus a bunch of other things). Not the prettiest code: https://github.com/PabloMessina/MedVQA/blob/master/medvqa/datasets/mimiccxr/mimiccxr_vision_dataset_management.py
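
To give a rough idea of the mask part, here is a minimal sketch (not taken from the repositories above) of turning named bounding boxes into per-region binary masks. The `Box` container, the function names, and the convention that `x2`/`y2` are exclusive are all assumptions made for illustration; the actual Chest ImaGenome field names and coordinate conventions should be checked against the dataset documentation. Plain nested lists are used to keep the sketch dependency-free; a real loader would more likely return arrays.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical container)."""
    name: str  # anatomical location label, e.g. "right lung" (illustrative)
    x1: int
    y1: int
    x2: int  # assumed exclusive
    y2: int  # assumed exclusive


def box_to_mask(box: Box, height: int, width: int) -> List[List[int]]:
    """Rasterize one box into a height x width grid of 0/1 values."""
    # Clip to the image so boxes that spill over the border stay valid.
    x1, x2 = max(0, box.x1), min(width, box.x2)
    y1, y2 = max(0, box.y1), min(height, box.y2)
    return [
        [1 if (y1 <= y < y2 and x1 <= x < x2) else 0 for x in range(width)]
        for y in range(height)
    ]


def boxes_to_masks(
    boxes: Iterable[Box], height: int, width: int
) -> Dict[str, List[List[int]]]:
    """Build one binary mask per named box, keyed by the box name."""
    return {b.name: box_to_mask(b, height, width) for b in boxes}
```

A dataset class could call something like `boxes_to_masks` per image and attach the result to each sample. If the images are resized on load, the box coordinates would need to be scaled by the same factor before rasterizing.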

PabloMessina, Feb 18 '23 00:02

This Jupyter notebook might be helpful as well: https://github.com/PabloMessina/MedVQA/blob/master/medvqa/datasets/notebooks/Exploring%20Chest%20ImaGenome.ipynb (Note: it's a work in progress)

PabloMessina, Feb 18 '23 00:02

Hey @ieee8023, just so you know, another paper using the Chest ImaGenome dataset was recently published at CVPR 2023: Interactive and Explainable Region-guided Radiology Report Generation.

  • paper: https://arxiv.org/pdf/2304.08295.pdf
  • code: https://github.com/ttanida/rgrg

PabloMessina, Apr 28 '23 13:04