
IDC Slim integration regression testing

Open fedorov opened this issue 3 years ago • 17 comments

We need to define what manual regression testing should be performed for Slim integrated with IDC, and which studies/series should be confirmed to work on each IDC release.

This is the document we've been using to drive OHIF Viewer testing: https://docs.google.com/document/d/1l0RP3H6D9OCI3J2YubzFidBszu1NvB9lhUcFflnWNLc/edit. We should have something similar for Slim.

cc: @pgundluru @dclunie

fedorov avatar Jan 31 '22 18:01 fedorov

We should select a diverse set of images with different transfer syntaxes. Even images with the same transfer syntax may be more or less complex due to different codec parameters such as color space, channel subsampling ratios, etc. @dclunie could you help us put together a list of "tricky" images?

hackermd avatar Jan 31 '22 18:01 hackermd

@hackermd @dclunie if you can put together the list of DICOM attributes we should "sample", I can do the queries to come up with the specific representative studies/series

fedorov avatar Jan 31 '22 18:01 fedorov

@fedorov I suggest the following criteria:

  • Transfer Syntax UID: I am not sure you have that indexed in the database, since it is not part of a Data Set (but rather the File Meta Information). In addition, we may want to consider coding parameters that are not captured by DICOM attributes, but can be found in the header of the JPEG or JPEG 2000 bitstreams. Since @dclunie has performed the conversions, he may have insight into this information.

  • Number of Study-related Series: Note that this is an attribute included in DICOMweb search results. Not sure how you would query for that in the database. It would be useful to use studies for testing that contain more than one series (i.e., more than one digital slide) so that we can test the ability of Slim to switch between slides and update the UI accordingly.

  • Manufacturer Model Name and Software Versions: If possible, we should cover a range of different scanners with different software versions.

  • Clinical Trial Protocol ID: We probably want images from different collections in the test set.

hackermd avatar Jan 31 '22 19:01 hackermd

@hackermd @dclunie I made a dashboard to explore those aspects, and also include links to the current selection at the study and series level: https://datastudio.google.com/reporting/9c65802e-979b-4965-8b90-3bf4e2bcc32e

fedorov avatar Jan 31 '22 21:01 fedorov

@hackermd thank you for the explanation today that some of the metadata that might be important for defining test instances is hiding in the JPEG header, and is not available in DICOM metadata.

Should we consider extracting those relevant attributes as part of our ETL process, and including them in some auxiliary table to facilitate that aspect of data exploration? We may not even expose those to the users, but at least have them handy to help with testing.

fedorov avatar Feb 01 '22 20:02 fedorov

That would probably be a good idea. We could extract the information from the header of the first frame and include it in the table as a JSON string.
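A minimal sketch of what that extraction could look like for a baseline JPEG frame, using only the Python standard library. It assumes the compressed bytes of the first frame have already been pulled out of the DICOM Pixel Data; the function name `jpeg_frame_header` and the JSON field names are hypothetical, and JPEG 2000 codestreams would need a separate parser.

```python
import json
import struct

# Start-of-frame markers: 0xC0-0xCF except DHT (0xC4), JPG (0xC8), DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers that stand alone, with no length field or payload.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def jpeg_frame_header(bitstream: bytes) -> dict:
    """Return the parameters stored in the first SOF segment of a JPEG bitstream."""
    if bitstream[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG bitstream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(bitstream):
        if bitstream[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = bitstream[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        (length,) = struct.unpack_from(">H", bitstream, pos + 2)
        if marker in SOF_MARKERS:
            precision, rows, cols, count = struct.unpack_from(">BHHB", bitstream, pos + 4)
            components = []
            for i in range(count):
                cid, hv, _tq = struct.unpack_from(">BBB", bitstream, pos + 10 + 3 * i)
                # High nibble is the horizontal sampling factor, low nibble the vertical.
                components.append({"id": cid, "h": hv >> 4, "v": hv & 0x0F})
            return {
                "sof_marker": f"FF{marker:02X}",
                "precision": precision,
                "rows": rows,
                "columns": cols,
                "components": components,
            }
        if marker == 0xDA:  # start of scan: the frame header would have come earlier
            break
        pos += 2 + length  # the length field counts itself but not the marker
    raise ValueError("no SOF marker found")


# Synthetic header (not real image data): SOI, an APP0 segment, then SOF0
# describing a 256x240, 8-bit, three-component frame with 4:2:0 subsampling.
demo = (
    b"\xff\xd8"
    + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
    + b"\xff\xc0" + struct.pack(">HBHHB", 17, 8, 240, 256, 3)
    + bytes([1, 0x22, 0, 2, 0x11, 1, 3, 0x11, 1])
)
demo_json = json.dumps(jpeg_frame_header(demo), sort_keys=True)
print(demo_json)
```

The resulting JSON string could be stored as-is in an auxiliary column, which keeps the table schema stable while the set of interesting codec parameters is still being worked out.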

hackermd avatar Feb 01 '22 20:02 hackermd

Do you have tools or instructions for how to do this? How do we proceed?

Looping in Bill Clifford (@bcli4d), since he is handling the IDC ETL.

fedorov avatar Feb 01 '22 21:02 fedorov

@bcli4d how have you implemented the IDC ETL? What programming language do you use?

hackermd avatar Feb 01 '22 22:02 hackermd

Markus, we can probably experiment with extraction in Google Colab using the tools you have right now, use the result to define the initial regression testing samples, and based on that experience decide how to integrate this into ETL. I don't think we need to modify the ETL process yet. I added Bill to get his thoughts and help with planning.

fedorov avatar Feb 01 '22 22:02 fedorov

Since we don't actually know what the problem is, we don't know what J2K bitstream metadata we need. I suggest you hold off until we find a signature for one of the problem cases. It may be sufficient to use information from the SVS TIFF ImageDescription tag, which describes some aspects of the codec used and which I have copied into ImageComments.
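A rough sketch of how the ImageComments value could be split into fields for sampling, assuming it keeps the pipe-delimited layout that Aperio SVS files typically use for ImageDescription (a free-text header followed by `key = value` pairs). The function name and the sample string are hypothetical and are not taken from any IDC instance.

```python
import re


def parse_image_description(text: str) -> dict:
    """Split a pipe-delimited description into a header and key/value fields."""
    head, *pairs = [part.strip() for part in text.split("|")]
    fields = {"header": head}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep:  # ignore segments that are not key/value pairs
            fields[key.strip()] = value.strip()
    # The compression quality, when present, appears as "Q=<number>" in the header.
    quality = re.search(r"\bQ=(\d+)", head)
    if quality:
        fields["quality"] = int(quality.group(1))
    return fields


# Hypothetical sample value, for illustration only.
sample = "Aperio Image Library v12.0.15\n46000x32914 (256x256) JPEG/RGB Q=30|AppMag = 20|MPP = 0.4990"
parsed = parse_image_description(sample)
print(parsed)
```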

dclunie avatar Feb 01 '22 22:02 dclunie

I added ImageComments to the dashboard.

fedorov avatar Feb 01 '22 22:02 fedorov

@hackermd: ETL is Python (plus some SQL).

bcli4d avatar Feb 01 '22 23:02 bcli4d

@hackermd - @pgundluru is in the process of testing the upcoming IDC release. Since we do not have the regression steps, please let us know if we should do anything beyond what we did so far to debug the JPEG2000 issue with 0.4.5.

fedorov avatar Feb 03 '22 21:02 fedorov

@hackermd, following up on Thursday's discussion, here's the list of all combinations of SoftwareVersions and TransferSyntaxUID for the SM series we have in IDC right now. Let me know if this is what you had in mind, or if you want me to give you a list that samples some other attribute.

TransferSyntaxUID,SoftwareVersions_str,slim_url
1.2.840.10008.1.2.4.50,v12.0.15/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.193279499701610990504788547870106775285/series/1.3.6.1.4.1.5962.99.1.215389419.1022426870.1640892929259.2.0
1.2.840.10008.1.2.1,vFS90 01/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.74148830892982664081128985444965701745/series/1.3.6.1.4.1.5962.99.1.267838992.1699372767.1640945378832.2.0
1.2.840.10008.1.2.1,v12.0.15/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.193279499701610990504788547870106775285/series/1.3.6.1.4.1.5962.99.1.215389419.1022426870.1640892929259.2.0
1.2.840.10008.1.2.4.91,v12.4.0/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.241119208575412892095941144558857582708/series/1.3.6.1.4.1.5962.99.1.241357189.1829878970.1640918897029.2.0
1.2.840.10008.1.2.4.50,vFS90 01/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.52692399237314253327736100986240928861/series/1.3.6.1.4.1.5962.99.1.155609303.645601545.1640833149143.2.0
1.2.840.10008.1.2.1,v12.0.11/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.16181252012499544165879445836446987048/series/1.3.6.1.4.1.5962.99.1.208290886.1784869798.1640885830726.2.0
1.2.840.10008.1.2.4.91,vFS90 01/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.74148830892982664081128985444965701745/series/1.3.6.1.4.1.5962.99.1.267838992.1699372767.1640945378832.2.0
1.2.840.10008.1.2.4.91,v12.0.15/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.62785078102556377485575783168690471946/series/1.3.6.1.4.1.5962.99.1.263683866.789827578.1640941223706.2.0
1.2.840.10008.1.2.4.50,v12.0.11/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.16181252012499544165879445836446987048/series/1.3.6.1.4.1.5962.99.1.208290886.1784869798.1640885830726.2.0
1.2.840.10008.1.2.1,v12.4.0/Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.241119208575412892095941144558857582708/series/1.3.6.1.4.1.5962.99.1.241357189.1829878970.1640918897029.2.0
1.2.840.10008.1.2.4.50,Sat Nov 20 10:02:55 EST 2021,https://dev-viewer.canceridc.dev/slim/studies/2.25.64952420005001016100439665313692637663/series/1.3.6.1.4.1.5962.99.1.237069029.1022471484.1640914608869.2.0

This is the query I used to get the above:

SELECT
  TransferSyntaxUID,
  ARRAY_TO_STRING(SoftwareVersions,'/') AS SoftwareVersions_str,
  ANY_VALUE(CONCAT("https://dev-viewer.canceridc.dev/slim/studies/",StudyInstanceUID,'/series/',SeriesInstanceUID)) AS slim_url
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  Modality="SM"
GROUP BY
  TransferSyntaxUID,
  SoftwareVersions_str
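To turn a result like this into something a tester can walk through, the rows could be grouped by transfer syntax with readable names attached. The following is a sketch only: `build_checklist` is a hypothetical helper, the input is assumed to be the query result exported as CSV with the three columns above, and the two sample rows are copied from the list in this comment.

```python
import csv
import io
from collections import defaultdict

# Standard DICOM names for the three transfer syntaxes that appear in the result.
TRANSFER_SYNTAX_NAMES = {
    "1.2.840.10008.1.2.1": "Explicit VR Little Endian",
    "1.2.840.10008.1.2.4.50": "JPEG Baseline (Process 1)",
    "1.2.840.10008.1.2.4.91": "JPEG 2000 Image Compression",
}


def build_checklist(csv_text: str) -> dict:
    """Group (software version, viewer URL) pairs by transfer syntax name."""
    checklist = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        uid = row["TransferSyntaxUID"]
        name = TRANSFER_SYNTAX_NAMES.get(uid, f"unknown ({uid})")
        checklist[name].append((row["SoftwareVersions_str"], row["slim_url"]))
    return dict(checklist)


sample_csv = "\n".join([
    "TransferSyntaxUID,SoftwareVersions_str,slim_url",
    "1.2.840.10008.1.2.4.50,v12.0.15/Sat Nov 20 10:02:55 EST 2021,"
    "https://dev-viewer.canceridc.dev/slim/studies/2.25.193279499701610990504788547870106775285"
    "/series/1.3.6.1.4.1.5962.99.1.215389419.1022426870.1640892929259.2.0",
    "1.2.840.10008.1.2.4.91,v12.4.0/Sat Nov 20 10:02:55 EST 2021,"
    "https://dev-viewer.canceridc.dev/slim/studies/2.25.241119208575412892095941144558857582708"
    "/series/1.3.6.1.4.1.5962.99.1.241357189.1829878970.1640918897029.2.0",
])

checklist = build_checklist(sample_csv)
for syntax, entries in checklist.items():
    for versions, url in entries:
        print(f"[ ] {syntax} | {versions} | {url}")
```

Note that the query uses ANY_VALUE, so the URL returned for each combination is an arbitrary representative and may differ between runs; pinning the chosen series in a test inventory would keep the checklist stable across releases.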

fedorov avatar Feb 12 '22 03:02 fedorov

@hackermd other than checking the URLs above, is there anything else we should do for regression testing of the IDC Slim 0.5.0 instance?

fedorov avatar Mar 17 '22 20:03 fedorov

I would just make sure that we also assert that the metadata is displayed correctly.

hackermd avatar Mar 22 '22 16:03 hackermd

I need to reconcile the list above with the test inventory in https://docs.google.com/spreadsheets/d/12n1AWUynEFPatmNUphyW9IhOHaePN2jNUrzzjLunJtg/edit#gid=0, and coordinate with Poojitha to incorporate this into her testing process.

fedorov avatar May 04 '23 21:05 fedorov