pdf2dcm
Python Package for PDF to DICOM Conversion
PDF to DICOM Converter
A Python package for converting PDF files to Encapsulated DCM and to DICOM RGB Secondary Capture
SETUP
Python Package Setup
The Python package is available on PyPI and can be installed with pip:
pip install pdf2dcm
To check the setup, print the version number of the pdf2dcm package:
python -c 'import pdf2dcm; print(pdf2dcm.__version__)'
Poppler Setup
Poppler is a popular project that pdf2dcm uses to create DICOM RGB Secondary Capture files. You can check whether it is already installed by running pdftoppm -h in your terminal/cmd. Some recommended ways to install Poppler:
Conda
conda install -c conda-forge poppler
Ubuntu
sudo apt-get install poppler-utils
MacOS
brew install poppler
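The pdftoppm check described above can also be done from Python before attempting a conversion. The following is a minimal sketch that uses only the standard library; the helper name poppler_available is hypothetical and is not part of pdf2dcm.

```python
import shutil


def poppler_available(tool: str = "pdftoppm") -> bool:
    """Return True if the given Poppler command-line tool is found on PATH."""
    return shutil.which(tool) is not None


# Report the result instead of failing later inside a conversion.
if poppler_available():
    print("Poppler found: pdftoppm is on PATH")
else:
    print("Poppler not found: install it before using Pdf2RgbSC")
```

This only confirms that the executable can be located; it does not verify the Poppler version.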
PDF to Encapsulated DCM
Usage
from pdf2dcm import Pdf2EncapsDCM
converter = Pdf2EncapsDCM()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file.dcm' ]
Parameters of converter.run:
- path_pdf (str): path of the PDF that needs to be encapsulated
- path_template_dcm (str, optional): path to the template DICOM from which repersonalisation data is copied
- suffix (str, optional): suffix of the DICOM files. Defaults to ".dcm".
Returns:
- List[Path]: list of paths of the stored encapsulated DCM files
PDF to RGB Secondary Capture DCM
Usage
from pdf2dcm import Pdf2RgbSC
converter = Pdf2RgbSC()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file_0.dcm', 'tests/test_data/test_file_1.dcm' ]
Parameters of converter.run:
- path_pdf (str): path of the PDF that needs to be converted
- path_template_dcm (str, optional): path to the template DICOM from which repersonalisation data is copied
- suffix (str, optional): suffix of the DICOM files. Defaults to ".dcm".
Returns:
- List[Path]: list of paths of the stored secondary capture DCM files
Notes
- The output DICOM has the same name as the input PDF (RGB Secondary Capture outputs additionally carry a page index, as in the example above)
- If no template is provided, no repersonalisation takes place
- To produce DICOMs without a suffix, pass suffix="" to converter.run()
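The naming rules in these notes can be sketched with pathlib. This is an illustration inferred from the example outputs shown in the usage sections, not the package's own code: the helper names are hypothetical, and the zero-based page index and the output location next to the input PDF are assumptions read off those examples.

```python
from pathlib import Path
from typing import List


def expected_encapsulated_path(path_pdf: str, suffix: str = ".dcm") -> Path:
    """Same directory and stem as the PDF, with the PDF extension swapped for suffix."""
    pdf = Path(path_pdf)
    return pdf.with_name(pdf.stem + suffix)


def expected_rgb_paths(path_pdf: str, n_pages: int, suffix: str = ".dcm") -> List[Path]:
    """One file per page, with an assumed zero-based page index appended to the stem."""
    pdf = Path(path_pdf)
    return [pdf.with_name(f"{pdf.stem}_{i}{suffix}") for i in range(n_pages)]


print(expected_encapsulated_path("tests/test_data/test_file.pdf"))
print(expected_rgb_paths("tests/test_data/test_file.pdf", n_pages=2))
print(expected_encapsulated_path("tests/test_data/test_file.pdf", suffix=""))
```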
Repersonalisation
Repersonalisation is the process of copying data that identifies the encapsulated PDF, such as patient and study details, from a template DICOM. The fields currently repersonalised by default are:
- PatientName
- PatientID
- PatientSex
- StudyInstanceUID
- ~~SeriesInstanceUID~~
- ~~SOPInstanceUID~~
The fields SeriesInstanceUID and SOPInstanceUID are no longer copied during repersonalisation, because copying them violates the DICOM standard.
You can set the fields to repersonalise by passing repersonalisation_fields to Pdf2EncapsDCM() or Pdf2RgbSC().
Example:
from pdf2dcm import Pdf2RgbSC

# Fields to copy from the template DICOM
fields = [
    "PatientName",
    "PatientID",
    "PatientSex",
    "StudyInstanceUID",
    "AccessionNumber",
]
converter = Pdf2RgbSC(repersonalisation_fields=fields)
Note: passing repersonalisation_fields replaces the default fields rather than adding to them.
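To make the copy-and-override behaviour concrete, here is a conceptual sketch of repersonalisation. It is not pdf2dcm's implementation: types.SimpleNamespace objects stand in for DICOM datasets, the function name repersonalise and the sample values are hypothetical, and skipping fields that are missing from the template is an assumption.

```python
from types import SimpleNamespace
from typing import Iterable, Optional

# Default fields as listed in this README.
DEFAULT_FIELDS = ["PatientName", "PatientID", "PatientSex", "StudyInstanceUID"]


def repersonalise(target, template, fields: Optional[Iterable[str]] = None):
    """Copy the named attributes from template to target.

    A custom field list replaces the defaults entirely. Fields missing
    from the template are skipped (an assumption made for this sketch).
    """
    for name in (DEFAULT_FIELDS if fields is None else fields):
        if hasattr(template, name):
            setattr(target, name, getattr(template, name))
    return target


# Stand-in for a template DICOM, with made-up values.
template = SimpleNamespace(
    PatientName="DOE^JANE",
    PatientID="12345",
    PatientSex="F",
    StudyInstanceUID="1.2.3.4",
    AccessionNumber="ACC001",
)

default_copy = repersonalise(SimpleNamespace(), template)
custom_copy = repersonalise(
    SimpleNamespace(), template, fields=["PatientID", "AccessionNumber"]
)
print(sorted(vars(default_copy)))
print(sorted(vars(custom_copy)))
```

With the custom list, PatientName is not copied even though it is a default field, which mirrors the note about the defaults being overwritten.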