Data structure for flat and dark fields
Determine the flat-field and dark-field data structures used for normalisation. This is needed for Zeiss data and Diamond data. We also need to decide what to do when there are no flats and/or darks, as with Nikon data. Should darks and flats be held/loaded as separate AcqData objects, or held within the data AcqData?
Also related to #30
This is also needed for Nikon data collected with IPC.
I think that, since the raw projections, darks and flats are all needed to describe the data, they should be held as one set of AcqData. The normalisation processor should return a new AcqData with the normalised data. Then at least each AcqData fully describes the data set.
We agreed to pass 3 AcqData (data, flat and dark) to the normaliser, which will return a new AcqData.
We also need to enhance the normalisation processor to work as:
- If passed a single flat and dark it will apply a simple normalisation.
- If passed two flats and darks it can perform a linear interpolation to correct each projection (how do we determine start and end? do we want to handle more than 2 points?).
- If passed multiple flats and darks it can average them together and normalise
- If the number of projections in each AcqData container match it'll perform a 1-to-1 correction which will allow users to pre-calculate a per-projection correction however they want.
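The four cases above could be dispatched on the number of flat and dark frames. The following is a minimal NumPy sketch, not the agreed implementation. The function name `normalise`, the use of plain arrays in place of AcqData, and the choice to interpolate linearly from the first projection to the last are all assumptions made for illustration. Pixels where flat equals dark are not guarded against.

```python
import numpy as np


def _as_stack(field):
    """Promote a single 2D frame to a stack containing one frame."""
    field = np.asarray(field, dtype=np.float64)
    return field[np.newaxis] if field.ndim == 2 else field


def _per_projection(field, num_proj):
    """Expand a stack of flats or darks to one frame per projection."""
    n = field.shape[0]
    if n == num_proj:
        # 1-to-1: the caller pre-computed a correction per projection
        return field
    if n == 2:
        # Assumption: frame 0 belongs to the first projection and
        # frame 1 to the last, with a linear drift in between
        w = np.linspace(0.0, 1.0, num_proj)[:, None, None]
        return (1.0 - w) * field[0] + w * field[1]
    # One frame, or several repeats: use the mean frame for every projection
    return np.broadcast_to(field.mean(axis=0), (num_proj,) + field.shape[1:])


def normalise(data, flats, darks):
    """Return (data - dark) / (flat - dark) as a new array."""
    data = np.asarray(data, dtype=np.float64)
    num_proj = data.shape[0]
    flat = _per_projection(_as_stack(flats), num_proj)
    dark = _per_projection(_as_stack(darks), num_proj)
    return (data - dark) / (flat - dark)
```

Dispatching on frame count alone is ambiguous when the number of projections happens to be 2, which is one argument for carrying explicit labels alongside the flats and darks.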
This seems to require an update of our loaders to return (up to) three different DataAcquisition objects: projections, flats and darks.
If there are multiple flats and darks, what should the dimension holding them be called? In projections it is "angles", but "angles" is not really a fitting name for the dimension of multiple flats/darks.
Also related is #30, which could allow averaging a set of flats and darks within the normaliser.
We discussed the pros and cons of using a separate NormalisationData class versus including the flat and dark fields in AcquisitionData.
NormalisationData suggested structure:
- AcquisitionData
  - Array
  - Geometry
  - IsNormalised boolean
- NormalisationData
  - FlatField
  - DarkField
  - NormalisationGeometry
- AcquisitionGeometry
  - Config
    - System
    - Angles
    - Panel
  - DimensionLabels(horizontal, vertical, angles, …)
  - Link to NormalisationGeometry
- NormalisationGeometry
  - DarkField dictionary
    - labels: angles, repeats, before, after
    - DimensionLabels(horizontal, vertical, N)
  - FlatField dictionary
    - labels: angles, repeats, before, after
    - DimensionLabels(horizontal, vertical, N)
Pros:
- Data can be kept after normalisation
Cons:
- Would have to apply Slicer and Binner separately to each object
- Would have to return both objects from the loader and pass both objects to the normaliser
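To make the separate-class option concrete, here is a rough sketch of what a NormalisationData container might look like. Everything in it is hypothetical: the field names, the `select` and `crop` methods, and the use of plain NumPy arrays are illustrative assumptions, not an existing API. The `crop` method illustrates the first con above, since any slicing of the projections would have to be repeated on this object.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class NormalisationData:
    """Hypothetical container for dark and flat fields plus their labels."""
    dark_array: np.ndarray       # shape (N_dark, vertical, horizontal)
    dark_dictionary: dict        # label -> slice or index array into dark_array
    flat_array: np.ndarray       # shape (N_flat, vertical, horizontal)
    flat_dictionary: dict        # label -> slice or index array into flat_array

    def select(self, kind, label):
        """Return the frames of `kind` ('flat' or 'dark') stored under `label`."""
        array = {'flat': self.flat_array, 'dark': self.dark_array}[kind]
        labels = {'flat': self.flat_dictionary, 'dark': self.dark_dictionary}[kind]
        return array[labels[label]]

    def crop(self, rows, cols):
        """Apply the same panel crop that was applied to the projections."""
        return NormalisationData(
            self.dark_array[:, rows, cols], dict(self.dark_dictionary),
            self.flat_array[:, rows, cols], dict(self.flat_dictionary),
        )
```

For example, `nd.select('flat', 'after')` would return only the flats labelled as acquired after the scan.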
AcquisitionData suggested structure:
- AcquisitionData
  - Array
  - Geometry
  - FlatField
  - DarkField
  - IsNormalised boolean
- AcquisitionGeometry
  - Config
    - System
    - Angles
    - Panel
  - DimensionLabels(horizontal, vertical, angles, …)
  - Normalisation metadata
    - DarkField dictionary
      - labels: angles, repeats, before, after
    - FlatField dictionary
      - labels: angles, repeats, before, after
Pros:
- If Slicer or Binner is used in the reader, it's easier to apply it to the flat and dark as they are loaded, in the same way as to the data array
- The flat/dark dictionary could be inferred from the flat/dark array size and the data geometry, so an object could be initialised from just the data, dark, flat and data geometry
Cons:
- It is not obvious what we'd do with the flat and dark fields after normalisation. In some use cases it would make sense to remove them; in others it may be useful to access and recheck the data later. One option would be to remove them if data is passed to out, but keep them otherwise.
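The idea that the flat/dark dictionary could be inferred from the array size and the data geometry might look roughly like the sketch below. The function name `infer_field_labels` and the rule it applies are assumptions for illustration only. Note that 'before' and 'after' cannot be recovered from sizes alone, so those labels would still have to come from the reader or the user.

```python
def infer_field_labels(num_frames, angles):
    """Guess a label dictionary for a stack of flat or dark frames.

    Assumed rule: one frame per projection means a per-angle correction;
    any other non-zero count is treated as unlabelled repeats.
    """
    angles = list(angles)
    if num_frames == 0:
        # No flats/darks were supplied
        return {}
    if num_frames == len(angles):
        return {'angles': angles}
    return {'repeats': slice(0, num_frames)}
```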
Example use cases
- Data with no darkfield or flatfield data: AcquisitionData could be created as normal, with the default for darkfield/flatfield data = None
- Data with flatfield and darkfields for before and after the data was collected: the reader returns AcquisitionData and/or NormalisationData containing the data array, an array containing N flatfield and N darkfield scans, and flat/darkfield dictionaries labelling the indices of the flat and dark repeat scans as before and after, e.g.
flatfield_dictionary = {'before': slice(0, num_before),
                        'after': slice(num_before, num_before + num_after)}
If this is one of our default readers, like ZeissDataReader, it could be done easily; otherwise the user will have to point to the location of the dark and flat data. Add a before and after method to Normaliser, which could access the correct arrays using the labels in the dictionary.
- Data with flatfield per projection: AcquisitionData and/or NormalisationData containing the data array, an array containing N flatfield and N darkfield scans, and flat/darkfield dictionaries labelling the flat and dark repeat scans with labels, e.g.
flatfield_dictionary = {'angles': np.arange(0, 180)}
Add a per projection method to Normaliser which accesses the angles in the dictionary.
- Data with multiple scans but no labels: the reader returns AcquisitionData and/or NormalisationData containing the data array and an array containing N flatfield and N darkfield scans. The flat/darkfield dictionaries might not be needed in this case because we don't have any information. The Normaliser could use a simple method taking the mean of all flat and dark data.
- Loading data to a numpy array without a reader and creating an AcquisitionData by hand:
ag = AcquisitionGeometry.create_Parallel3D().set_angles(np.linspace(0, 180, num=10)).set_panel((1000,1000))
ad = ag.allocate()
ad.fill(data_array)
# slice ends are exclusive: frames 0-9 are 'before', frames 10-19 are 'after'
dark_dictionary = {'before': slice(0, 10), 'after': slice(10, 20)}
ad.set_darkfield(dark_array, dark_dictionary)
flat_dictionary = {'before': slice(0, 10), 'after': slice(10, 20)}
ad.set_flatfield(flat_array, flat_dictionary)
norm = Normaliser(method='before_and_after')
norm.set_input(ad)
norm.get_output(out=ad)
or using NormalisationData
ag = AcquisitionGeometry.create_Parallel3D().set_angles(np.linspace(0, 180, num=10)).set_panel((1000,1000))
ad = ag.allocate()
ad.fill(data_array)
# slice ends are exclusive: frames 0-9 are 'before', frames 10-19 are 'after'
dark_dictionary = {'before': slice(0, 10), 'after': slice(10, 20)}
flat_dictionary = {'before': slice(0, 10), 'after': slice(10, 20)}
nd = NormalisationData(dark_array, dark_dictionary, flat_array, flat_dictionary)
norm = Normaliser(nd, method='before_and_after')
norm.set_input(ad)
norm.get_output(out=ad)
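As a rough illustration of what a 'before and after' method might compute from the labelled dictionaries in the use cases above, here is a NumPy sketch. The function `before_and_after` is hypothetical and works on plain arrays. It assumes the 'before' frames correspond to the first projection and the 'after' frames to the last, with a linear change in between, which is only one possible answer to the question of how start and end are determined.

```python
import numpy as np


def before_and_after(data, flat_array, flat_dictionary, dark_array, dark_dictionary):
    """Normalise using flats/darks interpolated from 'before' to 'after'."""
    data = np.asarray(data, dtype=np.float64)
    # Weight runs from 0 at the first projection to 1 at the last (assumption)
    w = np.linspace(0.0, 1.0, data.shape[0])[:, None, None]

    def interpolate(array, dictionary):
        array = np.asarray(array, dtype=np.float64)
        # Average the repeats within each labelled group
        before = array[dictionary['before']].mean(axis=0)
        after = array[dictionary['after']].mean(axis=0)
        return (1.0 - w) * before + w * after

    flat = interpolate(flat_array, flat_dictionary)
    dark = interpolate(dark_array, dark_dictionary)
    return (data - dark) / (flat - dark)
```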
I notice in the 'cons' for 'NormalisationData', you say:
"Would have to return both objects from the loader and pass both objects to the normaliser"
So have I understood correctly that in either plan (storing flat/darks in NormalisationData or AcquisitionData), this would involve the readers returning the flats and darks (either as NormalisationData or as part of AcquisitionData)?
We need to think carefully about how this is done in each reader. For some (EPAC NXTomo, for example), all of the information (the flats, darks, etc.) is stored in the same HDF5 file, which makes it easy. In other cases, the data could be something like a stack of TIFFs, and the user will have to point to the stack of dark TIFFs and the stack of flat TIFFs. Will these paths be inputs to the reader when it is set up, or later, optional calls, e.g. set_dark_field_path()?
Hi Laura, that's a good point. I think it would have to depend on the reader: if we have use cases where we know everything is within one file, the reader would know where to look. I guess we need a way to make it customisable for lots of different types of data, so maybe a generic reader where you could specify the paths for the dark and flat fields. It could also be an option to just load the dark and flat field arrays yourself and create the AcquisitionData or NormalisationData object by hand.
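One way the generic-reader idea could look, supporting both paths given at set-up and later optional calls, is sketched below. The class `GenericReader`, its `load` callable, and the `set_flat_field_path` / `set_dark_field_path` methods are hypothetical names chosen for illustration. The actual file loading (TIFF stack, HDF5, etc.) is deliberately left to the caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GenericReader:
    """Hypothetical reader that is told where each part of a scan lives."""
    load: Callable[[str], object]      # e.g. a TIFF-stack or HDF5 loader
    data_path: str
    flat_path: Optional[str] = None    # None means no flats are available
    dark_path: Optional[str] = None    # None means no darks are available

    def set_flat_field_path(self, path):
        self.flat_path = path
        return self

    def set_dark_field_path(self, path):
        self.dark_path = path
        return self

    def read(self):
        """Return (data, flat, dark); missing flats or darks come back as None."""
        flat = self.load(self.flat_path) if self.flat_path is not None else None
        dark = self.load(self.dark_path) if self.dark_path is not None else None
        return self.load(self.data_path), flat, dark
```

Returning None for missing flats or darks matches the use case where data has no darkfield or flatfield and the default is None.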