
Enhancement: MCDS output.xml label carries additional information about the column data type

Open elmbeech opened this issue 6 months ago • 1 comments

For downstream analysis, it would often be useful to know the actual data type of each column in the output matrix. PhysiCell currently writes every value as float (double).

The suggestion is to add a dtype (data type) attribute to each label, which specifies whether the column is meant to be str (categorical), bool (categorical), int (numerical), or float (numerical). This could look something like this:

<simplified_data type="matlab" source="PhysiCell" data_version="2">
    <labels> 
        <label index="0" size="1" units="none" dtype="str">ID</label>
        <label index="1" size="3" units="microns" dtype="float">position</label>
        [...]
        <label index="5" size="1" units="none" dtype="str">cell_type</label>
        <label index="6" size="1" units="none" dtype="str">cycle_model</label>
        [...]
        <label index="21" size="1" units="none" dtype="int">number_of_nuclei</label>
        [...]
        <label index="27" size="1" units="none" dtype="bool">dead</label>
        [...]
    </labels>
    <filename>output00000064_cells.mat</filename>
</simplified_data>
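
To illustrate how downstream code could use such an attribute, here is a minimal Python sketch. It assumes the proposed dtype attribute exists (it is not part of current PhysiCell output) and falls back to float when it is missing. The shortened label list with renumbered indices, the total_volume label without dtype, the helper names read_column_dtypes and cast_row, and the sample row are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened output.xml fragment following the proposal above.
# The dtype attribute is the suggested addition, not an existing PhysiCell feature.
XML = """<simplified_data type="matlab" source="PhysiCell" data_version="2">
    <labels>
        <label index="0" size="1" units="none" dtype="str">ID</label>
        <label index="1" size="3" units="microns" dtype="float">position</label>
        <label index="4" size="1" units="none" dtype="int">number_of_nuclei</label>
        <label index="5" size="1" units="none" dtype="bool">dead</label>
        <label index="6" size="1" units="none">total_volume</label>
    </labels>
</simplified_data>"""

# How each proposed dtype converts a value that was stored as a double.
CASTERS = {
    "str": lambda x: str(int(x)),    # categorical code kept as text
    "bool": lambda x: bool(int(x)),  # 0.0 -> False, 1.0 -> True
    "int": int,
    "float": float,
}


def read_column_dtypes(xml_text):
    """Return one (column_name, dtype) pair per matrix column."""
    root = ET.fromstring(xml_text)
    columns = []
    for label in root.iter("label"):
        name = label.text.strip()
        size = int(label.get("size", "1"))
        # Labels without dtype keep today's behaviour: plain float.
        dtype = label.get("dtype", "float")
        for i in range(size):
            # Multi-column labels such as position get a numeric suffix.
            column = name if size == 1 else f"{name}_{i}"
            columns.append((column, dtype))
    return columns


def cast_row(row, columns):
    """Cast one all-float row of cell data to the declared types."""
    return {
        name: CASTERS[dtype](value)
        for (name, dtype), value in zip(columns, row)
    }


columns = read_column_dtypes(XML)
row = [7.0, 1.5, -2.0, 0.0, 2.0, 1.0, 2494.0]  # made-up values for one cell
record = cast_row(row, columns)
print(record)
```

Here the str caster only turns the stored numeric code into text; mapping a code to a readable name (for example a cell type name) would need a separate lookup that this sketch does not cover.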

Thank you!

elmbeech avatar Aug 05 '24 20:08 elmbeech