Pandas DataFrame default display; skip DynamicTableRegions

Open CodyCBakerPhD opened this issue 3 years ago • 4 comments

I have an NWBFile where a column of the UnitsTable contains references to a DynamicTableRegion, specifically to particular IDs of the ElectrodesTable. Note however that what we're seeing here likely occurs for any table display for other DynamicTable objects referenced in a custom column.

Screenshot of issue together with actual file contents with HDFView (note that actually pulling the data through an NWBHDF5IO works correctly in returning the actual DynamicTableRegion object with all its data properly formatted): image

The specific tab here is the table view of the Units module in the widgets, which loads a simple view as a pandas dataframe display. Basically, it doesn't know how to map the object to a dataframe so it just grabs the string column names of the target table and breaks that down to a tuple of characters.

We could discuss what the best way of viewing such nested table data should be, but I think the easiest, quickest fix would be to just not try to display any column that has DynamicTableRegions values.

CodyCBakerPhD • Sep 23 '21 17:09

@CodyCBakerPhD I think this is an issue with DynamicTable.to_dataframe. Could you try that, and if so, could you post this issue in HDMF?

bendichter • Sep 23 '21 17:09

DynamicTable.to_dataframe by default resolves the DynamicTableRegion links and will return a nested DataFrame, i.e., each cell of the column will itself contain a DynamicTableRegion with the data from the linked table. If you don't want to_dataframe to resolve the links, then simply set index=False and you should get the integer indices (or list/tuple of indices) for the rows that are being referenced. See the documentation here: https://hdmf.readthedocs.io/en/stable/hdmf.common.table.html?highlight=DynamicTable#hdmf.common.table.DynamicTable.to_dataframe

oruebel • Sep 23 '21 17:09

Also, if you have a DynamicTable ds and want to exclude any DynamicTableRegion columns when converting to a pandas DataFrame then I think the following should work:

ds.to_dataframe(exclude=set(ds.get_foreign_columns()))

oruebel • Sep 23 '21 18:09

If you don't want to_dataframe to resolve the links, then simply set index=False and you should get the integer indices (or list/tuple of indices) for the rows that are being referenced.

index=False is apparently the default, and it is what produced the issue above; setting index=True at least makes things look better.

image

I think this is an issue with DynamicTable.to_dataframe. Could you try that, and if so, could you post this issue in HDMF?

Given the above, I think it'd be best to just set index=True in the view call that renders dataframes; I'll get a quick PR up for that.

CodyCBakerPhD • Sep 27 '21 14:09