hl.Table.to_pandas() generates a DataFrame with dtype=string columns, a dtype that is still experimental in pandas
I encountered an error in an external package when I passed it a Hail-generated pandas DataFrame. The cause is that the package does not support the dtype pandas.StringDtype.
https://github.com/biocore-ntnu/pyranges/pull/264
Given that it's still experimental in pandas, can we have an option to generate a DataFrame that has dtype=object string columns? Or maybe we should make dtype=object the default.
https://github.com/hail-is/hail/blob/c4b09953f62cea090c8ab2026bc81851b9f4d64a/hail/python/hail/table.py#L3345-L3346
Great suggestion, Masa! We can provide a types argument to to_pandas that allows the user to override the type for a subset of columns. I've marked this help wanted. If someone on the team has some spare cycles, they might pick it up. We also welcome PRs to make this change!
Closing the loop: this was released into 0.2.110!
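For anyone still on a release earlier than 0.2.110, one possible pandas-side workaround is to cast the string columns back to object after calling to_pandas. This is a minimal sketch that uses only pandas; the helper name string_columns_to_object and the sample frame are hypothetical stand-ins, not part of Hail, and the sample frame only imitates what a Hail-generated frame might look like.

```python
import pandas as pd


def string_columns_to_object(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every StringDtype column cast to object.

    Hypothetical helper for illustration; it is not part of Hail or pandas.
    """
    # Collect the columns whose dtype is the pandas extension StringDtype.
    string_cols = [
        name for name, dtype in df.dtypes.items()
        if isinstance(dtype, pd.StringDtype)
    ]
    if not string_cols:
        return df.copy()
    # Cast only those columns; all other columns keep their dtype.
    return df.astype({name: object for name in string_cols})


# Stand-in for a frame produced by hl.Table.to_pandas() (assumed shape).
df = pd.DataFrame(
    {
        "Chromosome": pd.array(["chr1", "chr2"], dtype="string"),
        "Start": [100, 200],
    }
)

converted = string_columns_to_object(df)
print(converted.dtypes)
```

The original frame is left untouched, so the conversion can be applied just before handing the data to a package that does not yet accept StringDtype.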