
Use Pandas nullable integer for position in normalized snps dataframe

Open: afaulconbridge opened this issue 5 years ago • 1 comment

Along the same lines as #108, has it been considered to use the Pandas nullable integer datatype (pd.Int64Dtype()) for pos? More details here. We've seen a number of files that fail to parse because the position is missing for a small number of rows (for example, an RSID with multiple possible locations).
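
To illustrate the difference, here is a minimal sketch using made-up data (the rsids, the column layout, and the CSV format are assumptions for the example, not the actual snps parsing code). It shows how a missing position turns a plain integer column into float64, while a nullable integer dtype keeps the integers and marks the gap as `<NA>`:

```python
import io

import pandas as pd

# Hypothetical input: the last row has no position.
raw = "rsid,chrom,pos\nrs1,1,1000\nrs2,1,2000\nrs3,1,\n"

# Default parsing: the missing value forces the whole column to float64.
df_default = pd.read_csv(io.StringIO(raw))

# Nullable integer dtype: values stay integers, the gap becomes <NA>.
df_nullable = pd.read_csv(io.StringIO(raw), dtype={"pos": pd.Int64Dtype()})

print(df_default["pos"].dtype)
print(df_nullable["pos"].dtype)
print(df_nullable)
```

Rows with a missing position can then be selected with `df_nullable["pos"].isna()` instead of causing the load to fail.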

afaulconbridge, Nov 07 '20 08:11

Interesting. Yes, a nullable integer dtype would be good for handling these cases. But let's go with pd.UInt32Dtype(), which would minimize memory usage.
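
A rough way to check the memory difference, as a sketch with made-up positions (the values and array size are assumptions for the example). Each nullable array stores its values plus a boolean mask, so the 4-byte `UInt32` values should take less space than the 8-byte `Int64` values. `UInt32` holds values up to 4,294,967,295, which should be ample for chromosome positions:

```python
import pandas as pd

# Hypothetical positions, including missing values.
positions = [1000, 2000, None] * 1000

pos_int64 = pd.array(positions, dtype=pd.Int64Dtype())
pos_uint32 = pd.array(positions, dtype=pd.UInt32Dtype())

# nbytes covers the stored values and the missing-value mask.
print("Int64: ", pos_int64.nbytes, "bytes")
print("UInt32:", pos_uint32.nbytes, "bytes")
```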

apriha, Nov 09 '20 06:11