django-pandas

Bug: Many-to-many relation keys are converted to floats.

Open hyzyla opened this issue 6 years ago • 4 comments

I have two models, Status and Station, with a many-to-many relation between them. The following command:

read_frame(Status.objects.values('id', 'stations__id'), verbose=False)

returns:

   id  stations__id
0   1           NaN
1   2           NaN
2   3           1.0
3   3           2.0
4   3           3.0
5   4           4.0
6   4           5.0
7   4           6.0

Keys must be integers, not floats. It seems that a single null value converts every value in the column to a float.

hyzyla avatar Mar 05 '18 12:03 hyzyla

Indeed the null entry seems to be the reason for the float conversion.

It might be possible to fix this by changing the many-to-many column datatype to "Int64".

See also the pandas docs about nullable integer data type.
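
A minimal sketch of that suggestion, using plain pandas without Django. The frame below is hand-built to resemble the output in the report (it is an assumption, not what read_frame produces internally), and the cast is applied after the frame exists rather than inside django-pandas:

```python
import pandas as pd

# Hand-built stand-in for Status.objects.values('id', 'stations__id');
# None marks a Status row with no related Station.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 3],
        "stations__id": [None, None, 1, 2],
    }
)

# The missing values force the key column to float64.
float_dtype = str(df["stations__id"].dtype)

# Cast to the nullable integer extension dtype (note the capital "I").
df["stations__id"] = df["stations__id"].astype("Int64")
int_dtype = str(df["stations__id"].dtype)

print(float_dtype, "->", int_dtype)
print(df)
```

The missing entries stay missing after the cast, so the keys can be read as integers while rows without a related Station remain distinguishable.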

ckoerber avatar Oct 22 '19 17:10 ckoerber

@ckoerber, yes, I can convert it. But it seems strange to me that a column has an integer type when it contains no nulls and a float type when it does. Does a pandas Series support null/None values for an integer type?

hyzyla avatar Oct 23 '19 07:10 hyzyla

@hyzyla, yes, the "Int64" type supports integers alongside np.nan or None values (which I believe correspond to null field values).

E.g.,

> pd.Series([1, None])
0    1.0
1    NaN
dtype: float64

vs

> pd.Series([1, None], dtype="Int64")
0      1
1    NaN
dtype: Int64

According to the docs linked above, this should be available as of pandas version 0.24.0.

ckoerber avatar Oct 23 '19 16:10 ckoerber

Thanks, I will try to make a PR to fix this.

hyzyla avatar Oct 23 '19 16:10 hyzyla