django-pandas
read_frame bug: ForeignKey lookup fails if any Null values present
ForeignKey lookup works as long as no null values are present. But I also have some models where null values are allowed, for example:
    from django.db import models

    class Foo(models.Model):
        name = models.CharField(max_length=64, unique=True)
        description = models.CharField(max_length=128, unique=True)

        def __str__(self):
            return self.name

    class Sample(models.Model):
        # Nullable foreign key: this is the case that triggers the bug
        foo = models.ForeignKey(Foo, on_delete=models.PROTECT, null=True, blank=True)
read_frame returns expected results as long as all of the qs objects are non null for field foo:

    In [25]: read_frame(Sample.objects.filter(Q(id=637)), fieldnames=['id', 'foo'])
    Out[25]:
        id foo
    0  637  XY
...but if even one null is present, foo becomes None for every row in the DataFrame:

    In [26]: read_frame(Sample.objects.filter(Q(id=637)|Q(id=241)), fieldnames=['id', 'foo'])
    Out[26]:
        id   foo
    0  241  None
    1  637  None
Somewhat similar to #93 ?
@odoublewen Please let me know if the following workaround seems useful to you.
MyModel (parent) in models.py
    class MyModel(models.Model):
        MyModelId = models.BigAutoField(_('Id'), primary_key=True)
        Name = models.CharField(_('Name'), max_length=225, null=True, blank=True)
        Date = models.DateField(_('Date'), null=True, blank=True)
        DateTime = models.DateTimeField(_('DateTime'), null=True, blank=True)
        Integer = models.IntegerField(_('Integer'), null=True, blank=True)
        Float = models.FloatField(_('Float'), null=True, blank=True)

        def __str__(self):
            return f'{self.pk} - {self.Name}'

        class Meta:
            db_table = 'MyModel'
            verbose_name = 'My Model Object'
            verbose_name_plural = 'My Model Objects'

        @classmethod
        def get_dataframe(cls, instance=None):
            if not instance:
                qs = cls.objects.all()
            return read_frame(qs, ('MyModelId', 'MyForeignKeyModels__Name',
                                   'MyForeignKeyModels__pk', 'Name', 'Date',
                                   'DateTime', 'Integer', 'Float',))
MyForeignKeyModel is defined as
    class MyForeignKeyModel(models.Model):
        MyForeignKeyModelId = models.BigAutoField(_('Id'), primary_key=True)
        MyModel = models.ForeignKey('MyModel', on_delete=models.CASCADE,
                                    related_name='MyForeignKeyModels',
                                    null=True, blank=True)
        Name = models.CharField(_('Name'), max_length=225, null=True, blank=True)
        Date = models.DateField(_('Date'), null=True, blank=True)
        DateTime = models.DateTimeField(_('DateTime'), null=True, blank=True)
        Integer = models.IntegerField(_('Integer'), null=True, blank=True)
        Float = models.FloatField(_('Float'), null=True, blank=True)

        def __str__(self):
            return f'{self.pk} - {self.Name}'

        @classmethod
        def get_dataframe(cls, instance=None, *args, **kwargs):
            if not instance:
                qs = cls.objects.all()
            fields_list = []
            for field in cls._meta.fields:
                import ipdb
                ipdb.set_trace()
                fields_list.append(field.name)
            return read_frame(qs, ('MyForeignKeyModelId', 'Name', 'Date', 'DateTime',
                                   'Integer', 'Float', 'MyModel__Name'))

        class Meta:
            db_table = 'MyForeignKeyModel'
            verbose_name = 'My Foreign Key Model'
            verbose_name_plural = 'My Foreign Key Models'
    In [1]: from MyApp1.models import *
    In [2]: df = MyModel.get_dataframe()
    MyApp1.MyModel.MyModelId
    <ManyToOneRel: MyApp1.myforeignkeymodel>
    MyApp1.MyForeignKeyModel.Name
    <ManyToOneRel: MyApp1.myforeignkeymodel>
    FieldDoesNotExist => MyForeignKeyModel has no field named 'pk'
    'Options' object has no attribute 'get_all_related_objects_with_model'

(get_all_related_objects_with_model was deprecated in Django 1.8 and removed in Django 1.10; see the Django docs on deprecation.)
The result I get on calling the classmethod of MyModel is :
       MyModelId MyForeignKeyModels__Name  MyForeignKeyModels__pk       Name        Date                   DateTime  Integer    Float
    0          1                 huyjgjhm                       1  masdfasdf  2023-04-28  2023-04-28 07:14:16+00:00        4    3.122
    1          1                 bvnbvnvb                       2  masdfasdf  2023-04-28  2023-04-28 07:14:16+00:00        4    3.122
    2          2                   321321                       5    ppolkjm  2023-04-28                        NaT     3423  123.000
    3          2                   ij9045                       6    ppolkjm  2023-04-28                        NaT     3423  123.000
and the result on calling the classmethod of MyForeignKeyModel is :
    In [3]: df
    Out[3]:
       MyForeignKeyModelId      Name        Date                   DateTime  Integer     Float MyModel__Name
    0                    1  huyjgjhm  2023-04-28  2023-04-28 07:14:28+00:00      NaN       NaN     masdfasdf
    1                    2  bvnbvnvb  2023-04-28                        NaT     45.0    34.000     masdfasdf
    2                    3  sdfgdsfg  2023-04-28  2023-04-28 07:15:10+00:00    333.0  2342.000          None
    3                    4  gfhju767  2023-04-28  2023-04-28 07:15:31+00:00  12323.0       NaN          None
    4                    5    321321  2023-04-28  2023-04-28 07:16:06+00:00      NaN   112.000       ppolkjm
    5                    6    ij9045  2023-04-28                        NaT   7744.0     1.025       ppolkjm
NOTE: There were 2 MyModel objects and 6 MyForeignKeyModel objects. Two of the MyForeignKeyModel objects (ids 3 and 4) had a null value for the foreign key.
Good afternoon. I have run into the same problem. When querying a model with a ForeignKey column, if one record has a value and another does not, both come back as None. If the ForeignKey is set on every record, everything works fine.
I did a little research and traced the problem to this code in django_pandas/utils.py:
    def replace_pk(model):
        base_cache_key = get_base_cache_key(model)

        def get_cache_key_from_pk(pk):
            return None if pk is None else base_cache_key % str(pk)
If any of the ForeignKey values is empty, the pk argument of get_cache_key_from_pk arrives as a float for every populated record and as NoneType for the empty ones. If every record has a non-null ForeignKey, pk arrives as an int. For a float pk, str(pk) appends '.0' to the identifier, so the resulting cache key is never found.
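The mismatch described here can be reproduced without Django or pandas. The following is only a minimal sketch: the template string BASE_CACHE_KEY and the dict named cache are hypothetical stand-ins for whatever get_base_cache_key(model) and the real cache provide; only the key-building expression is taken from the snippet quoted in this comment.

```python
# Hypothetical stand-in for the value returned by get_base_cache_key(model).
BASE_CACHE_KEY = "pandas_myapp_foo_%s_rendering"


def get_cache_key_from_pk(pk):
    # Same expression as in the django_pandas/utils.py snippet quoted here.
    return None if pk is None else BASE_CACHE_KEY % str(pk)


# Hypothetical cache of rendered objects, populated with an integer pk.
cache = {get_cache_key_from_pk(637): "XY"}

# An int pk produces the same key that was stored, so the lookup succeeds.
print(cache.get(get_cache_key_from_pk(637)))

# A float pk renders as "637.0", so the key differs and the lookup misses.
print(get_cache_key_from_pk(637.0))
print(cache.get(get_cache_key_from_pk(637.0)))
```

The last lookup returns None even though the row has a valid foreign key, which matches the symptom reported in this issue.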
If you change str(pk) to int(pk), everything works:
    def replace_pk(model):
        base_cache_key = get_base_cache_key(model)

        def get_cache_key_from_pk(pk):
            return None if pk is None else base_cache_key % int(pk)
Dear developers, please check this and, if it is the right solution, include it in a new version.
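One caveat with the int(pk) change: it assumes integer primary keys, and int() raises ValueError if a missing value arrives as NaN rather than None. A more defensive variant might look like the sketch below. This is an assumption about one possible shape for a fix, not the library's code; normalize_pk and BASE_CACHE_KEY are hypothetical names introduced for illustration.

```python
import math

# Hypothetical stand-in for the value returned by get_base_cache_key(model).
BASE_CACHE_KEY = "pandas_myapp_foo_%s_rendering"


def normalize_pk(pk):
    """Hypothetical helper: undo the float upcast without touching other pk types."""
    if pk is None:
        return None
    if isinstance(pk, float):
        if math.isnan(pk):
            # Treat NaN as a missing foreign key.
            return None
        if pk.is_integer():
            # 637.0 -> 637, so the key matches the one built from the int pk.
            return int(pk)
    # Strings, UUIDs, and plain ints pass through unchanged.
    return pk


def get_cache_key_from_pk(pk):
    pk = normalize_pk(pk)
    return None if pk is None else BASE_CACHE_KEY % str(pk)


print(get_cache_key_from_pk(637.0))
print(get_cache_key_from_pk(float("nan")))
print(get_cache_key_from_pk("abc-123"))
```

Only integral floats are converted, so non-integer primary keys such as strings keep their original form.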