django-mysql
django-mysql copied to clipboard
GroupConcat deserialize to lists or sets
It's kinda annoying that GroupConcat
doesn't automatically split back into lists (or maybe sets when distinct=True
) on the python side - 90% of the use cases require the user to do this manually with str.split
on the result before being able to use this code.
I've looked into using the list and set fields from #10 and #11 but it's a bit complicated - they don't support the custom separators, the datatypes aren't so neat to convert, it requires a bit of trickery. It might still be possible, otherwise we might have to just write custom code or some dynamic field type to achieve this - so that it ties in with the rest of the ORM.
This would be a really nice feature. currently i am working around this by doing
Subquery(Coalesce(qs.annotate(annotation=GroupConcat('related_pk')).values('annotation'), Value(''), output_field=ListCharField(models.CharField())))
but this is buggy, when you want to do __contains
on the subquery. With the Coalesce
above it searches with the Value('')
in the subquery expression and if you remove the Coalesce
and replace it with an ExpressionWrapper
to do the conversion to ListCharField
, then you end up with
File "/var/env/lib/python3.6/site-packages/django_mysql/models/lookups.py", line 175, in as_sql
params = lhs_params + rhs_params
TypeError: can only concatenate tuple (not "list") to tuple
edit: Using django 2.2
edit 2:
currently i am monkey patching SetContains.as_sql
to feature
params = list(lhs_params) + list(rhs_params)
Re: edit 2, that seems like it shouldn't be necessary as both vars should be lists. Can you give the full traceback? Where did the tuple
come from? Transforms inside Django itself expect to receive lists for both params variables, e.g. https://github.com/django/django/blob/master/django/contrib/postgres/lookups.py#L12
Here is the full traceback:
Traceback (most recent call last):
File "<input>", line 1, in <module>
City.objects.annotate_region_slugs().filter(region_slugs__contains='foo-slug')
File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 250, in __repr__
data = list(self[:REPR_OUTPUT_SIZE + 1])
File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 274, in __iter__
self._fetch_all()
File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 1242, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/var/env/lib/python3.6/site-packages/django/db/models/query.py", line 55, in __iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1087, in execute_sql
sql, params = self.as_sql()
File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 489, in as_sql
where, w_params = self.compile(self.where) if self.where is not None else ("", [])
File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 405, in compile
sql, params = node.as_sql(self, self.connection)
File "/var/env/lib/python3.6/site-packages/django/db/models/sql/where.py", line 81, in as_sql
sql, params = compiler.compile(child)
File "/var/env/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 405, in compile
sql, params = node.as_sql(self, self.connection)
File "/code/src/core/fields.py", line 65, in as_sql
params = lhs_params + rhs_params
TypeError: can only concatenate tuple (not "list") to tuple
i monkey patched the method like so to print some outputs:
def as_sql(self, qn, connection):
lhs, lhs_params = self.process_lhs(qn, connection)
rhs, rhs_params = self.process_rhs(qn, connection)
print(repr(lhs))
print(repr(lhs_params))
print(repr(rhs))
print(repr(rhs_params))
params = lhs_params + rhs_params
# Put rhs on the left since that's the order FIND_IN_SET uses
return 'FIND_IN_SET(%s, %s)' % (rhs, lhs), params
it prints this:
'(SELECT GROUP_CONCAT(U0.`slug`) AS `slugs` FROM `core_region` U0 WHERE ST_Contains(U0.`polygon`, (`core_city`.`point`)))'
()
'%s'
['foo-slug']
Huh. It looks like this might be a bug in django itself, the subquery on the left hand side is what's returning its params in a tuple. And indeed it seems that the SQLCompiler class forces tuples: https://github.com/django/django/blob/master/django/db/models/sql/compiler.py#L614 . Can you raise this on https://code.djangoproject.com/ , CC me, and even look into a patch? 😇 https://docs.djangoproject.com/en/dev/internals/contributing/
Tuples seem the way to go, so this might require fixing in here, i guess.
Version 3.2.0 treats everything as tuples, try that.
Thank you very much, don't have to monkey patch it anymore :-)
Not completelly sure whehther is related to this particular issue, but with 1.11.17/3.2.0 for django/django-mysql, I've need to patch following https://jira.mongodb.org/browse/PYMODM-77.
With that change, I can successfully run
[...].annotate(members=GroupConcat('user_id', output_field=SetCharField(CharField))).[...]
I also needed to raise value of group_concat_max_len
mysql variable, but that's just because my output lists are very long and got truncated in the mysql side
I'm still in the process to review this, because I get a discrepancy in the number of entries reported by count()
and group_concat()
with the same group_by/annotation
EDIT: Forget about the mismatch. It just happened that my sample record seemed to be completed with the extended length (not clearly ending with ,
as with original sizes). Setting max_len to x100 (instead of x10 as originally), produced the right result.
@javiplx I am not sure how that MongoDB ORM issue aligns with here. Can yuo make a PR for your patch, including a test that fails before and passes after?
I wasn't even aware that it was mongodb for django. Just finding matches for the exception I got. I'll fork and submit the changes.
The PR is there (adamchainz/django-mysql#646). It cannot be merged because i started from tag v3.2.0 but you can see changes there.
And it is even more funnier.
With the original 3.2.0 from PyPI, I got
TypeError: to_python() missing 1 required positional argument: 'value'
which is the text what lead me to the "solution" (which is the first commit on the PR), that enabled me to give explicit output_field
argument. Then, my laziness suggested me to supply default fields on code (as already suggested there), and I found that when output_field
is not explicitly given I get a similar exception , but complaining about too many positional arguments (our python is 3.7.4 in case that matters).
So, I'm somewhat happy because second commit is what I actually wanted, but also a bit disliked. Anyway, I'll make shortly a second PR with that second commit
I wasn't even aware that it was mongodb for django. It's not even for Django. They just happened to copy the Django ORM for their MongoDB ORM.