django icon indicating copy to clipboard operation
django copied to clipboard

Draft PR for psycopg3 support. Fixes #33308

Open apollo13 opened this issue 1 year ago • 15 comments

What did I do? I took https://github.com/dvarrazzo/django-psycopg3-backend and blackified it + ported over most (all?) new commits. I am now opening this on GitHub to be able to nicely diff and start a discussion about whether we can support psycopg2 & 3 easiyl from the same codebase (I think we can).

apollo13 avatar May 12 '22 12:05 apollo13

Looking through the code base there are quite a few areas where it would probably be easier if we just assumed that if psycopg3 is installed that we want to use it; this might get a bit more fun for testing (extra environment, but realistically speaking we want to be on psycopg3 only in the longrun anways…)

apollo13 avatar May 12 '22 16:05 apollo13

We are down to three failures :)

apollo13 avatar May 12 '22 20:05 apollo13

@timgraham I'd love if you could look over this and maybe test your cockroachdb backend against this. It would be great if I don't fully break it :)

apollo13 avatar May 15 '22 11:05 apollo13

Besides the issue in SchemaLoggerTests, there's one other regression with psycopg2. (I'll work through the failures with psycopg3 later).

======================================================================
FAIL: test_orders_nulls_first_on_filtered_subquery (ordering.tests.OrderingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/ordering/tests.py", line 199, in test_orders_nulls_first_on_filtered_subquery
    self.assertQuerysetEqualReversible(
  File "/home/tim/code/django/tests/ordering/tests.py", line 129, in assertQuerysetEqualReversible
    self.assertSequenceEqual(queryset.reverse(), list(reversed(sequence)))
AssertionError: Sequences differ: <QuerySet [<Author: Name 3>, <Author: Name 1>, <Author: Name 2>]> != [<Author: Name 2>, <Author: Name 1>, <Author: Name 3>]

First differing element 0:
<Author: Name 3>
<Author: Name 2>

- <QuerySet [<Author: Name 3>, <Author: Name 1>, <Author: Name 2>]>
? ----------               ^                                   ^  -

+ [<Author: Name 2>, <Author: Name 1>, <Author: Name 3>]
?                ^                                   ^

The old SQL:

 SELECT DISTINCT "ordering_author"."id",
                "ordering_author"."name",
                "ordering_author"."editor_id",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") AS "last_date",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") IS NULL,
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id")
FROM   "ordering_author"
ORDER  BY (SELECT Max(U0."pub_date") AS "last_date"
           FROM   "ordering_article" U0
           WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                    AND Upper(U0."headline" :: text) LIKE Upper('%Article%') )
           GROUP  BY U0."author_id") IS NULL,
          (SELECT Max(U0."pub_date") AS "last_date"
           FROM   "ordering_article" U0
           WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                    AND Upper(U0."headline" :: text) LIKE Upper('%Article%') )
           GROUP  BY U0."author_id") DESC  

The new SQL:

 SELECT DISTINCT "ordering_author"."id",
                "ordering_author"."name",
                "ordering_author"."editor_id",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") AS "last_date",
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id") IS NULL,
                (SELECT Max(U0."pub_date") AS "last_date"
                 FROM   "ordering_article" U0
                 WHERE  ( U0."author_id" = ( "ordering_author"."id" )
                          AND Upper(U0."headline" :: text) LIKE Upper(
                              '%Article%') )
                 GROUP  BY U0."author_id")
FROM   "ordering_author"
ORDER  BY 5 DESC  

It might be that because CockroachDB has DatabaseFeatures.nulls_order_largest = False (unlike PostgreSQL), the loss of the second subquery in the ORDER BY is problematic.

timgraham avatar May 22 '22 01:05 timgraham

It might be that because CockroachDB has DatabaseFeatures.nulls_order_largest = False (unlike PostgreSQL), the loss of the second subquery in the ORDER BY is problematic.

This is probably a result of https://github.com/django/django/pull/15687/files#diff-f58de2deaccecd2d53199c5ca29e3e1050ec2adb80fb057cdfc0b4e6accdf14fR753-R769 but if it looses the second ORDER BY then this might be a problem in the linked code and not in cdb.

apollo13 avatar Jun 05 '22 08:06 apollo13

@timgraham I can reproduce your query issue when I set supports_order_by_nulls_modifier = False (which it probably is on old cockroachdb versions: https://github.com/cockroachdb/django-cockroachdb/blob/master/django_cockroachdb/features.py#L60 -- did you test against an old version?). That said, the combination of supports_order_by_nulls_modifier = False & supports_order_column_alias = True certainly shows there is a problem in the new code.

apollo13 avatar Jun 05 '22 08:06 apollo13

@timgraham I can reproduce your query issue when I set supports_order_by_nulls_modifier = False (which it probably is on old cockroachdb versions: https://github.com/cockroachdb/django-cockroachdb/blob/master/django_cockroachdb/features.py#L60 -- did you test against an old version?). That said, the combination of supports_order_by_nulls_modifier = False & supports_order_column_alias = True certainly shows there is a problem in the new code.

Yes, the failure is only present on older versions of CockroachDB.

timgraham avatar Jun 07 '22 18:06 timgraham

buildbot, test on oracle.

apollo13 avatar Jun 22 '22 12:06 apollo13

wow all tests passing! great work

auvipy avatar Jul 11 '22 07:07 auvipy

@timgraham Any chance you could test the latest changes against cockroach db?

apollo13 avatar Aug 08 '22 19:08 apollo13

Just tested this with CockroachDB on 1043b36. First impressions: Seems to work! Brilliant job.

Forgive my noobiness. This is a really exciting feature to me-- I'm really eager to see a release, so I wanted to post a report.

https://github.com/cockroachdb/django-cockroachdb Needs to be updated with psycopg3 @timgraham. Was able to get it running by changing the following lines:

  • In django_cockroachdb/base.py
import psycopg
#import psycopg2  # noqa
#import psycopg2.extensions  # noqa
#import psycopg2.extras  # noqa
  • In django_cockroachdb/operations.py
from psycopg import errors
#from psycopg2 import errorcodes
#.... I'm sure this will break, but I was just trying to get it running.
if (getattr(exc.__cause__, 'pgcode', '') != errors.SERIALIZATION_FAILURE or
#if (getattr(exc.__cause__, 'pgcode', '') != errorcodes.SERIALIZATION_FAILURE or
  • Commented out version check in django_cockroachdb/utils.py

image

gnat avatar Aug 16 '22 14:08 gnat

@apollo13 Will you tackle the PostGIS backend? (First issue: from psycopg2 import Binary in django/contrib/gis/db/backends/postgis/adapter.py)

timgraham avatar Aug 19 '22 17:08 timgraham

@timgraham At some point yes, but probably not before I have weeded out the remaining issues with bind params in the core backend.

apollo13 avatar Aug 19 '22 17:08 apollo13

Great! Your last commit here for EXTRACT params fixed failures on CockroachDB. Most of the remaining failures are in the following categories:

  • could not determine data type of placeholder $1
  • column "name" must appear in the GROUP BY clause or be used in an aggregate function
  • unsupported binary operator: <decimal> / <float> (desired <decimal>)

I'm guessing that first category is what you're referring to in your last comment.

timgraham avatar Aug 20 '22 01:08 timgraham

Yes, bullets 1 & 2 are expected to show up currently. I just pushed (a rather ugly) gis support commit.

apollo13 avatar Aug 20 '22 19:08 apollo13

Okay down to one failure that will probably require some work. Reviews welcome! GIS Tests pass also now.

apollo13 avatar Sep 24 '22 15:09 apollo13

Hi @apollo13 , what stage this is currently at? Maybe I can help you with something? See, my project needs this backend badly :)

pwtail avatar Oct 09 '22 13:10 pwtail

@pwtail Well code review here and testing your code against this branch with psycopg3 would be a massive help.

I'll rebase & update the branch over the next days (parts of the required changes have already be merged to main)

apollo13 avatar Oct 10 '22 06:10 apollo13

Thank you @apollo13 , already using your branch)

pwtail avatar Oct 10 '22 07:10 pwtail

Hi folks,

I have just rebased this PR to latest master. Now that all prerequisites for this PR are in further rebasing should get easier and not require manual intervention. I think the PR is starting to become solid, there are some areas though where I'd highly appreciate some help:

  • PostGIS in general. The adapter registration there is imo still rather ugly, I am open to creative ideas :)
  • Is my new handling of registering extension sound?
  • Do the new as_postgresql make sense everywhere?
  • Ideas on https://github.com/django/django/pull/15687/files#diff-530111bb812ef292d6db3ab0285e128bb585992d02d47dcca872d2e8c9b592daR586
  • Is compose used correctly and consistently or are there other variants still around in the codebase?

apollo13 avatar Oct 24 '22 18:10 apollo13

@apollo13 thanks for the rebase. I'm currently working on addressing the as_postgresql and Text(str) quirks by having JSONField.get_db_prep_value make use of psycopg.types.json.Jsonb instead and it's showing great results (down to a few remaining test failures).

I'll keep you updated!

charettes avatar Oct 24 '22 19:10 charettes

@apollo13 I confirm that locally tests are working with Python 3.11 , psycopg2 or psycopg version 3, PostgreSQL 14.5

pauloxnet avatar Oct 24 '22 20:10 pauloxnet

@charettes Uff that is great to hear. Btw I do have a custom branch where I have enabled github actions to test all combinations: https://github.com/apollo13/django/tree/psycopg-dev So once you have something I'll reapply it there to run the tests against all combinations.

apollo13 avatar Oct 26 '22 08:10 apollo13

We are not 100% there yet, ie Simon's changes will be coming in. But this PR is not in a reviewable state and I'd love to have more people look over it.

apollo13 avatar Oct 26 '22 15:10 apollo13

I haven't reviewed much of the code, but it's mostly working on CockroachDB. Mostly it's down to could not determine data type of placeholder errors which I think will be fixed by adding as_cockroachdb to alias the new as_postgresql methods.

On GIS, there a a bunch of cannot adapt type 'PostGISAdapter' using placeholder '%s' (format: AUTO). Glancing at the code, I haven't spotted the fix for this. (I'll dig in if something obvious doesn't come to mind.)

timgraham avatar Oct 27 '22 01:10 timgraham

After digging in, many of the errors about 'could not determine type of placeholder' are actually string arguments like:

SELECT "ordering_article"."id",
       "ordering_article"."author_id",
       "ordering_article"."second_author_id",
       "ordering_article"."headline",
       "ordering_article"."pub_date",
       '1' AS "constant"
FROM "ordering_article"
ORDER BY "constant" ASC,
         "ordering_article"."headline" DESC;

args=('1',);

fails with:

psycopg.errors.IndeterminateDatatype: could not determine data type of placeholder $1
HINT:  consider adding explicit type casts to the placeholder arguments

As far as I can tell, the same query is generated when running against PostgreSQL. Am I missing some initialization code or something? Could it be a problem with CockroachDB?

timgraham avatar Oct 28 '22 01:10 timgraham

@timgraham if you want this to work properly for CockroachDB you'd likely want to override Value.as_cockroachdb to add explicit casts based of self.output_field so annotating Value("1") turns into ('%s::text', '1').

This might not work properly until a form of https://github.com/apollo13/django/pull/3 lands here though as we've been supporting Value(json.dumps(1)) and expect the JSON type inference to just work and I believe we'll have to deprecate that.

charettes avatar Oct 28 '22 01:10 charettes

Thanks for looking. It seems to be some difference between PostgreSQL and CockroachDB then? If so, I'll try to confirm and then report to the Cockroach team to see if they can address it since they try to be compatible with PostgreSQL. If they can't, thanks for the tip about a workaround.

timgraham avatar Oct 28 '22 01:10 timgraham

buildbot, test on psycopg3.

felixxm avatar Nov 03 '22 09:11 felixxm

I added a temporary configuration to run tests with psycopg 3. You can trigger it with "buildbot, test on psycopg3."

felixxm avatar Nov 03 '22 10:11 felixxm