djorm-ext-pgfulltext icon indicating copy to clipboard operation
djorm-ext-pgfulltext copied to clipboard

VectorField has db_index=True sometimes resulting in "index row size XXXX exceeds maximum 2712"

Open cpbotha opened this issue 10 years ago • 5 comments
trafficstars

This is definitely related to https://github.com/djangonauts/djorm-ext-pgfulltext/issues/18

In testing with a bunch of sample documents and converting their contents with to_tsvector() into a VectorField I saw the following error:

OperationalError: index row size 2872 exceeds maximum 2712 for index "blabla_tsv"
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

db_index=True in the VectorField class results in Django trying to create a "normal" index on the tsvector, instead of GIN or GIST.

Suggestions for work-arounds are most welcome!

cpbotha avatar Jan 21 '15 16:01 cpbotha

I'm experiencing the same issue. What I've done is implemented my own "VectorField" with db_index set to False (within the init), and manually creating the GIN/GiST index (usually I add this to the operations array that makemigrations creates when creating the search field.).

from django.db import models, migrations
class Migration(migrations.Migration):

    dependencies = []

    operations = [
        migrations.RunSQL(
            "CREATE INDEX tablename_searchidx ON tablename USING gin(search_index);",
        ),
    ]

jhgg avatar Jan 28 '15 20:01 jhgg

Here's a pull request that simply does NOT set db_index, leaving it to the user to decide: https://github.com/djangonauts/djorm-ext-pgfulltext/pull/46

cpbotha avatar Feb 16 '15 14:02 cpbotha

Hey folks, new maintainer here. So we've gone ahead updated the VectorField to support db_index=False values. That should prevent the ORM from creating the incorrect index. https://github.com/linuxlewis/djorm-ext-pgfulltext/commit/db8c220f0476bb6823204b9074909b33e844d9a4

We'll definately want to update the code so when db_index=True the correct index is set

linuxlewis avatar Mar 13 '15 18:03 linuxlewis

@linuxlewis Why the 0.10 on pypi has brought kwargs['db_index'] = True back and make this problem occurs again?

ChiChou avatar May 15 '16 12:05 ChiChou

Regarding the 0.10 on pypi, please note 0.10 is older than 0.9.X. It should probably have been 0.1.0::

tag 0.10   Latest commit 111a5ac  on May 22, 2014
tag 0.9.3  Latest commit 2ffd9c9  on Mar 31, 2015

Also note that since Django 1.10 there is this which may be helpful: https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/search/#django.contrib.postgres.search.SearchVectorField

daniel5gh avatar Aug 23 '16 12:08 daniel5gh