djorm-ext-pgfulltext icon indicating copy to clipboard operation
djorm-ext-pgfulltext copied to clipboard

The default index on VectorField is btree not gin

Open wolever opened this issue 10 years ago • 2 comments

At least on Postgres 9.2, the VectorField is created with a btree index, which can't be (as far as I can tell) used for optimizing full text search queries.

It doesn't look like this can be addressed at the Django level (their function for creating field indexes looks pretty hard coded), so maybe it would be good to have a method like SearchManager.create_search_field_index()? And bonus points for something sensible vis-a-vi South integration?

wolever avatar Mar 24 '14 22:03 wolever

Here's what I'm using for now:


    def create_search_field_index(self, using=None):
        using = using or self.db
        connection = connections[using]
        qn = connection.ops.quote_name
        field = self.search_field
        index_type = 'gin'

        tablespace_sql = ''
        tablespace = self.model._meta.db_tablespace
        if tablespace:
            tablespace_sql = connection.ops.tablespace_sql(tablespace)
            if tablespace_sql:
                tablespace_sql = ' ' + tablespace_sql

        self._execute_sql(connection,
            """
                CREATE INDEX {name} ON {table}
                USING {type}({column}){tablespace}
            """,
            name=qn(self._get_unquoted_search_index_name()),
            table=qn(self.model._meta.db_table),
            column=qn(field),
            type=index_type,
            tablespace=tablespace_sql,
        )

    def drop_search_field_index(self, using=None):
        using = using or self.db
        connection = connections[using]
        qn = connection.ops.quote_name
        self._execute_sql(connection,
            "DROP INDEX {name} ON {table}",
            name=qn(self._get_unquoted_search_index_name()),
            table=qn(self.model._meta.db_table),
        )

    def _execute_sql(self, connection, sql, **kwargs):
        cur = connection.cursor()
        cur.execute(sql.format(**kwargs))

    def _get_unquoted_search_index_name(self):
        return "%s_%s" %(
            self.model._meta.db_table,
            self.search_field,
        )

wolever avatar Mar 24 '14 23:03 wolever

As describe in PsotgreSQL doc http://www.postgresql.org/docs/9.4/static/textsearch-indexes.html a Gist/Gin index must be used to index ts_vector. The btree has antoher disavantage to can't contain huge size of text.

rodo avatar Jan 11 '15 18:01 rodo