
Support adding fields to an existing index

Open PingXia-at opened this issue 3 years ago • 7 comments

Is your feature request related to a problem? Please describe.
We need to add one or more fields to an existing index. Without this feature, we have to rebuild the entire index whenever a new field is added.

Describe the solution you'd like
Support adding fields to an existing index, something like the ES update mapping API: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html

PingXia-at avatar Oct 05 '22 13:10 PingXia-at

@PingXia-at can you confirm this is needed for tantivy and not quickwit?

(We have plans on how to handle schema changes in quickwit.)

fulmicoton avatar Oct 07 '22 05:10 fulmicoton

@fulmicoton yeah, we use tantivy not quickwit, because we want a search library within our Node process.

PingXia-at avatar Oct 07 '22 21:10 PingXia-at

Please add schema update support in tantivy, it's a common requirement.

gembin avatar Jan 02 '23 23:01 gembin

In Elasticsearch we use this dynamic template:

[
  {
    "text_prefix": {
      "match": "text_*",
      "mapping": {
        "type": "text"
      }
    }
  },
  {
    "keyword_prefix": {
      "match": "keyword_*",
      "mapping": {
        "type": "keyword"
      }
    }
  },
  {
    "boolean_prefix": {
      "match": "boolean_*",
      "mapping": {
        "type": "boolean"
      }
    }
  },
  {
    "datetime_prefix": {
      "match": "date_*",
      "mapping": {
        "type": "date",
        "format": "strict_date_time"
      }
    }
  },
  {
    "integer_prefix": {
      "match": "integer_*",
      "mapping": {
        "type": "integer"
      }
    }
  },
  {
    "float_prefix": {
      "match": "float_*",
      "mapping": {
        "type": "float"
      }
    }
  }
]

We could detect the need to add a field ourselves. The last time I tried, adding fields to the schema through internal APIs more or less worked. An official API for that would still be good. The new JSON field might already cover the same setup, but an explicit schema might still be preferable.
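To make the "detect it ourselves" idea concrete, here is a minimal sketch of the prefix rules from the template above, written in plain Rust with no tantivy calls. The names `FieldKind`, `PREFIX_RULES`, `infer_kind`, and `missing_fields` are hypothetical and only illustrate the detection step; how the detected fields would then be added to an index is exactly what this issue asks for and is not shown.

```rust
use std::collections::HashSet;

/// Field types named in the dynamic template (hypothetical enum, not a tantivy type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FieldKind {
    Text,
    Keyword,
    Boolean,
    Date,
    Integer,
    Float,
}

/// Prefix rules mirroring the `match` patterns of the template, in the same order.
const PREFIX_RULES: [(&str, FieldKind); 6] = [
    ("text_", FieldKind::Text),
    ("keyword_", FieldKind::Keyword),
    ("boolean_", FieldKind::Boolean),
    ("date_", FieldKind::Date),
    ("integer_", FieldKind::Integer),
    ("float_", FieldKind::Float),
];

/// Returns the kind implied by the field name's prefix, or `None` if no rule matches.
fn infer_kind(field_name: &str) -> Option<FieldKind> {
    PREFIX_RULES
        .iter()
        .find(|(prefix, _)| field_name.starts_with(*prefix))
        .map(|(_, kind)| *kind)
}

/// Lists the incoming field names that are not in the known schema yet,
/// together with the kind inferred for each. Names matching no rule are skipped.
fn missing_fields<'a>(
    known: &HashSet<String>,
    incoming: &[&'a str],
) -> Vec<(&'a str, FieldKind)> {
    incoming
        .iter()
        .copied()
        .filter(|name| !known.contains(*name))
        .filter_map(|name| infer_kind(name).map(|kind| (name, kind)))
        .collect()
}
```

With a known schema containing only `text_title`, calling `missing_fields` on a document with the fields `text_title`, `integer_count`, and `other` would report `integer_count` as a new integer field and ignore `other`, since no prefix rule covers it.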

marcbachmann avatar Mar 21 '23 11:03 marcbachmann

Duplicate of https://github.com/quickwit-oss/tantivy/issues/301

PSeitz avatar Mar 21 '23 13:03 PSeitz

Is there any plan to support this feature? It sounds like it has been a known issue for many years.

gembin avatar Mar 22 '23 06:03 gembin

Can you tell us more about your use case? What happens to existing data if you add a new field to the schema? Would docs indexed before the schema change simply have no value for the new field?

We also have a JSON field now, which solves the issue for some use cases.

PSeitz avatar Mar 22 '23 06:03 PSeitz