Highlight function confuses boundary_max_scan and boundary_scan_max
ZomboDB version: tag v3000.0.3
Postgres version: 13.4
Elasticsearch version: 7.15.1
Problem Description:
Setting a boundary scan limit through zdb.highlight() fails with either argument name. boundary_max_scan, the name Elasticsearch documents (https://www.elastic.co/guide/en/elasticsearch/reference/current/highlighting.html#highlighting-settings), is not a parameter of zdb.highlight(). boundary_scan_max, the name ZomboDB documents (https://github.com/zombodb/zombodb/blob/master/SCORING-HIGHLIGHTING.md?plain=1#L100), is accepted by Postgres but rejected by Elasticsearch with an HTTP 400. This is potentially due to a confusion between the two names.
Error Message (if any):
tutorial=# SELECT zdb.highlight(ctid, 'long_description') from products where products ==> 'wooden or person';
highlight
--------------------------------------------------------------------------------------------------
{"Throw it at a <em>person</em> with a big <em>wooden</em> stick and hope they don't hit it"}
{"A <em>wooden</em> container that will eventually rot away. Put stuff it in (but not a cat)."}
(2 rows)
tutorial=# SELECT zdb.highlight(ctid, 'long_description', zdb.highlight(boundary_max_scan=>20)) from products where products ==> 'wooden or person';
ERROR: function zdb.highlight(boundary_max_scan => integer) does not exist
LINE 1: SELECT zdb.highlight(ctid, 'long_description', zdb.highlight...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
tutorial=# SELECT zdb.highlight(ctid, 'long_description', zdb.highlight(boundary_scan_max=>20)) from products where products ==> 'wooden or person';
ERROR: HTTP 400 {
"error": {
"root_cause": [
{
"type": "x_content_parse_exception",
"reason": "[1:446] [highlight_field] unknown field [boundary_scan_max] did you mean any of [boundary_scanner, boundary_scanner_locale, boundary_max_scan, boundary_chars]?"
}
],
"type": "x_content_parse_exception",
"reason": "[1:466] [highlight] failed to parse field [fields]",
"caused_by": {
"type": "x_content_parse_exception",
"reason": "[1:466] [fields] failed to parse field [long_description]",
"caused_by": {
"type": "x_content_parse_exception",
"reason": "[1:446] [highlight_field] unknown field [boundary_scan_max] did you mean any of [boundary_scanner, boundary_scanner_locale, boundary_max_scan, boundary_chars]?"
}
}
},
"status": 400
}
CONTEXT: /root/.cargo/registry/src/github.com-1ecc6299db9ec823/pgx-pg-sys-0.1.21/src/pg13.rs:43160:1
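The mismatch in the response above can be illustrated outside the database. The following is a minimal Python sketch, not ZomboDB's actual code: `build_highlight_body` and `unknown_boundary_options` are hypothetical helpers, and the set of known option names is copied only from the "did you mean" list in the error, so it is not the full set Elasticsearch accepts.

```python
import json

# Option names taken from the "did you mean" list in the HTTP 400 response.
# Assumption: this is only the boundary_* subset, not every highlight setting.
KNOWN_BOUNDARY_OPTIONS = {
    "boundary_scanner",
    "boundary_scanner_locale",
    "boundary_max_scan",
    "boundary_chars",
}


def build_highlight_body(field, **options):
    """Hypothetical helper: build the `highlight` part of a search request."""
    return {"highlight": {"fields": {field: dict(options)}}}


def unknown_boundary_options(body):
    """Return (field, key) pairs whose boundary_* key is not a known option."""
    bad = []
    for field, opts in body["highlight"]["fields"].items():
        for key in opts:
            if key.startswith("boundary_") and key not in KNOWN_BOUNDARY_OPTIONS:
                bad.append((field, key))
    return bad


# The key that appears to be forwarded today, and the key Elasticsearch expects.
sent = build_highlight_body("long_description", boundary_scan_max=20)
wanted = build_highlight_body("long_description", boundary_max_scan=20)

print(json.dumps(sent))
print(unknown_boundary_options(sent))
print(unknown_boundary_options(wanted))
```

Under these assumptions, only the body built with boundary_scan_max is flagged, which matches the field Elasticsearch names in its error.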
Table Schema/Index Definition:
db configured as described in (https://github.com/zombodb/zombodb/blob/master/TUTORIAL.md)
Output from select zdb.index_mapping('index_name');:
{"43536.2200.44267.44279": {"mappings": {"properties": {"id": {"type": "long"}, "name": {"type": "text", "copy_to": ["zdb_all"], "analyzer": "zdb_standard", "fielddata": true, "index_prefixes": {"max_chars": 5, "min_chars": 2}}, "price": {"type": "long"}, "zdb_all": {"type": "text", "analyzer": "zdb_all_analyzer"}, "keywords": {"type": "keyword", "copy_to": ["zdb_all"], "normalizer": "lowercase", "ignore_above": 10922}, "zdb_cmax": {"type": "integer"}, "zdb_cmin": {"type": "integer"}, "zdb_ctid": {"type": "long"}, "zdb_xmax": {"type": "long"}, "zdb_xmin": {"type": "long"}, "discontinued": {"type": "boolean"}, "short_summary": {"type": "text", "copy_to": ["zdb_all"], "analyzer": "zdb_standard", "fielddata": true, "index_prefixes": {"max_chars": 5, "min_chars": 2}}, "inventory_count": {"type": "integer"}, "long_description": {"type": "text", "analyzer": "fulltext", "fielddata": true}, "zdb_aborted_xids": {"type": "long"}, "availability_date": {"type": "keyword", "fields": {"date": {"type": "date"}}, "copy_to": ["zdb_all"]}}, "date_detection": false, "dynamic_templates": [{"strings": {"mapping": {"type": "keyword", "copy_to": "zdb_all", "normalizer": "lowercase", "ignore_above": 10922}, "match_mapping_type": "string"}}, {"dates_times": {"mapping": {"type": "keyword", "fields": {"date": {"type": "date", "format": "strict_date_optional_time||epoch_millis||HH:mm:ss.S||HH:mm:ss.SX||HH:mm:ss.SS||HH:mm:ss.SSX||HH:mm:ss.SSS||HH:mm:ss.SSSX||HH:mm:ss.SSSS||HH:mm:ss.SSSSX||HH:mm:ss.SSSSS||HH:mm:ss.SSSSSX||HH:mm:ss.SSSSSS||HH:mm:ss.SSSSSSX"}}, "copy_to": "zdb_all"}, "match_mapping_type": "date"}}, {"objects": {"mapping": {"type": "nested", "include_in_parent": true}, "match_mapping_type": "object"}}], "numeric_detection": false}}}
Other Discussion:
I realize this issue is nearly a year old now, but is the complaint here simply that we accidentally spelled boundary_max_scan (what ES actually wants) as boundary_scan_max?
If so, I'll fix it.