vespa
Invalid package when indexing expression `get_field` is used in a search definition
Vespa version: 6.288.1
Steps to reproduce:
- Use `get_field` as indexing expression in a field's indexing statement (view sample search definition)
- Upload application package to a Vespa cluster (Local Docker image)
Expected result:
- Package is uploaded and indexing expression is used on new feed documents
Actual result:
- Error received when validating package:
{
"error-code": "INVALID_APPLICATION_PACKAGE",
"message": "Invalid application package: default.my-cluster: Error loading model: For search 'sandbox', field 'one_id': For expression 'get_field address_id': Field 'address_id' not found."
}
Notes: I've looked at the source code of this feature and reviewed the test cases to make sure I'm using the expression as intended: https://github.com/vespa-engine/vespa/blob/vespa-6.288.1-1/indexinglanguage/src/test/java/com/yahoo/vespa/indexinglanguage/expressions/GetFieldTestCase.java
Unless I'm missing something, I believe there is an issue when using this expression. Any help or tips on this would be appreciated.
Sample SearchDefinition:
search sandbox {
    document sandbox {
        field some_name type string {
            indexing: summary | index
        }
        struct struct_address {
            field address_id type string {}
            field coordinates type position {}
        }
        field one_address type struct_address {}
        field all_addressses type array<struct_address> {}
    }
    field one_id type string {
        indexing: input one_address | get_field address_id | summary | attribute
    }
}
Thanks for the clear description. Assigning to the right person.
Note that you can skip the get_field instruction and just do
input one_address.address_id | summary | attribute
Thanks for your reply. I tried your suggestion but it still fails, now with this error:
{
"error-code": "INVALID_APPLICATION_PACKAGE",
"message": "Invalid application package: default.my-cluster: Error loading model: Could not parse search definition file 'searchdefinitions/sandbox.sd': Error reported by IL parser: Encountered \" <IDENTIFIER> \"address_id \"\" at line 17, column 35.\nWas expecting one of:\n <INTEGER> ...\n <LONG> ...\n <DOUBLE> ...\n <FLOAT> ...\n \"+\" ...\n \"-\" ...\n \"{\" ...\n \"(\" ...\n <STRING> ...\n \"attribute\" ...\n \"base64decode\" ...\n \"base64encode\" ...\n \"clear_state\" ...\n \"echo\" ...\n \"exact\" ...\n \"flatten\" ...\n \"for_each\" ...\n \"get_field\" ...\n \"get_var\" ...\n \"guard\" ...\n \"hexdecode\" ...\n \"hexencode\" ...\n \"hostname\" ...\n \"if\" ...\n \"index\" ...\n \"input\" ...\n \"join\" ...\n \"lowercase\" ...\n \"ngram\" ...\n \"normalize\" ...\n \"now\" ...\n \"optimize_predicate\" ...\n \"passthrough\" ...\n \"random\" ...\n \"select_input\" ...\n \"set_language\" ...\n \"set_var\" ...\n \"split\" ...\n \"substring\" ...\n \"summary\" ...\n \"switch\" ...\n \"this\" ...\n \"tokenize\" ...\n \"to_array\" ...\n \"to_byte\" ...\n \"to_double\" ...\n \"to_float\" ...\n \"to_int\" ...\n \"to_long\" ...\n \"to_pos\" ...\n \"to_string\" ...\n \"to_wset\" ...\n \"trim\" ...\n \"zcurve\" ...\n \nAt position:\n indexing: input one_address.address_id | summary | attribute\n ^: Encountered \" <IDENTIFIER> \"address_id \"\" at line 17, column 35.\nWas expecting one of:\n <INTEGER> ...\n <LONG> ...\n <DOUBLE> ...\n <FLOAT> ...\n \"+\" ...\n \"-\" ...\n \"{\" ...\n \"(\" ...\n <STRING> ...\n \"attribute\" ...\n \"base64decode\" ...\n \"base64encode\" ...\n \"clear_state\" ...\n \"echo\" ...\n \"exact\" ...\n \"flatten\" ...\n \"for_each\" ...\n \"get_field\" ...\n \"get_var\" ...\n \"guard\" ...\n \"hexdecode\" ...\n \"hexencode\" ...\n \"hostname\" ...\n \"if\" ...\n \"index\" ...\n \"input\" ...\n \"join\" ...\n \"lowercase\" ...\n \"ngram\" ...\n \"normalize\" ...\n \"now\" ...\n \"optimize_predicate\" ...\n \"passthrough\" ...\n \"random\" ...\n 
\"select_input\" ...\n \"set_language\" ...\n \"set_var\" ...\n \"split\" ...\n \"substring\" ...\n \"summary\" ...\n \"switch\" ...\n \"this\" ...\n \"tokenize\" ...\n \"to_array\" ...\n \"to_byte\" ...\n \"to_double\" ...\n \"to_float\" ...\n \"to_int\" ...\n \"to_long\" ...\n \"to_pos\" ...\n \"to_string\" ...\n \"to_wset\" ...\n \"trim\" ...\n \"zcurve\" ...\n \nAt position:\n indexing: input one_address.address_id | summary | attribute\n ^: Error reported by IL parser: Encountered \" <IDENTIFIER> \"address_id \"\" at line 17, column 35.\nWas expecting one of:\n <INTEGER> ...\n <LONG> ...\n <DOUBLE> ...\n <FLOAT> ...\n \"+\" ...\n \"-\" ...\n \"{\" ...\n \"(\" ...\n <STRING> ...\n \"attribute\" ...\n \"base64decode\" ...\n \"base64encode\" ...\n \"clear_state\" ...\n \"echo\" ...\n \"exact\" ...\n \"flatten\" ...\n \"for_each\" ...\n \"get_field\" ...\n \"get_var\" ...\n \"guard\" ...\n \"hexdecode\" ...\n \"hexencode\" ...\n \"hostname\" ...\n \"if\" ...\n \"index\" ...\n \"input\" ...\n \"join\" ...\n \"lowercase\" ...\n \"ngram\" ...\n \"normalize\" ...\n \"now\" ...\n \"optimize_predicate\" ...\n \"passthrough\" ...\n \"random\" ...\n \"select_input\" ...\n \"set_language\" ...\n \"set_var\" ...\n \"split\" ...\n \"substring\" ...\n \"summary\" ...\n \"switch\" ...\n \"this\" ...\n \"tokenize\" ...\n \"to_array\" ...\n \"to_byte\" ...\n \"to_double\" ...\n \"to_float\" ...\n \"to_int\" ...\n \"to_long\" ...\n \"to_pos\" ...\n \"to_string\" ...\n \"to_wset\" ...\n \"trim\" ...\n \"zcurve\" ...\n \nAt position:\n indexing: input one_address.address_id | summary | attribute\n ^: Encountered \" <IDENTIFIER> \"address_id \"\" at line 17, column 35.\nWas expecting one of:\n <INTEGER> ...\n <LONG> ...\n <DOUBLE> ...\n <FLOAT> ...\n \"+\" ...\n \"-\" ...\n \"{\" ...\n \"(\" ...\n <STRING> ...\n \"attribute\" ...\n \"base64decode\" ...\n \"base64encode\" ...\n \"clear_state\" ...\n \"echo\" ...\n \"exact\" ...\n \"flatten\" ...\n \"for_each\" ...\n \"get_field\" 
...\n \"get_var\" ...\n \"guard\" ...\n \"hexdecode\" ...\n \"hexencode\" ...\n \"hostname\" ...\n \"if\" ...\n \"index\" ...\n \"input\" ...\n \"join\" ...\n \"lowercase\" ...\n \"ngram\" ...\n \"normalize\" ...\n \"now\" ...\n \"optimize_predicate\" ...\n \"passthrough\" ...\n \"random\" ...\n \"select_input\" ...\n \"set_language\" ...\n \"set_var\" ...\n \"split\" ...\n \"substring\" ...\n \"summary\" ...\n \"switch\" ...\n \"this\" ...\n \"tokenize\" ...\n \"to_array\" ...\n \"to_byte\" ...\n \"to_double\" ...\n \"to_float\" ...\n \"to_int\" ...\n \"to_long\" ...\n \"to_pos\" ...\n \"to_string\" ...\n \"to_wset\" ...\n \"trim\" ...\n \"zcurve\" ...\n \nAt position:\n indexing: input one_address.address_id | summary | attribute\n ^"
}
Reply to self, I managed to make it work by using quotes:
indexing: input one_address."address_id" | summary | attribute
Reply to my reply: That didn't work. It seemed to, but in reality it was just indexing the .to_string version of the struct and concatenating the field name. Back to square 1.
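To illustrate why the quoted form only appeared to work, here is a small toy model in Python. It is not Vespa code. It assumes that `.` in the indexing statement acts as string concatenation and that `"address_id"` is read as a string literal, which matches the behaviour described above. The class `StructAddress` and both helper functions are hypothetical stand-ins made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class StructAddress:
    # Hypothetical stand-in for the struct_address struct in the sample search definition.
    address_id: str


def concat_expression(struct_value, literal):
    # Toy model of: input one_address . "address_id"
    # Assumption: both operands are turned into strings and joined; no field lookup happens.
    return str(struct_value) + literal


def get_field_expression(struct_value, field_name):
    # Toy model of the intended behaviour of: input one_address | get_field address_id
    # Reads a single named field out of the struct value.
    return getattr(struct_value, field_name)


address = StructAddress(address_id="abc-123")

# String form of the whole struct followed by the literal text "address_id".
concatenated = concat_expression(address, "address_id")

# Only the value of the address_id field.
extracted = get_field_expression(address, "address_id")

print(concatenated)
print(extracted)
```

Under that assumption, the first value is the whole struct rendered as text with the field name appended, while the second is what the `one_id` field was meant to hold.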