elasticsearch icon indicating copy to clipboard operation
elasticsearch copied to clipboard

Add generic fallback implementation for synthetic source

Open lkts opened this issue 9 months ago • 5 comments

This PR uses infrastructure from #107567 to implement a fallback implementation of synthetic source for field mappers that don't support it natively. In that case we will store source of such field as is in a separate stored field.

Pending work:

  • Tests for more fields that now support synthetic source
  • There are differences between native synthetic source that is usually generated using doc values and fallback. The fallback is "too precise", f.e. if field value is an empty array, there is no way to know that when using doc values. When source is stored as is, it will be precisely an empty array. That leads to assumptions done in synthetic source tests not matching reality. We may want to create a different set of tests or try to replicate the behavior.
  • Unification of copy_to handling. copy_to is currently not allowed on fields when synthetic source is used, should this be applied uniformly?
  • Documentation

Note that this PR does not cover nested as it is not a FieldMapper however it can be done using similar approach separately.

lkts avatar May 02 '24 22:05 lkts

Pinging @elastic/es-storage-engine (Team:StorageEngine)

elasticsearchmachine avatar May 02 '24 22:05 elasticsearchmachine

Hi @lkts, I've created a changelog YAML for you.

elasticsearchmachine avatar May 02 '24 22:05 elasticsearchmachine

Here is another interesting observation. Currently geo_shape mapper (all of them) defines a block loader that only works with _source. It does not look like that infrastructure supports synthetic source to me. So i think that means that while we can "seamlessly" support synthetic source for geo_shape we will end up breaking ESQL.

lkts avatar May 03 '24 20:05 lkts

I am thinking we should opt-in into this one field at a time by setting supportsSyntheticSourceNatively (maybe should rename this too).

lkts avatar May 03 '24 20:05 lkts

@elasticmachine update branch

lkts avatar May 13 '24 18:05 lkts

Looks good. I'm not sure if array fields are covered too, maybe add a few tests for it. It's fine to deal with them in a follow-up if needed, though.

kkrik-es avatar May 20 '24 12:05 kkrik-es

@elasticmachine update branch

lkts avatar May 21 '24 16:05 lkts