[BUG] IgnoreMissing in ml inference search request processors doesn't accept optional model input
Current Implementation:
ML inference search request and response processors are the only search processors with an ignoreMissing flag. In the current implementation:
All fields in input_maps and output_maps are treated as required, similar to the text_embedding processor. When a field is missing:
- If ignoreMissing is false (default), an IllegalArgumentException is thrown.
- If ignoreMissing is true, processing is skipped, returning the original query or search hit.
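For reference, a minimal search pipeline sketch using the ml_inference request processor with this flag (in the REST configuration the Java field ignoreMissing surfaces as ignore_missing; the model ID, JSON paths, and model field names below are placeholders that depend on the model interface):

```json
PUT /_search/pipeline/my_request_pipeline
{
  "request_processors": [
    {
      "ml_inference": {
        "model_id": "<model_id>",
        "input_map": [
          {
            "inputs": "query.term.name.value"
          }
        ],
        "output_map": [
          {
            "query.term.name.value": "response"
          }
        ],
        "ignore_missing": false
      }
    }
  ]
}
```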
Pros of Current Implementation:
- Straightforward and predictable behavior
- Efficient for scenarios where inference is needed only for specific fields. This is particularly helpful when a document has many fields, for example 100 fields, but only one field, `name`, is used for model inference. Users can bind this processor to the index and call model inference only when the field `name` is present in the query and matches the JSON path; when querying the other 99 fields, the processor is skipped and no prediction task is wasted, as shown in the query example after this list.
- Prevents unnecessary processing when critical fields are absent
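For instance, with the input_map above bound to `query.term.name.value`, the first query below matches the JSON path and triggers inference, while the second does not match and, with ignore_missing set to true, is passed through unchanged (index and field names are illustrative):

```json
GET /my_index/_search?search_pipeline=my_request_pipeline
{
  "query": { "term": { "name": { "value": "puppy" } } }
}

GET /my_index/_search?search_pipeline=my_request_pipeline
{
  "query": { "term": { "title": { "value": "puppy" } } }
}
```

The first request spends a prediction task; the second skips the model call entirely.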
Cons of Current Implementation:
- Lacks support for optional inputs, which is crucial for multi-model scenarios
- Inconsistent with ML inference ingest processors, potentially causing confusion
Proposed Solutions:
Proposal 1: Align Search Processors with Ingest Processors
- Allow absent fields in input_maps when ignoreMissing is true
- Proceed with model predictions even when some inputs are missing
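For comparison, the ml_inference ingest processor already behaves this way. A minimal ingest pipeline sketch (model ID and field names are placeholders):

```json
PUT /_ingest/pipeline/my_ingest_pipeline
{
  "processors": [
    {
      "ml_inference": {
        "model_id": "<model_id>",
        "input_map": [
          { "text": "name" }
        ],
        "output_map": [
          { "name_embedding": "embedding" }
        ],
        "ignore_missing": true
      }
    }
  ]
}
```

With ignore_missing set to true, documents without `name` still go through prediction, which is exactly the resource-waste risk listed under Cons below.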
Pros:
- Consistent with ingest processor behavior
- Supports optional inputs in predictions
Cons:
- May lead to unnecessary prediction attempts, potentially wasting resources
Proposal 2: Introduce New Flags for Optional Fields
- Add optionalInputs and optionalOutputs flags
- Maintain the current ignoreMissing logic
- Process predictions when at least one required input is present
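A hypothetical configuration under this proposal might look as follows; `optional_inputs` is a proposed parameter that does not exist today (an analogous `optional_outputs` would relax output_map the same way), and the model field names and JSON paths are illustrative:

```json
PUT /_search/pipeline/multi_modal_pipeline
{
  "request_processors": [
    {
      "ml_inference": {
        "model_id": "<model_id>",
        "input_map": [
          {
            "text": "query.term.name.value",
            "image": "query.term.image.value"
          }
        ],
        "optional_inputs": ["image"],
        "output_map": [
          {
            "query.term.name.value": "response"
          }
        ],
        "ignore_missing": true
      }
    }
  ]
}
```

Here inference proceeds when the required `text` input is present even if `image` is absent, but the processor is still skipped when no required input matches the query.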
Pros:
- Offers greater flexibility and control
- Maintains backward compatibility
- Efficient resource utilization
- Supports multi-model scenarios
Cons:
- Adds complexity to processor configuration
Recommendation:
Implement Proposal 2 to provide a more flexible and efficient solution while maintaining backward compatibility. This approach addresses the current limitations and accommodates future needs for complex input configurations in multi-model scenarios.
Additional Context:
Original issue: https://github.com/opensearch-project/ml-commons/issues/3211