elasticsearch-learning-to-rank
active_features returns non-active features in _ltrlog for SLTR query
When logging features, the active_features field isn't respected in the SLTR query: features that are not specified in active_features are still returned in the _ltrlog, without score values.
Sometimes you might want to execute your query on a subset of the features rather than use all the ones specified in the model. In this case the features not specified in active_features list will not be scored upon. They will be marked as missing. You only need to specify the params applicable to the active_features. If you request a feature name that is not a part of the feature set assigned to that model the query will throw an error.
This is a bit confusing to use, as I've noticed that scores are also not returned if there's no match for features specified as active_features (see examples below). Would it make more sense to exclude features not specified in active_features from the "_ltrlog" altogether, or is there a reason they are included?
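The rules described above for active_features can also be mirrored client-side before a request is sent. This is only a minimal Python sketch, not part of the plugin: `build_sltr_query` and its arguments are hypothetical names, and it assumes the featureset definition is available locally as a list of dicts with "name" and "params" keys.

```python
def build_sltr_query(featureset_name, features, active_features, params, name=None):
    """Build an sltr query body restricted to a subset of features.

    Hypothetical helper: it only mirrors, on the client side, the rules
    described above for active_features.
    """
    by_name = {f["name"]: f for f in features}

    # A feature name outside the featureset is described above as an error,
    # so reject it before sending anything.
    unknown = [n for n in active_features if n not in by_name]
    if unknown:
        raise ValueError(f"not in featureset {featureset_name!r}: {unknown}")

    # Only the params used by the active features need to be supplied.
    required = {p for n in active_features for p in by_name[n].get("params", [])}
    missing = sorted(required - set(params))
    if missing:
        raise ValueError(f"missing params for active features: {missing}")

    sltr = {
        "featureset": featureset_name,
        "active_features": list(active_features),
        "params": dict(params),
    }
    if name is not None:
        sltr["_name"] = name
    return {"sltr": sltr}


# Local copy of the featureset used in this report (names and params only).
features = [
    {"name": "title_query", "params": ["query_text"]},
    {"name": "description_query", "params": ["query_text"]},
]
query = build_sltr_query(
    "moviefeatureset",
    features,
    ["title_query"],
    {"query_text": "First"},
    name="logged_featureset",
)
```

The resulting `query` dict has the same shape as the SLTR query bodies in this report and could be placed inside a bool filter as in the cases below.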
Expected behaviour
When sending an SLTR query with a subset of active_features specified, e.g.
{
  "sltr": {
    "featureset": "moviefeatureset",
    "_name": "logged_featureset",
    "active_features": [
      "title_query"
    ],
    "params": {
      "query_text": "First"
    }
  }
}
I expected to only see the specified feature title_query returned in the _ltrlog. Instead, the returned response contains all features in the featureset, without score values for non-active features, e.g.
Expected:
"_ltrlog": [
  {
    "log_entry1": [
      {
        "name": "title_query",
        "value": 0.2876821
      }
    ]
  }
]
Actual:
"_ltrlog": [
  {
    "log_entry1": [
      {
        "name": "title_query",
        "value": 0.2876821
      },
      {
        "name": "description_query"
      }
    ]
  }
]
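Until the logging behaviour changes, the expected output can be approximated client-side by dropping entries for features that were not requested. A minimal Python sketch, assuming the `_ltrlog` value has already been pulled out of a search hit as a list of dicts shaped like the "Actual" output above; `filter_active` is a hypothetical helper name, not a plugin API.

```python
def filter_active(ltrlog, active_features):
    """Return a copy of a hit's _ltrlog keeping only the listed features."""
    keep = set(active_features)
    return [
        {
            log_name: [entry for entry in entries if entry["name"] in keep]
            for log_name, entries in log.items()
        }
        for log in ltrlog
    ]


# The "Actual" log from above, as parsed JSON.
actual = [
    {
        "log_entry1": [
            {"name": "title_query", "value": 0.2876821},
            {"name": "description_query"},
        ]
    }
]
filtered = filter_active(actual, ["title_query"])
```

Here `filtered` matches the "Expected" log, and `actual` is left unmodified. This only hides the extra entries; it does not change what the plugin computes.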
Steps to reproduce
The code to reproduce the issue can be found in a POC repo I've created here.
Index:
PUT /movies
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "description": { "type": "text" },
      "year_released": { "type": "integer" }
    }
  }
}
POST /movies/_doc
{
  "title": "First Blood",
  "description": "First Blood is a 1982 American-Canadian action directed by Ted Kotcheff and co-written by and starring Sylvester Stallone as Vietnam War veteran John Rambo.",
  "year_released": 1982
}
Create LTR index and featureset:
PUT /_ltr
POST /_ltr/_featureset/moviefeatureset
{
  "featureset": {
    "features": [
      {
        "name": "title_query",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "title": "{{query_text}}"
          }
        }
      },
      {
        "name": "description_query",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "description": "{{query_text}}"
          }
        }
      }
    ]
  }
}
Case 1 - SLTR query with all active_features
GET /movies/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "sltr": {
            "featureset": "moviefeatureset",
            "_name": "logged_featureset",
            "active_features": [
              "title_query",
              "description_query"
            ],
            "params": {
              "query_text": "First"
            }
          }
        }
      ]
    }
  },
  "ext": {
    "ltr_log": {
      "log_specs": {
        "name": "log_entry1",
        "named_query": "logged_featureset"
      }
    }
  }
}
Returns:
...
"_ltrlog": [
  {
    "log_entry1": [
      {
        "name": "title_query",
        "value": 0.2876821
      },
      {
        "name": "description_query",
        "value": 0.2876821
      }
    ]
  }
]
...
Case 2 - SLTR query with single active feature
GET /movies/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "sltr": {
            "featureset": "moviefeatureset",
            "_name": "logged_featureset",
            "active_features": [
              "title_query"
            ],
            "params": {
              "query_text": "First"
            }
          }
        }
      ]
    }
  },
  "ext": {
    "ltr_log": {
      "log_specs": {
        "name": "log_entry1",
        "named_query": "logged_featureset"
      }
    }
  }
}
Returns:
...
"_ltrlog": [
  {
    "log_entry1": [
      {
        "name": "title_query",
        "value": 0.2876821
      },
      {
        "name": "description_query"
      }
    ]
  }
]
...
Case 3 - Index doc without a description
POST /movies/_doc
{
  "title": "First Blood",
  "year_released": 1982
}
SLTR query with description_query in active_features:
{
  "sltr": {
    "featureset": "moviefeatureset",
    "_name": "logged_featureset",
    "active_features": [
      "title_query",
      "description_query"
    ],
    "params": {
      "query_text": "First"
    }
  }
}
Returns:
...
"_ltrlog": [
  {
    "log_entry1": [
      {
        "name": "title_query",
        "value": 0.2876821
      },
      {
        "name": "description_query"
      }
    ]
  }
]
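Cases 2 and 3 return identical log entries, so from the log alone it isn't possible to tell an inactive feature from an active one that didn't match. The only way I can see to separate them today is to compare against the active_features list that was sent. A minimal Python sketch of that comparison; `label_entries` and the three labels are hypothetical, and treating "active but no value" as "no match" is my assumption based on Case 3.

```python
def label_entries(entries, active_features):
    """Label each log entry as 'scored', 'no_match', or 'inactive'."""
    active = set(active_features)
    labels = {}
    for entry in entries:
        if "value" in entry:
            labels[entry["name"]] = "scored"
        elif entry["name"] in active:
            # Requested but no value logged: assumed to be a non-match.
            labels[entry["name"]] = "no_match"
        else:
            # Not requested in active_features at all.
            labels[entry["name"]] = "inactive"
    return labels


# The same log entries are returned in Case 2 and Case 3.
entries = [
    {"name": "title_query", "value": 0.2876821},
    {"name": "description_query"},
]
case2 = label_entries(entries, ["title_query"])
case3 = label_entries(entries, ["title_query", "description_query"])
```

With the same `entries`, `case2` labels description_query as inactive while `case3` labels it as a non-match, which is the distinction the log itself doesn't carry.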
...