loudml
match_all field applies filters to all features instead of just its own
This issue now tracks the bug where using match_all in one feature applies its filters to all features when querying Elasticsearch. The original comment is preserved below.
Is it (or will it be) possible to create and train a model for a single time series, and apply the prediction and anomaly detection to other time series?
For example, if I have created a model that studied the traffic from a single IP address over the course of a day, can I apply that same model to 1000 other IP addresses without having to create and train a new model for each one (which would be resource-heavy)? This use case would apply if I knew the different IP addresses should produce very similar traffic patterns.
I tried to do this with the following features:
"features": {
  "io": [
    {
      "field": "traffic",
      "measurement": "log",
      "metric": "avg",
      "name": "traffic_pattern_to_analyse",
      "match_all": [
        {
          "tag": "ip",
          "value": "192.168.0.123"
        }
      ]
    }
  ],
  "o": [
    {
      "field": "traffic",
      "measurement": "log",
      "metric": "avg",
      "name": "traffic_pattern_to_predict",
      "match_all": [
        {
          "tag": "id",
          "value": "192.168.0.10"
        }
      ]
    }
  ]
}
I was thinking that the traffic_pattern_to_analyse feature would be used as the input to the model, and that the prediction result (and anomaly detection) would be applied to both features as outputs. This didn't work because the Elasticsearch query applies both match_all filters to the entire query, so no results were returned.
Will this be possible in the future? Alternatively, could there be something like the multi-metric feature in X-Pack, which splits a single time series into multiple time series based on a categorical field?
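The behaviour described above can be illustrated without Elasticsearch. The following is a minimal sketch in plain Python, not Loud ML code: the sample documents and the helper names (matches, avg_global_filters, avg_per_feature_filters) are hypothetical, and both features are assumed to filter on the ip tag. It contrasts ANDing every feature's match_all conditions over the whole query with scoping each condition list to its own feature.

```python
from statistics import mean

# Hypothetical sample documents; field and tag names mirror the config above.
docs = [
    {"ip": "192.168.0.123", "traffic": 10.0},
    {"ip": "192.168.0.123", "traffic": 14.0},
    {"ip": "192.168.0.10", "traffic": 3.0},
]

# Assumption: both features filter on the "ip" tag.
features = [
    {"name": "traffic_pattern_to_analyse", "field": "traffic",
     "match_all": [{"tag": "ip", "value": "192.168.0.123"}]},
    {"name": "traffic_pattern_to_predict", "field": "traffic",
     "match_all": [{"tag": "ip", "value": "192.168.0.10"}]},
]


def matches(doc, conditions):
    """True if the document satisfies every tag/value condition."""
    return all(doc.get(c["tag"]) == c["value"] for c in conditions)


def avg_global_filters(docs, features):
    """Reported behaviour: all match_all conditions are ANDed on the whole query."""
    all_conditions = [c for f in features for c in f["match_all"]]
    selected = [d for d in docs if matches(d, all_conditions)]
    return {
        f["name"]: mean(d[f["field"]] for d in selected) if selected else None
        for f in features
    }


def avg_per_feature_filters(docs, features):
    """Expected behaviour: each feature only sees documents matching its own conditions."""
    result = {}
    for f in features:
        selected = [d for d in docs if matches(d, f["match_all"])]
        result[f["name"]] = mean(d[f["field"]] for d in selected) if selected else None
    return result


global_result = avg_global_filters(docs, features)
scoped_result = avg_per_feature_filters(docs, features)
```

No document can carry two different ip values at once, so the globally filtered variant selects nothing, while the per-feature variant yields one average per feature.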
@daradermody
Hi Dara, good catch, thanks! Two things in your comment:
- [ ] match_all bug with Elasticsearch. You're right, the filter should apply only to the feature where match_all is declared. We need to fix this.
- [ ] Reusing a model with other series: duplicate #36
Hey @regel, thanks for the clarification! #36 seems to be what I'm looking for, so I'll keep an eye on that. Do you want to keep this ticket open to track the match_all bug, or create a separate ticket?
The match_all logic in elastic.py will have to be changed to solve this issue, and we need a new unit test in test_elastic.py to cover this scenario.
@daradermody feel free to submit a pull request
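One possible shape for such a change, offered as a sketch rather than as the actual elastic.py code: Elasticsearch's filter bucket aggregation can scope a set of term filters to a single sub-aggregation, so each feature gets its own filter instead of sharing one query-level filter. The function name build_feature_aggs and the sub-aggregation key "value" are hypothetical, and the time bucketing that a real query would need (for example a date_histogram) is left out for brevity.

```python
def build_feature_aggs(features):
    """Build one filter aggregation per feature (hypothetical helper).

    Each feature's match_all conditions become term filters that apply
    only to that feature's metric aggregation.
    """
    aggs = {}
    for feature in features:
        terms = [
            {"term": {cond["tag"]: cond["value"]}}
            for cond in feature.get("match_all", [])
        ]
        # A feature without conditions should see every document.
        scope = {"bool": {"filter": terms}} if terms else {"match_all": {}}
        aggs[feature["name"]] = {
            "filter": scope,
            "aggs": {
                "value": {feature["metric"]: {"field": feature["field"]}},
            },
        }
    return aggs


# Example features; names and tag values are illustrative.
example_features = [
    {"name": "traffic_pattern_to_analyse", "field": "traffic", "metric": "avg",
     "match_all": [{"tag": "ip", "value": "192.168.0.123"}]},
    {"name": "traffic_pattern_to_predict", "field": "traffic", "metric": "avg",
     "match_all": [{"tag": "ip", "value": "192.168.0.10"}]},
    {"name": "total_traffic", "field": "traffic", "metric": "avg"},
]

example_aggs = build_feature_aggs(example_features)
```

A unit test for this scenario could then assert that each feature's filter contains only its own conditions, which is the property the current behaviour violates.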