
[BUG] Anomaly detector fails to resume after a period of no traffic

gogovan-vincentngai opened this issue · 3 comments

Describe the bug I have set up a demo via docker-compose on my local Mac machine. I have also set up nginx with Filebeat to ship demo logs to my Elasticsearch, and I use Postman to keep sending traffic to the demo nginx.

Other plugins installed: N/A

To Reproduce Steps to reproduce the behavior:

  1. I start the test and inject traffic from 11:00 to 12:00 on 11/23.
  2. I create the anomaly detector (with index pattern filebeat*).
  3. At 12:00 on 11/23 I stop injecting traffic and leave the setup running.
  4. Some time later, the anomaly detector shows "Data is not being ingested correctly".
  5. At 09:00 on 11/24 I resume the traffic, but the anomaly detector does not recover and keeps showing "Data is not being ingested correctly".
  6. At 10:09 on 11/24 I stop and restart the anomaly detector; it keeps showing "Initializing", and I can confirm data has been ingested for over 30 minutes.

Expected behavior

  • The anomaly detector should resume producing results once traffic is restored.

Screenshots: traffic stopped at 12:00 on 11/23 and resumed at 09:00 on 11/24. [Screenshot 2020-11-24 10:26:59 AM] [Screenshot 2020-11-24 9:32:29 AM]

Screenshot, even 30 minutes after stopping and restarting the detector: [Screenshot 2020-11-24 10:14:27 AM]

Desktop (please complete the following information): N/A

Additional context https://discuss.opendistrocommunity.dev/t/questions-about-ml/4175/5

gogovan-vincentngai · Nov 24 '20 06:11

Sorry for the late reply. A few questions: First, for step 6, what's your detection interval? When a detector is stopped, its previous model is erased, and new training starts when it is restarted; that's why you see the detector go back to initializing. For training, we look back over the last 24 hours of data or the last 512 historical samples, whichever has more data points. If you don't have enough data points (we need at least 128 shingles), the training uses live data and has to wait for the live data to come in. A shingle is a consecutive sequence of the most recent records: for example, a shingle of 8 records associated with an entry corresponds to a vector of the last 8 consecutive records received up to and including that entry. Do you have enough shingles in your history going back from 11/24 10:09?
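If it helps, you can check how many data points fall inside that training window with a count query. This is only a sketch: the filebeat* index pattern and the @timestamp field are assumptions based on a default Filebeat setup.

curl -X GET "localhost:9200/filebeat*/_count?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-24h",
        "lte": "now"
      }
    }
  }
}
'

A very low count here would explain why the detector stays in initializing after a restart.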

Second, for step 5, we show that error if the latest 3 shingles are missing. Could you check whether the anomaly result index has feature data in the last 3 intervals? The query would look like this:

curl -X GET "localhost:9200/.opendistro-anomaly-results*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "detector_id": "your-detector-id"
        }
      }
    }
  },
  "size": 1,
  "sort": [
    {
      "data_end_time": {
        "order": "desc"
      }
    }
  ]
}
'
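To inspect the last 3 intervals in one response instead of only the newest document, a small variation of the query above works too; the now-30m window assumes a 10-minute detection interval, so adjust it to yours.

curl -X GET "localhost:9200/.opendistro-anomaly-results*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "detector_id": "your-detector-id" } },
        { "range": { "data_end_time": { "gte": "now-30m" } } }
      ]
    }
  },
  "size": 3,
  "sort": [
    { "data_end_time": { "order": "desc" } }
  ]
}
'

In each hit, check whether the feature data fields are present and non-null.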

kaituo · Dec 07 '20 21:12

I'm experiencing the same problem: "Data is not being ingested correctly for feature: returnCode".

Does this message have something to do with the feature configuration? In my case, I was hoping to detect anomalies based on returnCode (there are over 100 variations), and this is my configuration:

  • returnCode is of type 'text'
  • field: returnCode.keyword
  • Aggregation method: value_count (I selected 'count')
  • Category field: None

When I set the category field to returnCode.keyword, the message above didn't show up.
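One way to narrow this down is to run the same aggregation the feature would run directly against the source index and confirm it returns data. This is only a sketch: your-index, the @timestamp field, and the 10-minute window stand in for your actual index, time field, and detector interval.

curl -X GET "localhost:9200/your-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-10m",
        "lte": "now"
      }
    }
  },
  "aggs": {
    "return_code_count": {
      "value_count": {
        "field": "returnCode.keyword"
      }
    }
  }
}
'

If the value_count comes back 0 for a recent window, the detector would see missing feature data for that interval.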

mustardlove · Feb 02 '21 02:02

Do you have live data? You can verify it by issuing a query against the source index over the window [now - detector interval, now].
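For example, with a 10-minute detector interval, a your-source-index source index, and an @timestamp time field (all assumptions; substitute your own), the check could be:

curl -X GET "localhost:9200/your-source-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 1,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-10m",
        "lte": "now"
      }
    }
  },
  "sort": [
    { "@timestamp": { "order": "desc" } }
  ]
}
'

An empty hits array means no live data arrived in the current interval.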

kaituo · Feb 03 '21 21:02