
[BUG] Incomplete search results against rollup index composed of data from multiple rollup jobs

Open sharebear opened this issue 11 months ago • 1 comment

Describe the bug

Incomplete results when querying a rollup index with a date histogram.

To Reproduce

Enter the following into the dev console and execute each request in sequence, waiting for the rollup jobs to complete before running the searches.

# Insert some test data

POST sharej-data-2023-07-25/_doc
{
  "timestamp": "2023-07-25T14:10:43.1",
  "numberOfCalls": 1
}

POST sharej-data-2023-07-25/_doc
{
  "timestamp": "2023-07-25T14:12:43.1",
  "numberOfCalls": 1
}

POST sharej-data-2023-07-24/_doc
{
  "timestamp": "2023-07-24T13:14:43.1",
  "numberOfCalls": 4
}

POST sharej-data-2023-07-23/_doc
{
  "timestamp": "2023-07-23T15:11:43.1",
  "numberOfCalls": 2
}

POST sharej-data-2023-07-23/_doc
{
  "timestamp": "2023-07-23T17:18:43.1",
  "numberOfCalls": 4
}

# Create rollup job for above indexes, emulating what happens when you have an ISM policy applying the rollup to each index after X days
PUT _plugins/_rollup/jobs/sharej-rollup-2023-07-23
{
  "rollup": {
    "enabled": true,
    "source_index": "sharej-data-2023-07-23",
    "target_index": "rollup-sharej-data-2023",
    "schedule": {
      "interval": {
        "start_time": 1,
        "period": "1",
        "unit": "Minutes"
      }
    },
    "description": "Test rollup",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "source_field": "timestamp",
          "fixed_interval": "1h",
          "timezone": "UTC"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "numberOfCalls",
        "metrics": [
          {
            "avg": {}
          },
          {
            "sum": {}
          },
          {
            "max": {}
          },
          {
            "min": {}
          },
          {
            "value_count": {}
          }
        ]
      }
    ]
  }
}

PUT _plugins/_rollup/jobs/sharej-rollup-2023-07-24
{
  "rollup": {
    "enabled": true,
    "source_index": "sharej-data-2023-07-24",
    "target_index": "rollup-sharej-data-2023",
    "schedule": {
      "interval": {
        "start_time": 1,
        "period": "1",
        "unit": "Minutes"
      }
    },
    "description": "Test rollup",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "source_field": "timestamp",
          "fixed_interval": "1h",
          "timezone": "UTC"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "numberOfCalls",
        "metrics": [
          {
            "avg": {}
          },
          {
            "sum": {}
          },
          {
            "max": {}
          },
          {
            "min": {}
          },
          {
            "value_count": {}
          }
        ]
      }
    ]
  }
}

PUT _plugins/_rollup/jobs/sharej-rollup-2023-07-25
{
  "rollup": {
    "enabled": true,
    "source_index": "sharej-data-2023-07-25",
    "target_index": "rollup-sharej-data-2023",
    "schedule": {
      "interval": {
        "start_time": 1,
        "period": "1",
        "unit": "Minutes"
      }
    },
    "description": "Test rollup",
    "page_size": 1000,
    "delay": 0,
    "continuous": false,
    "dimensions": [
      {
        "date_histogram": {
          "source_field": "timestamp",
          "fixed_interval": "1h",
          "timezone": "UTC"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "numberOfCalls",
        "metrics": [
          {
            "avg": {}
          },
          {
            "sum": {}
          },
          {
            "max": {}
          },
          {
            "min": {}
          },
          {
            "value_count": {}
          }
        ]
      }
    ]
  }
}

# Watch status of rollup jobs until complete

GET _plugins/_rollup/jobs/sharej-rollup-2023-07-23/_explain

GET _plugins/_rollup/jobs/sharej-rollup-2023-07-24/_explain

GET _plugins/_rollup/jobs/sharej-rollup-2023-07-25/_explain

# Execute query against source data. Three buckets returned
GET sharej-data-2023-*/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1d"
      },
      "aggregations": {
        "totalCalls": {
          "sum": {
            "field": "numberOfCalls"
          }
        }
      }
    }
  }
}

# Execute query against rollup data. Only 1 bucket returned!?!?!? Where's the rest of the data?
GET rollup-sharej-data-2023/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1d"
      },
      "aggregations": {
        "totalCalls": {
          "sum": {
            "field": "numberOfCalls"
          }
        }
      }
    }
  }
}

# Execute against rollup data without query. Expected result again (but this isn't the query we get when adding a visualisation)
GET rollup-sharej-data-2023/_search
{
  "size": 0,
  "aggregations": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1d"
      },
      "aggregations": {
        "totalCalls": {
          "sum": {
            "field": "numberOfCalls"
          }
        }
      }
    }
  }
}
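The three rollup job definitions in the reproduction differ only in the job name and `source_index`. As a rough sketch (the `ROLLUP_TEMPLATE` and `rollup_request` names are hypothetical helpers, not part of the plugin), the same requests could be generated from one template, which makes it easier to see that all three jobs write to the same target index with identical dimensions and metrics:

```python
import json
from copy import deepcopy

# Settings shared by all three jobs in the reproduction (values copied from it).
ROLLUP_TEMPLATE = {
    "rollup": {
        "enabled": True,
        "source_index": None,  # filled in per daily index
        "target_index": "rollup-sharej-data-2023",
        "schedule": {
            "interval": {"start_time": 1, "period": "1", "unit": "Minutes"}
        },
        "description": "Test rollup",
        "page_size": 1000,
        "delay": 0,
        "continuous": False,
        "dimensions": [
            {
                "date_histogram": {
                    "source_field": "timestamp",
                    "fixed_interval": "1h",
                    "timezone": "UTC",
                }
            }
        ],
        "metrics": [
            {
                "source_field": "numberOfCalls",
                "metrics": [
                    {"avg": {}},
                    {"sum": {}},
                    {"max": {}},
                    {"min": {}},
                    {"value_count": {}},
                ],
            }
        ],
    }
}


def rollup_request(day: str) -> tuple[str, dict]:
    """Return (path, body) for the rollup job covering one daily index."""
    body = deepcopy(ROLLUP_TEMPLATE)  # never mutate the shared template
    body["rollup"]["source_index"] = f"sharej-data-{day}"
    return f"_plugins/_rollup/jobs/sharej-rollup-{day}", body


if __name__ == "__main__":
    # Print the three PUT requests in dev console form.
    for day in ("2023-07-23", "2023-07-24", "2023-07-25"):
        path, body = rollup_request(day)
        print(f"PUT {path}")
        print(json.dumps(body, indent=2))
```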

Expected behavior

All three queries at the end should return the same results. What appears to be happening is that the results from only one of the rollup jobs are returned when the query parameter is provided to the search against the rollup index.
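One way to check this expectation mechanically is to compare the `by_day` buckets of the responses. The sketch below assumes the standard search response shape (`aggregations.by_day.buckets`, each bucket carrying a `key` and a `totalCalls.value`); `bucket_totals` and `missing_buckets` are hypothetical helper names.

```python
def bucket_totals(response: dict, agg: str = "by_day", metric: str = "totalCalls") -> dict:
    """Map each date_histogram bucket key (epoch millis) to its metric value."""
    buckets = response["aggregations"][agg]["buckets"]
    return {bucket["key"]: bucket[metric]["value"] for bucket in buckets}


def missing_buckets(expected: dict, actual: dict) -> list:
    """Return bucket keys whose value is absent from, or different in, `actual`."""
    # Exact comparison is enough here because the test data sums to small integers.
    return sorted(key for key, value in expected.items() if actual.get(key) != value)
```

Passing the totals from the source-index search as `expected` and the totals from a rollup-index search as `actual` should give an empty list if the two agree; any keys returned are the days that went missing.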

Host/Environment (please complete the following information):

  • OS: Linux (discovered in Aiven hosted version but behaviour reproduced locally with docker image)
  • Version: 2.8.0

Additional context

We've got some metrics that we post to daily indexes. We have an ISM policy applied to the daily index pattern that, after three days, performs a rollup to an annual index and deletes the source index. When trying to create visualisations based on the rollup index we're getting strange results. When hand-crafting a search against the rollup index I'm able to see that all the expected data is there, but when placing the equivalent query via a visualisation on a dashboard we're missing data. The difference between my hand-crafted search and the search from the dashboard is the presence of the query field, which narrows down the time-frame and optionally drills down on other facets (not included in the code example above). How do we get our visualisations to show all the data, or have I stumbled upon a genuine bug here?
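To make the difference between the hand-crafted search and the dashboard search concrete, here is a minimal sketch that builds the same aggregation body with and without a `query` field. `by_day_search` is a hypothetical helper, and the `range` filter is only an assumption about what a dashboard time filter might send; the actual dashboard request is not shown in this report.

```python
from typing import Optional


def by_day_search(query: Optional[dict] = None) -> dict:
    """Build the daily-sum search body, adding a `query` field only if one is given."""
    body = {
        "size": 0,
        "aggregations": {
            "by_day": {
                "date_histogram": {"field": "timestamp", "fixed_interval": "1d"},
                "aggregations": {
                    "totalCalls": {"sum": {"field": "numberOfCalls"}}
                },
            }
        },
    }
    if query is not None:
        body["query"] = query
    return body


# No query field: the variant reported as returning all expected buckets.
hand_crafted = by_day_search()

# match_all query: the variant reported as returning only one bucket.
with_match_all = by_day_search({"match_all": {}})

# Assumed shape of a dashboard time filter (hypothetical, for illustration only).
time_filtered = by_day_search(
    {"range": {"timestamp": {"gte": "2023-07-23T00:00:00Z", "lt": "2023-07-26T00:00:00Z"}}}
)
```

The three bodies are identical apart from the `query` field, so any difference in the returned buckets can be attributed to how the rollup index handles that field.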

sharebear, Jul 25 '23 13:07

Should we move this to https://github.com/opensearch-project/index-management ?

msfroh, Aug 16 '23 16:08