heroic
Fetching timeseries that contain resource identifiers can cause significant memory usage
When fetching rows from Bigtable that contain resource identifiers, the number of rows that will be accessed is unknown until query time. This can lead to significant memory usage and, in the worst case, an OOM event.
The code below currently fetches the rows via `client.readRows` and then waits for all of the rows to be fetched before processing any results.
https://github.com/spotify/heroic/blob/bbb8d4c02a2402b1fcfd103919dc61b9c68d07af/metric/bigtable/src/main/java/com/spotify/heroic/metric/bigtable/BigtableBackend.java#L443-L466
A possible solution might be to break up the fetch into smaller chunks.
This issue is partially related to https://github.com/spotify/heroic/issues/165, which would reduce the memory consumption when fetching a single row.
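
To make the chunking idea above concrete, here is a minimal sketch of a paged fetch that hands each chunk to a handler before requesting the next one, so that only one chunk is held in memory at a time. Everything in it is hypothetical: `RowSource`, `fetchInChunks`, and `inMemory` are illustrative names, not part of Heroic or the Bigtable client, and the in-memory `TreeMap` only stands in for a sorted row store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Consumer;

class ChunkedFetch {
    /** Hypothetical row source; NOT the Heroic or Bigtable client API. */
    interface RowSource {
        // Returns at most `limit` row keys in [startInclusive, endExclusive), in key order.
        List<String> readRows(String startInclusive, String endExclusive, int limit);
    }

    /**
     * Reads the key range in chunks of at most `chunkSize` rows and passes each
     * chunk to `handler` before fetching the next one. Returns the total row count.
     */
    static int fetchInChunks(RowSource source, String start, String end,
                             int chunkSize, Consumer<List<String>> handler) {
        int total = 0;
        String cursor = start;
        while (true) {
            List<String> chunk = source.readRows(cursor, end, chunkSize);
            if (chunk.isEmpty()) {
                break;
            }
            handler.accept(chunk);
            total += chunk.size();
            if (chunk.size() < chunkSize) {
                break; // a short chunk means the range is exhausted
            }
            // Resume at the smallest key strictly greater than the last key seen.
            cursor = chunk.get(chunk.size() - 1) + "\0";
        }
        return total;
    }

    /** In-memory stand-in for a sorted row store, used only for illustration. */
    static RowSource inMemory(TreeMap<String, String> rows) {
        return (start, end, limit) -> {
            List<String> keys = new ArrayList<>();
            for (String key : rows.subMap(start, true, end, false).keySet()) {
                if (keys.size() == limit) {
                    break;
                }
                keys.add(key);
            }
            return keys;
        };
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = new TreeMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e", "f", "g"}) {
            rows.put(key, "value");
        }
        int total = fetchInChunks(inMemory(rows), "a", "z", 3,
                chunk -> System.out.println("processing " + chunk));
        System.out.println("total rows: " + total);
    }
}
```

Whether this maps cleanly onto the real backend depends on how row ranges and limits can be expressed in the existing `readRows` call, which this sketch does not assume anything about.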