Remove references to segments memory usage
Elasticsearch is about to stop reporting memory usage of segments via https://github.com/elastic/elasticsearch/pull/75274, including per-segment data structures like terms, points or doc values memory usage.
Rally currently uses these stats in a few places, e.g. esrally/telemetry.py and esrally/metrics.py. Should we stop collecting them?
Please let me know if I should wait for this to be addressed in Rally before merging the Elasticsearch change.
For the record, I tested how Rally behaved with these stats removed and it looks like it handled it gracefully. The output was a bit smaller than usual as Rally had dropped memory usage from it:
$ ./rally race --preserve-install --track=solutions/logs --track-repository=internal --track-params="number_of_replicas:0,raw_data_volume_per_day:5GB,wait_for_status:yellow" --track-revision=778a20b97ee5895e1c4717188f2208e43d7ccf52 --car=4gheap
Auto-updating Rally from origin
Fast-forwarded master to origin/master.
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
[INFO] Preparing for race ...
[INFO] Racing on track [solutions/logs], challenge [logging-indexing] and car ['4gheap'] with version [8.0.0-SNAPSHOT].
Running insert-pipelines [100% done]
Running insert-ilm [100% done]
Running delete-all-datastreams [100% done]
Running delete-all-index-templates [100% done]
Running create-all-index-templates [100% done]
Running create-required-data-streams [100% done]
Running wait-for-datastreams [100% done]
Running bulk-index [100% done]
Running compression-stats [100% done]
------------------------------------------------------
    _______             __   _____
   / ____(_)___  ____ _/ /  / ___/_________  ________
  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
 / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
------------------------------------------------------
| Metric | Task | Value | Unit |
|---------------------------------------------------------------:|-----------------------------:|------------:|-------:|
| Cumulative indexing time of primary shards | | 40.3858 | min |
| Min cumulative indexing time across primary shards | | 0.0172333 | min |
| Median cumulative indexing time across primary shards | | 0.91105 | min |
| Max cumulative indexing time across primary shards | | 23.05 | min |
| Cumulative indexing throttle time of primary shards | | 0 | min |
| Min cumulative indexing throttle time across primary shards | | 0 | min |
| Median cumulative indexing throttle time across primary shards | | 0 | min |
| Max cumulative indexing throttle time across primary shards | | 0 | min |
| Cumulative merge time of primary shards | | 13.8162 | min |
| Cumulative merge count of primary shards | | 128 | |
| Min cumulative merge time across primary shards | | 0 | min |
| Median cumulative merge time across primary shards | | 0.06565 | min |
| Max cumulative merge time across primary shards | | 12.0313 | min |
| Cumulative merge throttle time of primary shards | | 6.97037 | min |
| Min cumulative merge throttle time across primary shards | | 0 | min |
| Median cumulative merge throttle time across primary shards | | 0 | min |
| Max cumulative merge throttle time across primary shards | | 6.45255 | min |
| Cumulative refresh time of primary shards | | 1.26883 | min |
| Cumulative refresh count of primary shards | | 226 | |
| Min cumulative refresh time across primary shards | | 0.00173333 | min |
| Median cumulative refresh time across primary shards | | 0.0445833 | min |
| Max cumulative refresh time across primary shards | | 0.6646 | min |
| Cumulative flush time of primary shards | | 1.42982 | min |
| Cumulative flush count of primary shards | | 84 | |
| Min cumulative flush time across primary shards | | 0.00355 | min |
| Median cumulative flush time across primary shards | | 0.0271833 | min |
| Max cumulative flush time across primary shards | | 0.898817 | min |
| Total Young Gen GC time | | 11.83 | s |
| Total Young Gen GC count | | 1161 | |
| Total Old Gen GC time | | 0 | s |
| Total Old Gen GC count | | 0 | |
| Store size | | 4.97561 | GB |
| Translog size | | 6.14673e-07 | GB |
| Segment count | | 138 | |
| Min Throughput | insert-pipelines | 14.11 | ops/s |
| Mean Throughput | insert-pipelines | 14.11 | ops/s |
| Median Throughput | insert-pipelines | 14.11 | ops/s |
| Max Throughput | insert-pipelines | 14.11 | ops/s |
| 100th percentile latency | insert-pipelines | 990.548 | ms |
| 100th percentile service time | insert-pipelines | 990.548 | ms |
| error rate | insert-pipelines | 0 | % |
| Min Throughput | insert-ilm | 27.04 | ops/s |
| Mean Throughput | insert-ilm | 27.04 | ops/s |
| Median Throughput | insert-ilm | 27.04 | ops/s |
| Max Throughput | insert-ilm | 27.04 | ops/s |
| 100th percentile latency | insert-ilm | 36.5293 | ms |
| 100th percentile service time | insert-ilm | 36.5293 | ms |
| error rate | insert-ilm | 0 | % |
| Min Throughput | delete-all-index-templates | 466.04 | ops/s |
| Mean Throughput | delete-all-index-templates | 466.04 | ops/s |
| Median Throughput | delete-all-index-templates | 466.04 | ops/s |
| Max Throughput | delete-all-index-templates | 466.04 | ops/s |
| 100th percentile latency | delete-all-index-templates | 31.9989 | ms |
| 100th percentile service time | delete-all-index-templates | 31.9989 | ms |
| error rate | delete-all-index-templates | 0 | % |
| Min Throughput | create-all-index-templates | 26.11 | ops/s |
| Mean Throughput | create-all-index-templates | 26.11 | ops/s |
| Median Throughput | create-all-index-templates | 26.11 | ops/s |
| Max Throughput | create-all-index-templates | 26.11 | ops/s |
| 100th percentile latency | create-all-index-templates | 574.15 | ms |
| 100th percentile service time | create-all-index-templates | 574.15 | ms |
| error rate | create-all-index-templates | 0 | % |
| Min Throughput | create-required-data-streams | 5.41 | ops/s |
| Mean Throughput | create-required-data-streams | 5.46 | ops/s |
| Median Throughput | create-required-data-streams | 5.46 | ops/s |
| Max Throughput | create-required-data-streams | 5.51 | ops/s |
| 50th percentile latency | create-required-data-streams | 187.788 | ms |
| 90th percentile latency | create-required-data-streams | 190.911 | ms |
| 100th percentile latency | create-required-data-streams | 191.227 | ms |
| 50th percentile service time | create-required-data-streams | 187.788 | ms |
| 90th percentile service time | create-required-data-streams | 190.911 | ms |
| 100th percentile service time | create-required-data-streams | 191.227 | ms |
| error rate | create-required-data-streams | 0 | % |
| Min Throughput | wait-for-datastreams | 639.47 | ops/s |
| Mean Throughput | wait-for-datastreams | 639.47 | ops/s |
| Median Throughput | wait-for-datastreams | 639.47 | ops/s |
| Max Throughput | wait-for-datastreams | 639.47 | ops/s |
| 50th percentile latency | wait-for-datastreams | 0.742331 | ms |
| 90th percentile latency | wait-for-datastreams | 0.811442 | ms |
| 100th percentile latency | wait-for-datastreams | 4.52598 | ms |
| 50th percentile service time | wait-for-datastreams | 0.742331 | ms |
| 90th percentile service time | wait-for-datastreams | 0.811442 | ms |
| 100th percentile service time | wait-for-datastreams | 4.52598 | ms |
| error rate | wait-for-datastreams | 0 | % |
| Min Throughput | bulk-index | 884.83 | docs/s |
| Mean Throughput | bulk-index | 32233.9 | docs/s |
| Median Throughput | bulk-index | 33330.4 | docs/s |
| Max Throughput | bulk-index | 33871.5 | docs/s |
| 50th percentile latency | bulk-index | 161.223 | ms |
| 90th percentile latency | bulk-index | 406.313 | ms |
| 99th percentile latency | bulk-index | 572.411 | ms |
| 99.9th percentile latency | bulk-index | 1131.62 | ms |
| 99.99th percentile latency | bulk-index | 1651.57 | ms |
| 100th percentile latency | bulk-index | 1787.46 | ms |
| 50th percentile service time | bulk-index | 161.223 | ms |
| 90th percentile service time | bulk-index | 406.313 | ms |
| 99th percentile service time | bulk-index | 572.411 | ms |
| 99.9th percentile service time | bulk-index | 1131.62 | ms |
| 99.99th percentile service time | bulk-index | 1651.57 | ms |
| 100th percentile service time | bulk-index | 1787.46 | ms |
| error rate | bulk-index | 0 | % |
| Min Throughput | compression-stats | 0.21 | ops/s |
| Mean Throughput | compression-stats | 0.36 | ops/s |
| Median Throughput | compression-stats | 0.26 | ops/s |
| Max Throughput | compression-stats | 0.84 | ops/s |
| 50th percentile latency | compression-stats | 18453.7 | ms |
| 90th percentile latency | compression-stats | 70295.3 | ms |
| 100th percentile latency | compression-stats | 104852 | ms |
| 50th percentile service time | compression-stats | 18453.7 | ms |
| 90th percentile service time | compression-stats | 70295.3 | ms |
| 100th percentile service time | compression-stats | 104852 | ms |
| error rate | compression-stats | 9.09 | % |
[WARNING] Error rate is 9.09 for operation 'compression-stats'. Please check the logs.
[INFO] Preserving benchmark candidate installation at [/home/jpountz/.rally/benchmarks/races/62dd094c-22f4-420f-ab96-c1679c3b2b2e/rally-node-0/install/elasticsearch-8.0.0-SNAPSHOT].
----------------------------------
[INFO] SUCCESS (took 1234 seconds)
----------------------------------
And the log contained the following lines:
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,memory_in_bytes]. Returning default value [None]
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,doc_values_memory_in_bytes]. Returning default value [None]
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,stored_fields_memory_in_bytes]. Returning default value [None]
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,terms_memory_in_bytes]. Returning default value [None]
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,norms_memory_in_bytes]. Returning default value [None]
2021-07-20 14:18:51,358 ActorAddr-(T|:39093)/PID:140875 esrally.telemetry WARNING Could not determine value at path [segments,points_memory_in_bytes]. Returning default value [None]
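The warnings above suggest a lookup that walks a path through the stats response and falls back to a default when a key is missing. Here is a minimal sketch of that pattern. The function name `extract_value`, the logger name, and the sample payload are assumptions for illustration, not Rally's actual code:

```python
import logging

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("example.telemetry")


def extract_value(stats, path, default_value=None):
    """Follow ``path`` through nested dicts, returning ``default_value`` if any step is missing."""
    value = stats
    try:
        for key in path:
            value = value[key]
        return value
    except (KeyError, TypeError):
        # A missing key (or a non-dict intermediate value) is not fatal:
        # log a warning and hand back the default instead.
        logger.warning(
            "Could not determine value at path [%s]. Returning default value [%s]",
            ",".join(path),
            default_value,
        )
        return default_value


# Illustrative payload in which the segment memory fields are no longer reported.
primaries = {"segments": {"count": 138}}

segment_count = extract_value(primaries, ["segments", "count"])
segment_memory = extract_value(primaries, ["segments", "memory_in_bytes"])
```

With a lookup like this, a caller can skip any metric whose value is `None`, which would be consistent with the memory usage rows simply being absent from the report.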
Thanks for the issue! Given that Rally already handles this gracefully, there is no immediate urgency to remove these stats. This also means that https://github.com/elastic/elasticsearch/pull/75274 can be merged at any time. :)