add tbs storage limit and disk-related metrics

Open rubvs opened this issue 1 month ago • 4 comments

Motivation/summary

Expose a new set of metrics to improve tail-based sampling (TBS) observability. This PR tests the metric fields in the indexed data, while the linked PRs test the corresponding mappings. The changes in https://github.com/elastic/elasticsearch/pull/138131 are tested together with the changes in this PR.

See https://github.com/elastic/apm-server/issues/15533#issuecomment-3555233204 for the detailed overview.

Depends on PRs:

  • https://github.com/elastic/elasticsearch/pull/138131
  • https://github.com/elastic/beats/pull/47709
  • https://github.com/elastic/integrations/pull/16029

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

Step 1: Ensure Elasticsearch & Kibana are running

Depends on https://github.com/elastic/elasticsearch/pull/138131 with updates to monitoring-beats.json.

  1. Build Docker image with changes
> cd elasticsearch

# Build the ES image with the added metric mappings.
> ./gradlew buildAarch64DockerImage --rerun-tasks
  2. Edit apm-server/docker-compose.yml and change the ES image.
elasticsearch:
 image: docker.elastic.co/elasticsearch/elasticsearch:9.3.0-custom-SNAPSHOT
 # ... rest of config
  3. Spin up the required services.
> cd apm-server
> docker-compose up elasticsearch kibana
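
Before moving on to Step 2, it can help to confirm that Elasticsearch is actually accepting requests. The following Python sketch polls the cluster health API until the status is yellow or green. It assumes the `http://localhost:9200` address and the `admin`/`changeme` credentials used in the APM Server config in this description; the function names `fetch_health` and `wait_until_ready` are illustrative and not part of this PR.

```python
import base64
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str = "http://localhost:9200",
                 user: str = "admin",
                 password: str = "changeme") -> dict:
    """Call GET /_cluster/health with basic auth and return the parsed JSON."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/_cluster/health",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def wait_until_ready(fetch: Callable[[], dict] = fetch_health,
                     attempts: int = 30,
                     delay: float = 2.0) -> str:
    """Poll until the cluster reports yellow or green, then return the status."""
    for _ in range(attempts):
        try:
            status = fetch().get("status")
        except OSError:
            # Connection refused or reset while the container is still starting.
            status = None
        if status in ("yellow", "green"):
            return status
        time.sleep(delay)
    raise TimeoutError(f"cluster not ready after {attempts} attempts")
```

Calling `wait_until_ready()` with the defaults blocks for up to about a minute. A single-node development cluster often stays yellow, which is why yellow is accepted here.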

Step 2: Create APM Server config

apm-server:
  host: "127.0.0.1:8200"

output.elasticsearch:
  enabled: true
  hosts: ["http://localhost:9200"]
  username: "admin"
  password: "changeme"

monitoring.enabled: true

monitoring.elasticsearch:
  protocol: "http"
  hosts: ["http://localhost:9200"]
  username: "admin"
  password: "changeme"

Step 3: Start APM Server

Run APM Server binary directly:

> cd apm-server
> ./apm-server -e -v -c apm-server.yml

Step 4: Verify Data

Verify data in index .monitoring-beats-7-*

GET .monitoring-beats-7-2025.11.14/_search
{
  "_source": ["beats_stats.metrics.apm-server.sampling.tail"],
  "query": {
    "exists": {
      "field": "beats_stats.metrics.apm-server.sampling.tail"
    }
  },
  "size": 1
}

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 31,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".monitoring-beats-7-2025.11.14",
        "_id": "ApzqgpoBYbFQC8_sQ5JQ",
        "_score": 1,
        "_source": {
          "beats_stats": {
            "metrics": {
              "apm-server": {
                "sampling": {
                  "tail": {
                    "storage": {
                      "value_log_size": 0,
                      "storage_limit": 0,
                      "disk_used": 309071659008,
                      "disk_total": 994662584320,
                      "disk_usage_threshold": 80,
                      "lsm_size": 8891
                    }
                  }
                }
              }
            }
          }
        }
      }
    ]
  }
}
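
The response above can also be checked programmatically. The Python sketch below builds the daily index name, pulls the `storage` object out of the first hit, and derives the disk usage percentage from `disk_used` and `disk_total`. Two assumptions are made here and are not confirmed by this PR: that `disk_usage_threshold` is expressed as a percentage (the sample value is 80), and that the index suffix is simply the date formatted as `YYYY.MM.DD`. All function names are hypothetical.

```python
from datetime import date

# Field path used in the search request above.
TAIL_FIELD = "beats_stats.metrics.apm-server.sampling.tail"


def monitoring_index(day: date) -> str:
    """Build the daily monitoring index name (assumed YYYY.MM.DD suffix)."""
    return f".monitoring-beats-7-{day:%Y.%m.%d}"


def storage_metrics(response: dict) -> dict:
    """Return the sampling.tail.storage object from the first search hit."""
    hits = response["hits"]["hits"]
    if not hits:
        raise LookupError("no documents contain the TBS storage metrics")
    node = hits[0]["_source"]
    for key in TAIL_FIELD.split("."):
        node = node[key]
    return node["storage"]


def disk_usage_percent(storage: dict) -> float:
    """Share of the disk in use, as a percentage of disk_total."""
    return 100.0 * storage["disk_used"] / storage["disk_total"]


def over_threshold(storage: dict) -> bool:
    """Assumption: disk_usage_threshold is a percentage, not a fraction."""
    return disk_usage_percent(storage) >= storage["disk_usage_threshold"]
```

For the sample document above, `disk_used` is 309071659008 bytes out of 994662584320, which works out to roughly 31% and is well below the reported threshold of 80.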

Verify mappings for index .monitoring-beats-7-*

GET .monitoring-beats-7-2025.11.18/_mapping?filter_path=**.storage

{
  ".monitoring-beats-7-2025.11.18": {
    "mappings": {
      "properties": {
        "beats_stats": {
          "properties": {
            "metrics": {
              "properties": {
                "apm-server": {
                  "properties": {
                    "sampling": {
                      "properties": {
                        "tail": {
                          "properties": {
                            "storage": {
                              "properties": {
                                "disk_total": {
                                  "type": "long"
                                },
                                "disk_usage_threshold": {
                                  "type": "long"
                                },
                                "disk_used": {
                                  "type": "long"
                                },
                                "lsm_size": {
                                  "type": "long"
                                },
                                "storage_limit": {
                                  "type": "long"
                                },
                                "value_log_size": {
                                  "type": "long"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
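
Rather than reading the mapping by eye, the six fields can be checked with a small script. The Python sketch below walks a `_mapping` response shaped like the one above and reports any storage field that is missing or not mapped as `long`. The helper name `unexpected_mappings` is hypothetical; the field list is taken from the response shown in this description.

```python
# Nested path from the mapping root down to the storage object.
STORAGE_PATH = ("beats_stats", "metrics", "apm-server",
                "sampling", "tail", "storage")

# Fields expected under storage, all mapped as "long" in the response above.
EXPECTED_FIELDS = (
    "disk_total",
    "disk_usage_threshold",
    "disk_used",
    "lsm_size",
    "storage_limit",
    "value_log_size",
)


def unexpected_mappings(mapping_response: dict) -> dict:
    """Map "index:field" to the actual type for every field that is not long.

    A missing field is reported with the value None.
    """
    problems = {}
    for index, body in mapping_response.items():
        node = body["mappings"]
        for key in STORAGE_PATH:
            # Each level of an object mapping nests its children in "properties".
            node = node.get("properties", {}).get(key, {})
        fields = node.get("properties", {})
        for field in EXPECTED_FIELDS:
            actual = fields.get(field, {}).get("type")
            if actual != "long":
                problems[f"{index}:{field}"] = actual
    return problems
```

An empty result means every expected field is present with type `long` in each index of the response.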

Related issues

Part of https://github.com/elastic/apm-server/issues/15533

rubvs, Nov 14 '25 14:11

:robot: GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

github-actions[bot], Nov 14 '25 14:11

This pull request does not have a backport label. Could you fix it @rubvs? 🙏 To fixup this pull request, you need to add the backport labels for the needed branches, such as:

  • backport-7.17 is the label to automatically backport to the 7.17 branch.
  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • backport-9./d is the label to automatically backport to the 9./d branch. /d is the digit.
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

mergify[bot], Nov 14 '25 14:11

:green_heart: Build Succeeded

History

  • :broken_heart: Build #7279 failed 082a2e7659e23c2d22a39362b2b3bc415027992e
  • :broken_heart: Build #7277 failed 71a5fbbe2398b2c0ee4b61915c8f340e0a4dbf92

elasticmachine, Nov 20 '25 02:11

I also ran the test following the steps posted by Ruben, and can confirm that the metrics mapping appears.

GET .monitoring-beats-7-2025.11.28/_mapping?filter_path=**.storage

{
  ".monitoring-beats-7-2025.11.28": {
    "mappings": {
      "properties": {
        "beats_stats": {
          "properties": {
            "metrics": {
              "properties": {
                "apm-server": {
                  "properties": {
                    "sampling": {
                      "properties": {
                        "tail": {
                          "properties": {
                            "storage": {
                              "properties": {
                                "disk_total": {
                                  "type": "long"
                                },
                                "disk_usage_threshold": {
                                  "type": "long"
                                },
                                "disk_used": {
                                  "type": "long"
                                },
                                "lsm_size": {
                                  "type": "long"
                                },
                                "storage_limit": {
                                  "type": "long"
                                },
                                "value_log_size": {
                                  "type": "long"
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

ericywl, Nov 28 '25 08:11