[Metricbeat][Kubernetes Volume] Add pvc reference to distinguish ephemeral from persistent volumes
Proposed commit message
[Metricbeat][Kubernetes Volume] Add pvc reference to distinguish ephemeral from persistent volumes
Checklist
- [ ] My code follows the style guidelines of this project
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have made corresponding changes to the default configuration files
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have added an entry in `CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Author's Checklist
- [ ]
How to test this PR locally
Create a PVC and mount the PVC as a volume in a pod, e.g. (on GKE):
```yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-rwo
---
kind: Pod
apiVersion: v1
metadata:
  name: pod-demo
spec:
  volumes:
    - name: pvc-demo-vol
      persistentVolumeClaim:
        claimName: pvc-demo
  containers:
    - name: pod-demo
      image: nginx
      resources:
        limits:
          cpu: 10m
          memory: 80Mi
        requests:
          cpu: 10m
          memory: 80Mi
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: pvc-demo-vol
```
Deploy Metricbeat and check that documents exist in the `kubernetes.volume` dataset with the field `kubernetes.persistentvolumeclaim.name`. Alternatively, execute the tests.
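As a quick verification aid, here is a sketch of an Elasticsearch query body that checks whether `kubernetes.volume` documents carrying the new field exist. The index to run it against and the exact field layout of your deployment are assumptions for this example, not part of the PR:

```python
import json

# Query body: match kubernetes.volume metricset documents that carry
# the PVC reference field added by this PR.
query = {
    "size": 1,
    "query": {
        "bool": {
            "filter": [
                # Restrict to the volume metricset.
                {"term": {"metricset.name": "volume"}},
                # Keep only documents where the new field is populated.
                {"exists": {"field": "kubernetes.persistentvolumeclaim.name"}},
            ]
        }
    },
}

print(json.dumps(query, indent=2))
```

Any non-empty hit count for this query means the new field is reaching Elasticsearch.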
Related issues
- Fixes https://github.com/elastic/beats/issues/6977
Use cases
The use case is being able to monitor persistent volumes while ignoring ephemeral volumes for threshold alerts.
As of now there is no way to distinguish persistent from ephemeral volumes. However, the endpoint called by the module, `$NODE_URL/stats/summary`, does report a `pvcRef` that allows binding the pod's volume to the corresponding PVC.
Example Response:
```json
...
"volume": [
  {
    "time": "2024-04-09T17:34:17Z",
    "availableBytes": 31509590016,
    "capacityBytes": 31526391808,
    "usedBytes": 24576,
    "inodesFree": 1966069,
    "inodes": 1966080,
    "inodesUsed": 11,
    "name": "pvc-demo-vol",
    "pvcRef": {
      "name": "pvc-demo",
      "namespace": "default"
    }
  },
  {
    "time": "2024-04-09T17:34:17Z",
    "availableBytes": 83873792,
    "capacityBytes": 83886080,
    "usedBytes": 12288,
    "inodesFree": 502853,
    "inodes": 502862,
    "inodesUsed": 9,
    "name": "kube-api-access-h2cll"
  }
]
...
```
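To illustrate the distinction this response enables (this is an illustrative sketch, not the PR's Go implementation): a volume is PVC-backed exactly when its stats entry carries a `pvcRef`, and that reference is what gets surfaced as `kubernetes.persistentvolumeclaim.name`.

```python
# Abridged entries from the kubelet stats summary response above.
volumes = [
    {"name": "pvc-demo-vol",
     "pvcRef": {"name": "pvc-demo", "namespace": "default"}},
    {"name": "kube-api-access-h2cll"},
]

# Persistent volumes carry a pvcRef; ephemeral ones do not.
persistent = [v["name"] for v in volumes if "pvcRef" in v]
ephemeral = [v["name"] for v in volumes if "pvcRef" not in v]

print(persistent)  # ['pvc-demo-vol']
print(ephemeral)   # ['kube-api-access-h2cll']
```

A threshold alert can then filter on the presence of the PVC name field and ignore ephemeral volumes such as projected service-account tokens.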
Screenshots
Logs
This pull request does not have a backport label. If this is a bug or security fix, could you label this PR @herrBez? 🙏. For such, you'll need to label your PR with:
- The upcoming major version of the Elastic Stack
- The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)
To fixup this pull request, you need to add the backport labels for the needed branches, such as:
- `backport-v8./d.0` is the label to automatically backport to the `8./d` branch. `/d` is the digit.
:green_heart: Build Succeeded
Build stats
- Duration: 101 min 10 sec
:grey_exclamation: Flaky test report
No test was executed to be analysed.
:robot: GitHub comments
To re-run your PR in the CI, just comment with:
- `/test`: Re-trigger the build.
- `/package`: Generate the packages and run the E2E tests.
- `/beats-tester`: Run the installation tests with beats-tester.
- `run elasticsearch-ci/docs`: Re-trigger the docs validation. (use unformatted text in the comment!)
cc @elastic/obs-ds-hosted-services
A new field should also be added in the fields file: https://github.com/elastic/beats/blob/main/metricbeat/module/kubernetes/volume/_meta/fields.yml
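For illustration only, a hypothetical sketch of what such a `fields.yml` entry could look like. The field name and the "PVC name." description come from this PR's error log below; the exact nesting and layout are assumptions, not the merged definition:

```yaml
- name: persistentvolumeclaim.name
  type: keyword
  description: >
    PVC name.
```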
Hi,
I tried to address the raised point about the tests ^^. I also tried specifying the field. What I am not sure about is that in the end I want to "reuse" `kubernetes.persistentvolumeclaim.name`, which is part of the `kubernetes.state_persistentvolumeclaims` dataset. My doubt comes from the fact that we did not specify `kubernetes.pod.name`. WDYT?
Apparently, adding the field broke the pipeline with the following error:
```
[2024-05-02T16:39:21.927Z] E Exception: export command returned with an error: Error generating Index Pattern: field <kubernetes.persistentvolumeclaim.name> is duplicated, remove it or set 'overwrite: true', {Name:name Type:keyword Description:PVC name. Format: Fields:[] MultiFields:[] Enabled:
Analyzer:{Name: Definition: } SearchAnalyzer:{Name: Definition: } Norms:false Dynamic:{Value: } Index: DocValues: CopyTo: IgnoreAbove:0 AliasPath: MigrationAlias:false Dimension: DynamicTemplate:false Unit: MetricType: ObjectType: ObjectTypeMappingType: ScalingFactor:0 ObjectTypeParams:[] Analyzed: Count:0 Searchable: Aggregatable: Script: Pattern: InputFormat: OutputFormat: OutputPrecision: LabelTemplate: UrlTemplate:[] OpenLinkInCurrentTab: Overwrite:false DefaultField: Path:kubernetes.persistentvolumeclaim.name}, {"aggregatable":true,"analyzed":false,"count":0,"doc_values":true,"indexed":true,"name":"kubernetes.persistentvolumeclaim.name","scripted":false,"searchable":true,"type":"string"}.
```
@herrBez in beats, the different kubernetes metricsets share the same index and mappings, so if the field is already declared, you don't need to declare it again. There is a duplicate key.
In the integrations repo, when this field is also added, you will need to add it to the `kubernetes.volume` data stream as well. The reason is that in agent each data stream has its own index and hence its own mappings.
This pull request is now in conflicts. Could you fix it? 🙏 To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/
```shell
git fetch upstream
git checkout -b kubernetes-pvc-volume upstream/kubernetes-pvc-volume
git merge upstream/main
git push upstream kubernetes-pvc-volume
```
Hi Michael, I addressed the required changes and the build works fine.
@herrBez Can you add a screenshot showing the new field being populated to Elasticsearch? You can use a view from discovery filtering by the kubernetes.volume metricset and showing the new field.
Here is the screenshot with the new version:
Can I go ahead and merge it? Should I open a PR in the integration repository to add the field to the kubernetes.volume datastream?
> Can I go ahead and merge it? Should I open a PR in the integration repository to add the field to the kubernetes.volume datastream?
I will merge it. About the field in the integration, not yet. We do it after the release of beats. I will add this to our list because we have some other fields to declare as well.