label regex not executed on child class

Open camrossi opened this issue 2 years ago • 4 comments

I am trying to extract the EPG-to-port mapping with this configuration:

  epg_to_port:
    class_name: vlanCktEp
    query_parameter: '?rsp-subtree-include=required&rsp-subtree-class=l2RsPathDomAtt&rsp-subtree=children'
    metrics:
      - name: dynamic_binding
        value_name: vlanCktEp.attributes.pcTag
        type: gauge
    labels:
      - property_name: vlanCktEp.attributes.epgDn
        regex: "^uni/tn-(?P<tenant>.*)/ap-(?P<app>.*)/epg-(?P<epg>.*)"
      - property_name: vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn
        regex: "^topology/pod-(?P<podid>[1-9][0-9]*)/node-(?P<nodeid>[1-9][0-9]+)/sys/conng/path-\\[(?P<interface>[^\\]]+)\\]"

But it seems the vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn label is not processed.

I found a way to make it work by pointing the metric at the child. For example, if I set the metric's value_name to vlanCktEp.children.[l2RsPathDomAtt].attributes.parentSKey, then the vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn label is added, as well as vlanCktEp.attributes.epgDn.

Is this expected?

camrossi, Jul 13 '23 05:07

Hi @camrossi. I have trouble running the query: I only have access to the Cisco sandbox, and it returns no data for this query. Still, I would say this is the expected behaviour. The metric you ask for, vlanCktEp.attributes.pcTag, is on the "highest" level of the returned data. Since your value_name does not refer to an array, the children data is never parsed. That means you do not need the query parameter that includes the subtree.

From a label perspective it does not really make sense either, especially not the interface name. Which interface name would be on the metric? If we have 100 children, which means 100 interfaces, which interface should be used as the label? The podid and nodeid would probably be the same for all 100 children, but requesting and parsing all this data just to get the podid and nodeid does not make sense. That could probably be done with a PromQL join.

If you want metrics from the children, your example using vlanCktEp.children.[l2RsPathDomAtt].attributes.parentSKey as value_name will trigger parsing of each child object.

If you need to review the response of the query and follow the processing of the returned data, you can put a breakpoint in the function func (c AciConnection) getByClassQuery(class string, query string) (string, error) in aci_connection.go.
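The difference between a parent-level and a child-level value_name can be illustrated with a small Python sketch. This is not aci-exporter's actual code (the exporter is written in Go); the function name build_samples and the second child port eth1/35 are hypothetical, and the nested "children" layout is assumed to follow the usual APIC response shape for rsp-subtree=children.

```python
def build_samples(obj, child_class=None):
    """Return (labels, value) pairs for one vlanCktEp object.

    With child_class=None the value comes from the parent and the children
    are never visited, so exactly one sample is produced. With a child class,
    one sample is produced per child, carrying parent and child labels.
    """
    parent_attrs = obj["vlanCktEp"]["attributes"]
    parent_labels = {"epgDn": parent_attrs["epgDn"]}
    if child_class is None:
        return [(parent_labels, parent_attrs["pcTag"])]

    samples = []
    for child in obj["vlanCktEp"].get("children", []):
        if child_class in child:
            child_attrs = child[child_class]["attributes"]
            labels = dict(parent_labels, tDn=child_attrs["tDn"])
            samples.append((labels, child_attrs["parentSKey"]))
    return samples


# Sample object; the second child (eth1/35) is made up for illustration.
sample = {
    "vlanCktEp": {
        "attributes": {
            "epgDn": "uni/tn-ocp_sr_iov/ap-netop-ocp_sr_iov/epg-default-sriov-aci-cni-eno10",
            "pcTag": "49155",
        },
        "children": [
            {"l2RsPathDomAtt": {"attributes": {
                "parentSKey": "111",
                "tDn": "topology/pod-1/node-203/sys/conng/path-[eth1/34]"}}},
            {"l2RsPathDomAtt": {"attributes": {
                "parentSKey": "111",
                "tDn": "topology/pod-1/node-203/sys/conng/path-[eth1/35]"}}},
        ],
    }
}

parent_only = build_samples(sample)                    # value_name on the parent
per_child = build_samples(sample, "l2RsPathDomAtt")    # value_name on the child array
print(len(parent_only), len(per_child))
```

With the value on the parent there is a single sample and no unambiguous interface to attach to it; with the value on the child array each port gets its own sample, so the tDn label is well defined.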

thenodon, Jul 13 '23 14:07

Hi @thenodon, I see what you mean. Let me explain what the goal is here and how I made it work for now.

At times it is not easy to understand which EPG/VLAN is deployed on which leaf and port, and tracking the deployment over time can also be a challenge. So my idea was as follows:

  • The vlanCktEp class contains the mapping of an EPG to a VLAN. For example, the (filtered) output below tells me that the EPG uni/tn-ocp_sr_iov/ap-netop-ocp_sr_iov/epg-default-sriov-aci-cni-eno10 is mapped to vlan-102. This can be either a static binding or a dynamic binding done via VMM integration.
"vlanCktEp": {
    "attributes": {
        "adminSt": "active",
        "dn": "topology/pod-1/node-203/sys/ctx-[vxlan-2457601]/bd-[vxlan-14778374]/vlan-[vlan-102]",
        "encap": "vlan-102",
        "epgDn": "uni/tn-ocp_sr_iov/ap-netop-ocp_sr_iov/epg-default-sriov-aci-cni-eno10",
        "fabEncap": "vxlan-18492",
        "hwId": "95",
        "id": "111",
        "name": "ocp_sr_iov:netop-ocp_sr_iov:default-sriov-aci-cni-eno10",
        "pcTag": "49155"
    }
}
  • The child class l2RsPathDomAtt tells me the ports where that EPG is deployed. For example, for epg-default-sriov-aci-cni-eno10 I get 1 child (there is 1 child per port where the EPG is deployed):
{
    "totalCount": "1",
    "imdata": [
        {
            "l2RsPathDomAtt": {
                "attributes": {
                    "dn": "topology/pod-1/node-203/sys/ctx-[vxlan-2457601]/bd-[vxlan-14778374]/vlan-[vlan-102]/rspathDomAtt-[topology/pod-1/node-203/sys/conng/path-[eth1/34]]",
                    "operStQual": "unspecified",
                    "parentSKey": "111",
                    "tDn": "topology/pod-1/node-203/sys/conng/path-[eth1/34]",
                    "tSKey": "eth1/34"
                }
            }
        }
    ]
}

So from the above I know that the EPG is deployed on node-203, port eth1/34.
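As a quick check, the label regexes from the configuration can be run against the sample epgDn and tDn values above. This is a minimal sketch using Python's re module as a stand-in for the exporter's own regex handling; the (?P<name>...) group syntax is accepted by both Python and Go. The helper name extract_labels is hypothetical.

```python
import re

# Patterns copied from the labels section of the query. In the YAML
# double-quoted string, "\\[" unescapes to the regex token "\[".
EPG_DN_RE = re.compile(r"^uni/tn-(?P<tenant>.*)/ap-(?P<app>.*)/epg-(?P<epg>.*)")
TDN_RE = re.compile(
    r"^topology/pod-(?P<podid>[1-9][0-9]*)/node-(?P<nodeid>[1-9][0-9]+)"
    r"/sys/conng/path-\[(?P<interface>[^\]]+)\]"
)


def extract_labels(pattern, value):
    """Return the named groups as a dict, or an empty dict if there is no match."""
    match = pattern.match(value)
    return match.groupdict() if match else {}


epg_labels = extract_labels(
    EPG_DN_RE,
    "uni/tn-ocp_sr_iov/ap-netop-ocp_sr_iov/epg-default-sriov-aci-cni-eno10",
)
port_labels = extract_labels(
    TDN_RE,
    "topology/pod-1/node-203/sys/conng/path-[eth1/34]",
)
print(epg_labels)
print(port_labels)
```

Note that the nodeid group is written as [1-9][0-9]+, so it only matches node IDs with at least two digits.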

So my idea is to build a table that summarises this information, and I thought a smart metric to use would be the pcTag, which identifies the EPG within a VRF/scope.

But clearly this does not work. I have now rewritten the query like this:

      dynamic_binding_info:
        class_name: vlanCktEp
        query_parameter: '?rsp-subtree-include=required&rsp-subtree-class=l2RsPathDomAtt&rsp-subtree=children'
        metrics:
          - name: dynamic_binding
            value_name: vlanCktEp.children.[l2RsPathDomAtt].attributes.operSt
            type: gauge
            value_transform:
              'unknown': 0
              'down': 1
              'up': 2
              'link-up': 3
        labels:
          - property_name: vlanCktEp.attributes.epgDn
            regex: "^uni/tn-(?P<tenant>.*)/ap-(?P<app>.*)/epg-(?P<epg>.*)"
          - property_name: vlanCktEp.attributes.encap
            regex: "^vlan-(?P<vlan>.*)"
          - property_name: vlanCktEp.attributes.pcTag
            regex: "^(?P<pcTag>.*)"
          - property_name: vlanCktEp.children.[l2RsPathDomAtt].attributes.tDn
            regex: "^topology/pod-(?P<podid>[1-9][0-9]*)/node-(?P<nodeid>[1-9][0-9]+)/sys/conng/path-\\[(?P<interface>[^\\]]+)\\]"
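To make the intended result of this query concrete, here is a rough Python sketch of how one child object could end up as a single sample: the operSt string is mapped through value_transform and the extracted labels are attached. The operSt value "up" is assumed (it is not part of the filtered output shown earlier), render_sample is a hypothetical helper, and real exporter output may carry a name prefix or additional labels.

```python
# Mapping copied from value_transform in the query above.
VALUE_TRANSFORM = {"unknown": 0, "down": 1, "up": 2, "link-up": 3}


def render_sample(name, labels, raw_value, transform):
    """Format one sample in the Prometheus text exposition format."""
    # An unmapped state raises KeyError; that is a choice made for this sketch.
    value = transform[raw_value]
    label_text = ",".join(f'{key}="{val}"' for key, val in labels.items())
    return f"{name}{{{label_text}}} {value}"


# Labels as the regexes in the query would extract them from the sample data.
labels = {
    "tenant": "ocp_sr_iov",
    "app": "netop-ocp_sr_iov",
    "epg": "default-sriov-aci-cni-eno10",
    "vlan": "102",
    "pcTag": "49155",
    "podid": "1",
    "nodeid": "203",
    "interface": "eth1/34",
}

line = render_sample("dynamic_binding", labels, "up", VALUE_TRANSFORM)
print(line)
```

Because every child produces its own sample, an EPG deployed on several ports yields several series that differ only in the podid, nodeid and interface labels.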

And it works pretty well. I get two important pieces of information:

  1. The EPG to port to VLAN mapping, which I can filter across any fabric/EPG. [image]

  2. A timeline of which port is used by which EPG. I am still figuring out which visualisation is best, but here is an example where I move a VM from one host to another, and you can see the ports going from 203-204 1/1 to 203-204 1/2. [image]

I hope this makes sense. Do you think the way I run the query now is correct?

camrossi, Jul 13 '23 23:07

Hi @camrossi, and thanks for your detailed answer. I learned a lot more about EPGs and VLANs. And yes, I think you now have the right query, as long as it gives you the response you expect :). When using the array notation, the value_name should be in the child data, but labels can come from the parent, exactly as you have done.

I'm happy that you, with your deep knowledge of ACI, are exploring the capabilities of aci-exporter. It would be cool to start collecting different queries or query groups related to use cases and ACI areas. Maybe we should add some more configuration options so that not all queries have to be in one big configuration file, for example an option to include directories and/or files.

You also ask, in 2, whether this is the best way to visualize it. Maybe we could visualize it as a node graph. I have done some work on this in https://github.com/opsdis/nodegraph-provider, which is a generic way to create graph models that can be visualized in Grafana using Grafana's own "Node graph" plugin or the "Apache ECharts" plugin.

thenodon, Jul 14 '23 07:07

I will try to add some guidelines to the README related to queries, value_name and labels when using children.

thenodon, Jul 20 '23 08:07