Implement Pagination Support
Currently aci-exporter works fine for most configurations, but on a large-scale fabric, if a query returns too many objects it might hit the maximum response size the APIC can handle and the query will fail.
aci-exporter should implement pagination
@camrossi do you have any recommendations on what good settings would be? Or do you think it should be defined at the query level?
On the system I tested, the issue was triggered when grabbing vlanCktEp as well as fvAEPg when the child objects are included.
It is a bit of trial and error, but a page size of 1000 seems to work fine for both use cases. I also doubt anyone will have a fabric that big... I tested against one of the fabrics we use for scale testing, which has something like 15K EPGs, 750K static port bindings and 500 switches.
Perhaps set a default of 1000 with an option to override it?
@camrossi I have implemented what I believe is a solution to paging, but... Just setting page and page-size and iterating over the result until all pages are consumed is not enough. For this to work you also need the order-by query parameter to get a consistent response, otherwise the final result includes duplicates and misses objects in a random way. Currently the order-by is set in an implicit way in the doGet method, but it assumes that the class has a dn attribute. Not sure dn is an attribute on all classes, but it looks like it works for all the test queries I have. So the query attribute added is order-by=<class_name of the query>.dn.
Would be great if you could test it on your large cluster. Checkout branch issue_37. The paging size is hard-coded to 1000, but can be made configurable. Not sure if the order-by has performance implications, but maybe you know.
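For reference, a minimal sketch of how such a paged request URL could be put together. This is illustrative only (the helper name and signature are not the actual doGet code); it just shows the page, page-size and order-by parameters described above, with page numbering assumed to start at 0:

package main

import "fmt"

// pagedURL is a hypothetical helper: it appends the paging parameters and the
// implicit order-by=<class>.dn to a class query so the APIC returns a stable order.
func pagedURL(apic, class, query string, page, pageSize int) string {
	sep := "?"
	if query != "" {
		sep = "&"
	}
	return fmt.Sprintf("%s/api/class/%s.json%s%spage=%d&page-size=%d&order-by=%s.dn",
		apic, class, query, sep, page, pageSize, class)
}

func main() {
	// e.g. https://apic1/api/class/fvAEPg.json?page=0&page-size=1000&order-by=fvAEPg.dn
	fmt.Println(pagedURL("https://apic1", "fvAEPg", "", 0, 1000))
}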
Hi @thenodon !
I made a few tests: on a high-scale fabric we need to also pass an API timeout or the APIC might stop processing on its side. I just added &timeout=180 where you set page and page-size. 180 is the timeout we use in the APIC UI as well, so it should be a safe bet.
This was mostly needed for the count queries but it does not hurt to have it set for all the queries.
I am still getting a few queries failing, but I think that is due to some other issue... I am also having some trouble using the
queries=xxx syntax to execute only some queries.
If I do &queries=compound_queries it doesn't execute any queries; same if I do &queries=object_count. Do you support filtering like that?
Anyhow, the filtering works with class_queries, so I managed to run this test and focus on a single request that is failing:
curl -s 'http://localhost:9643/probe?target=scale2-fabric&queries=epg_port_vlan_binding'
This will get ~9 pages of data over 110 seconds, but then the APIC will refuse the next request with an HTTP 400. There are ~20 pages to be retrieved and what I am not really clear on at the moment is why it is happening... I did the exact same thing with curl doing 20 requests (without re-authenticating) and it worked just fine.
Hi @camrossi - first the queries= query attribute should be set to the name of the "query" in the config file. A query is always of one of the types:
- class_queries
- compound_queries
- group_class_queries
So from the example_config.yml the query interface_info is of type class_queries. You cannot set the query parameter queries= to one of the type names, only to a named query. Not sure if this is clear in the documentation.
When it comes to the problem that you do not get all the pages, I can also reproduce it. I have to set the page-size very low. I ran the query interface_info that returns 98 entries (totalCount=98). I set the page-size to 3, which means that I have to iterate 33 times to get all entries. But what happens is that after 10 successful requests the 11th gets a 400. So I tried to start at a different offset than 0. I get the same result setting the offset to 10, page=10, but when trying to request page=20 a 400 is returned. So only 10 requests are successful, independent of the page-size. I also tried to remove the order-by, but it did not make any difference.
So it does not matter what offset I start from (page=X); only 10 requests are successful and the next one gives a 400.
400 means a bad request, but the response does not include any information about what is "bad". Do you have access to any logs on the apic side that could give some more insight into what the problem relates to? You can run this on a smaller cluster, just change line 44 in aci_connection.go to something small, as long as the response requires more than 10 iterations, like:
var pageCount uint64 = 3
@camrossi I think I found it. Tested with curl and it worked as expected, like you also said. The only difference is that in aci-exporter I set the header Content-Type: application/json, and when I ran with curl I did not set that header. Removed the header in aci-exporter and it works! Cannot understand why this problem only happens on call number 11 in the iteration. Still, it would be great if you have access to logs to see what the issue was on the apic side.
You can do a pull on the issue_37 branch and test. Have also added to the logs the number of "pages" that were collected based on the page-size.
INFO[0013] api call fabric class=ethpmPhysIf exec_time=286413 fabric=cam length=119622 method=GET pages=3 requestid=2WWFkMYD5BcK4CWV1B5J4FeA3Rk status=0 uri="https://173.36.219.190/api/class/ethpmPhysIf.json"
@camrossi related to the timeout issue: when the aci-exporter creates the http client the default timeout is 0, which means no timeout. You can define it in the config.yml like:
httpclient:
  insecurehttps: true
  # Default is 15 seconds
  keepalive: 15
  # Zero means no timeout
  timeout: 0
If you run from Prometheus you must also define the Prometheus scrape timeout against the exporter.
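For context, this is how the timeout value maps onto Go's standard http client (a minimal sketch, not the exporter's actual client construction):

package main

import (
	"fmt"
	"net/http"
	"time"
)

// newClient mirrors the behaviour described above: a configured timeout of 0
// gives an http.Client with Timeout 0, which in net/http means no client-side timeout.
func newClient(timeoutSeconds int) *http.Client {
	return &http.Client{Timeout: time.Duration(timeoutSeconds) * time.Second}
}

func main() {
	c := newClient(0)
	fmt.Println(c.Timeout) // 0s -> the client never cancels a request on its own
}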
I am not sure it is the content type… setting it with curl doesn't result in an issue, let me investigate more on my side! Also, can you dump the full HTTP request (headers and body) to the logs? I am not sure how to do it and am currently traveling for work, so I don't have much time to experiment with code :D
And I can confirm that removing the encoding works even if it makes no sense...
Alright, I found the real issue! The APIC-cookie keeps getting re-added on each page, so on page 11 there are 11 APIC-cookie values concatenated. I found this out by making http requests and taking a sniffer trace:
Cookie: APIC-cookie=<DATA>; APIC-cookie=<DATA>; APIC-cookie=<DATA>
etc....
No idea why; I looked at the code but I don't see why the cookiejar would do this, and the APIC is sending Set-Cookie: APIC-cookie= only on the login method.
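For illustration, this is the kind of header handling that could produce such a trace, assuming a header set that is reused across page requests and has the cookie added on every iteration (a hypothetical sketch, not the actual aci-exporter code):

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// A header set reused for every page request.
	headers := http.Header{}

	for page := 1; page <= 3; page++ {
		// Bug pattern: Add appends, so the same APIC-cookie accumulates once
		// per page instead of being set exactly once (or left to the cookiejar).
		headers.Add("Cookie", "APIC-cookie=<DATA>")
		fmt.Printf("page %d -> %d Cookie value(s)\n", page, len(headers.Values("Cookie")))
	}
	// By page 11 the request would carry 11 APIC-cookie values, which matches
	// the 400 rejection seen above.
}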
@camrossi Interesting findings and it helped me to understand the real problem. It had nothing to do with whether the header Content-Type: application/json was set or not. It was my header logic that was buggy.
I have now, hopefully, fixed this and tested with different page-sizes and with the original Content-Type header set. The page-size can be set in the config as:
httpclient:
  pagesize: 100
  ....
Default is 1000
Looking forward to your verification on your monster cluster. Branch is still issue_37
@thenodon success, it works.... now I am not sure if Prometheus will work. The query I am testing consists of pulling all the static port bindings (765327) and that takes 6 minutes. If that is too long, would it make sense to return data to Prometheus on a per-page basis? I did pull 19 pages in those 6 min.
I can't even fault the APIC too much as we are pulling static port bindings on a fabric that is pretty much at the highest multi-dimensional scale you can go.
Small note: I had to increase the page size to 10K or it was going to take too long. I think we would eventually want to support setting this per query; the limit is the size (in bytes), so depending on what query you do, 10K may be too much or you could go even higher.
@camrossi good to hear that it worked, but I think we need to do some thinking for this large fabric use case. A scrape time higher than a minute is bad in the context of Prometheus and I do not see Prometheus managing paging in the scrape logic. That would involve complex discovery and the exporter would have to expose the page number or page range in the api. And how many pages would have to be scraped? I think the way forward is one of these alternatives:
- Make every page fetch parallel. Currently they are done sequentially. The question is, does this burst of api calls become a resource problem for the apic? This should be fairly simple to implement and I think we should try that.
- Do async fetching of metrics and cache the result. That could be implemented in the aci-exporter or as an external proxy cache. I have developed a number of exporters that use this pattern, where the exporter collects data on its own "scrape" interval and caches it, and Prometheus can scrape every minute, but the metrics in Prometheus will be the same during the exporter's scrape/fetch interval. This means lower resolution of the metrics in Prometheus, but that may be okay. This is often the scenario for streaming metrics where the device/platform pushes its metrics every X minutes.
I think we should test the first alternative first if you do not have any objection. If my math is not wrong you did 19 pages in 360 sec, which would mean that each page took approx 19 sec to complete. So with a parallel approach the total time should be in the range of x sec (a single call to get the totalCount using a page size of 1) + 19 sec (the remaining pages in parallel). The question is how it affects the apic.
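A rough sketch of the parallel alternative, assuming totalCount has already been retrieved with a single small request; fetchPage is a hypothetical stand-in for one paged APIC call, not the actual exporter code:

package main

import (
	"fmt"
	"sync"
)

// fetchPage stands in for one paged request, i.e. a call with
// page=<n>&page-size=<size>&order-by=<class>.dn against the APIC.
func fetchPage(page, pageSize int) ([]string, error) {
	return []string{fmt.Sprintf("objects of page %d", page)}, nil
}

// fetchAllParallel fires one goroutine per page and waits for all of them,
// so the total time is roughly the slowest single page instead of the sum.
func fetchAllParallel(totalCount, pageSize int) ([][]string, error) {
	pages := (totalCount + pageSize - 1) / pageSize // ceil(totalCount / pageSize)
	results := make([][]string, pages)
	errs := make([]error, pages)

	var wg sync.WaitGroup
	for p := 0; p < pages; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			results[p], errs[p] = fetchPage(p, pageSize)
		}(p)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	// e.g. 765327 static port bindings with a page size of 10000 -> 77 parallel pages
	pages, _ := fetchAllParallel(765327, 10000)
	fmt.Println(len(pages), "pages fetched in parallel")
}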
@camrossi I have implemented the parallel page fetch. Will be interesting to see how that affects the total processing time on your large cluster. On my small cluster I ran interface_info from the example_config.yml with a pagesize=1:
- sequential - ~24 sec
- parallel - ~3 sec
The fix is in the issue_37 branch
I will give it a try next time I get access to the cluster. Anyway this is not super urgent :D
@camrossi I forgot to mention but you can disable parallel processing in the config:
httpclient:
  # Below are the default values
  pagesize: 1000
  # If false paging will be done sequentially for the query
  parallel_paging: true
It would be nice to try to find the "sweet spot" for the normal ACI customer related to how data is fetched. Or, even better, be able to do some analysis of the cluster on startup and from that set the different configuration parameters for how to process results. I can imagine that the number of pods, spines and leafs defines the size, but maybe also more "virtual" objects like EPGs etc. Ideas?
@camrossi any feedback?
Hi @thenodon, @camrossi, I can test the parallel page fetch with my fabric. Today I am up to date with the master branch. I will merge with issue_37 and give you feedback.
Hi @ahmedaall, not sure if it will merge without conflicts, but give it a try. I hope to be able to continue to focus on this soon, since for a large fabric it makes so much sense. Looking forward to your feedback and test results. Great if you can share the size of the fabric if you do comparison tests.
@thenodon I have 62 switches and 111 EPGs. I confirm that this feature is quite important :). The accuracy of monitoring depends a lot on the response time. I'll keep you informed.
@thenodon Did you get the same output during the docker build:
./aci-api.go:69:26: cannot use *newAciConnection(ctx, fabricConfig) (variable of type AciConnection) as *AciConnection value in struct literal
./aci-connection.go:91:2: jar declared and not used
./aci-connection.go:91:12: undefined: cookiejar
./aci-connection.go:117:21: undefined: responseTime
./aci-connection.go:120:2: undefined: connectionCache
./aci-connection.go:121:9: undefined: connectionCache
./aci-connection.go:127:17: c.tokenProcessing undefined (type *AciConnection has no field or method tokenProcessing)
./aci-connection.go:137:4: c.tokenMutex undefined (type *AciConnection has no field or method tokenMutex)
./aci-connection.go:138:10: c.tokenMutex undefined (type *AciConnection has no field or method tokenMutex)
./aci-connection.go:171:4: c.token undefined (type *AciConnection has no field or method token)
./aci-connection.go:171:4: too many errors
ERROR: process "/bin/sh -c go get -u && go mod tidy && go build -o aci-exporter *.go" did not complete successfully: exit code: 1
@ahmedaall I am not using docker when I develop. What branch are you on? Have you tried to merge master into branch issue_37? There are many things changed between master and branch issue_37. I think it needs more work than just a merge. I will see if I get some time this week. I have a clear idea what needs to change, but it has to be done from the master branch.
@thenodon Yes, I merged master into issue_37 taking issue_37 as the reference. OK, I'm going to roll back to the master branch while the changes are made.
@ahmedaall and @camrossi I have now used the current master and created branch pagination_issue_37 with the parallel page handling that was done in branch issue_37. The README.md is updated, search for Large fabric configuration. Some notes that are important:
- The class used in the query will automatically be added in the final request url as order-by=<class_name>.dn, so you do not need to add it in the configuration of the query. Without the order-by you will not get a consistent response.
- When using parallel paging, the first request has to be done in a "single" way to get the total number of entities that will be returned. That number is then used to calculate the number of requests to do in parallel with the right offsets.
- I detected that if the url query includes rsp-subtree-include=count it will not work in parallel mode, since that is the count of items returned. So if that "string" is in the query a single page request is used. Not sure if there are any more corner cases.
So checkout pagination_issue_37 and give it a spin. Looking forward to your feedback and especially what, hopefully, latency decrease it gives on a larger fabric.
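A minimal sketch of that fallback decision, assuming the raw query parameter string is available when the request is built (the function name is hypothetical, not the actual aci-exporter code):

package main

import (
	"fmt"
	"strings"
)

// useParallelPaging decides whether a query can be paged in parallel.
// Queries containing rsp-subtree-include=count only return a count object,
// so they fall back to a single request.
func useParallelPaging(query string, parallelEnabled bool) bool {
	if !parallelEnabled {
		return false
	}
	return !strings.Contains(query, "rsp-subtree-include=count")
}

func main() {
	fmt.Println(useParallelPaging("?rsp-subtree-include=count", true)) // false -> single request
	fmt.Println(useParallelPaging("?rsp-subtree=children", true))      // true  -> parallel pages
}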
Thanks Anders!
This seems very promising, I will try to give it a test next week!
Cam
Hi @thenodon I managed to successfully merge pagination_issue_37. With the default parameters the scraping time doesn't seem to change. Maybe I missed something. Here is my config:
# Exporter port
port: 9643
# Configuration file name default without postfix
config: config
# The prefix of the metrics
prefix:
fabrics:
  # cisco_sandbox:
  #   username: admin
  #   password: ""
  #   apic:
  #     - https://sandboxapicdc.cisco.com
  MY_FABRIC:
    # Apic username
    username: ACI_USERNAME
    # Apic password.
    password: ACI_PASSWORD
    # # The available apic controllers
    # # The aci-exporter will use the first apic it can successfully login to, starting with the first in the list
    apic:
      - https://LB-FABRIC
# Http client settings used to access apic
# Below is the default values, where 0 is no timeout
httpclient:
  # insecurehttps: true
  # keepalive: 15
  # timeout: 10
  # this is the max and also the default value
  pagesize: 1000
  # enable parallel paging, default is false
  parallel_paging: true
# Http server settings - this is for the web server aci-exporter expose
# Below is the default values, where 0 is no timeout
httpserver:
  # read_timeout: 0
  # write_timeout: 0
# Define the output format should be in openmetrics format - deprecated from future version after 0.4.0, use below metric_format
#openmetrics: true
metric_format:
  # Output in openmetrics format, default false
  openmetrics: false
  # Transform all label keys to lower case format, default false. E.g. oobMgmtAddr will be oobmgmtaddr
  label_key_to_lower_case: true
  # Transform all label keys to snake case format, default false. E.g. oobMgmtAddr will be oob_mgmt_addr
  label_key_to_snake_case: false
@ahmedaall - you should run the branch pagination_issue_37, you do not have to merge. Are you sure you got the latest changes on that branch? What queries do you run?
@thenodon I misspoke. I run pagination_issue_37 with my custom Dockerfile and my custom yaml config.
Here is my main query:
class_queries:
  interface_info:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
      - name: interface_speed_temp
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        type: gauge
        help: The current operational speed of the interface, in bits per second.
        # value_transform:
        #   'unknown': 0
        #   '100M': 100000000
        #   '1G': 1000000000
        #   '10G': 10000000000
        #   '25G': 25000000000
        #   '40G': 40000000000
        #   '100G': 100000000000
        #   '400G': 400000000000
      - name: interface_admin_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.attributes.adminSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current admin state of the interface.
        value_transform:
          'down': 0 ## ~ disabled interfaces
          'up': 1 ## ~ enabled interfaces
      - name: interface_oper_state
        # The field in the json that is used as the metric value, qualified path (gjson) under imdata
        value_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSt
        # Type
        type: gauge
        # Help text without prefix of metrics name
        help: The current operational state of the interface. (0=unknown, 1=down, 2=up, 3=link-up)
        # A string to float64 transform table of the value
        value_transform:
          "down": 0 ## ~ disabled interfaces
          "up": 1 ## ~ enabled interfaces
    # The labels to extract as regex
    labels:
      # The field in the json used to parse the labels from
      - property_name: l1PhysIf.attributes.dn
        # The regex where the string enclosed in the P<xyz> is the label name
        regex: "^topology/pod-(?P<pod_id>[1-9][0-9]*)/node-(?P<node_id>[1-9][0-9]*)/sys/phys-\\[(?P<interface_name>[^\\]]+)\\]"
      # Add the descr attribute to the labels
      - property_name: l1PhysIf.attributes.descr
        regex: "^(?P<interface_description>.*)"
      - property_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        regex: "^(?P<speed_temp>.*)"
  interface_info_more:
    class_name: l1PhysIf
    query_parameter: "?rsp-subtree=children&rsp-subtree-include=stats&rsp-subtree-class=ethpmPhysIf,eqptIngrBytes5min,eqptEgrBytes5min,eqptIngrDropPkts5min,eqptEgrDropPkts5min&query-target-filter=and(ne( l1PhysIf.adminSt, \"down\"))"
    metrics:
      - name: interface_rx_unicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes received on the interface since it was integrated into the fabric.
        unit: bytes
      - name: interface_rx_multicast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes received on the interface since it was integrated into the fabric.
      - name: interface_rx_broadcast
        value_name: l1PhysIf.children.[eqptIngrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes received on the interface since it was integrated into the fabric.
      - name: interface_rx_buffer_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while receiving since it was integrated into the
          fabric.
      - name: interface_rx_error_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while receiving since it was integrated into the
          fabric.
      - name: interface_rx_forwarding_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.forwardingCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          forwarding issue while receiving since it was integrated into the
          fabric.
      - name: interface_rx_loadbal_dropped
        value_name: l1PhysIf.children.[eqptIngrDropPkts5min].attributes.lbCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          load balancing issue while receiving since it was integrated into
          the fabric.
      - name: interface_tx_unicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.unicastCum
        type: counter
        help: The number of unicast bytes transmitted on the interface since it was integrated into the fabric.
        unit: bytes
      - name: interface_tx_multicast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.multicastCum
        type: counter
        unit: bytes
        help: The number of multicast bytes transmitted on the interface since it was integrated into the fabric.
      - name: interface_tx_broadcast
        value_name: l1PhysIf.children.[eqptEgrBytes5min].attributes.floodCum
        type: counter
        unit: bytes
        help: The number of broadcast bytes transmitted on the interface since it was integrated into the fabric.
      - name: interface_tx_queue_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.afdWredCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface during queue
          management while transmitting since it was integrated into the
          fabric.
      - name: interface_tx_buffer_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.bufferCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          buffer overrun while transmitting since it was integrated into the
          fabric.
      - name: interface_tx_error_dropped
        value_name: l1PhysIf.children.[eqptEgrDropPkts5min].attributes.errorCum
        type: counter
        unit: pkts
        help: The number of packets dropped by the interface due to a
          packet error while transmitting since it was integrated into the
          fabric.
    # The labels to extract as regex
    labels:
      # The field in the json used to parse the labels from
      - property_name: l1PhysIf.attributes.dn
        # The regex where the string enclosed in the P<xyz> is the label name
        regex: "^topology/pod-(?P<pod_id>[1-9][0-9]*)/node-(?P<node_id>[1-9][0-9]*)/sys/phys-\\[(?P<interface_name>[^\\]]+)\\]"
      # Add the descr attribute to the labels
      - property_name: l1PhysIf.attributes.descr
        regex: "^(?P<interface_description>.*)"
      - property_name: l1PhysIf.children.[ethpmPhysIf].attributes.operSpeed
        regex: "^(?P<speed_temp>.*)"
Hi @thenodon. Do you have any clue about some misconfiguration I could have made for parallel pagination? For information, the syntax that you gave me in https://github.com/opsdis/aci-exporter/issues/41 works pretty well to reduce scraping time.
@ahmedaall sorry, but I have not had the time yet. I hope that @camrossi will also have a chance to test on his mega fabric. I do not see that parallel paging will have a major impact on a normal-sized fabric. If the payload fits in the max size, paging will not improve the latency. In the end the biggest contributor to the latency is the time spent in the apic responding to the api call. The suggestion I made in #41 will not directly improve the latency, but instead each query will finish on its own rather than all of them waiting for the query with the largest latency.