Open
pmawsonau
opened this issue 3 years ago
•
4 comments
We are running snmpcollector 0.8.0 (yes we need to upgrade) to poll a number of devices. Our network is made up of Juniper and Cisco.
We are having issues in particular with Cisco NCS. Every 6 hours (which lines up with the full walk) it fails to poll for up to 20minutes. The logs show this
time="2021-05-04 01:25:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:25:00" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:25:00" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:25:00" level=info msg="STATS SNMP GET: snmp polling took [0.911181 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:25:00" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:25:00" level=info msg="STATS INFLUX: influx send took [0.000008 seconds]"
time="2021-05-04 01:30:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:30:00" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:30:00" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:30:00" level=info msg="STATS SNMP GET: snmp polling took [0.914380 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:30:00" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:30:00" level=info msg="STATS INFLUX: influx send took [0.000003 seconds]"
time="2021-05-04 01:35:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:35:01" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:35:01" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:35:01" level=info msg="STATS SNMP GET: snmp polling took [1.010677 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:35:01" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:35:01" level=info msg="STATS INFLUX: influx send took [0.000002 seconds]"
time="2021-05-04 01:40:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:40:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.6) get error: Request timeout (after 1 retries)
time="2021-05-04 01:41:00" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.10) get error: Request timeout (after 1 retries)
time="2021-05-04 01:41:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.18) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:00" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.15) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.2.2.1.8) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:30" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:42:30" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:HundredGigE0/0/0/19]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:238289135 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.4169110
17 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[ifName:HundredGigE0/0/0/19 hostname:snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue:<NOMON> LINK | SNIP | et-0/0/0 | 0 CurValue: LastValue: CurTime:2021-05-
04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[DC:SNIP STATE:NT ifName:HundredGigE0/0/0/19 snip AS:9669]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:100000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 La
stTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT ifName:HundredGigE0/0/0/19 snip ]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:1 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTi
me:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT ifName:HundredGigE0/0/0/19 snip ]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:54151875 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017
LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime
:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTim
e:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue: CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-
01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[ STATE:NT ifName:TenGigE0/0/0/41/3 snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:10000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastT
ime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:2 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime
:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[DC:SNIP ifName:HundredGigE0/0/0/40 STATE:NT snip AS:9669]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTi
me:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT snip ifName:HundredGigE0/0/0/40]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastT
ime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[ ifName:HundredGigE0/0/0/40 STATE:NT snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue: CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:000
1-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[ ifName:HundredGigE0/0/0/40 STATE:NT snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:100000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 La
stTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[snip ifName:HundredGigE0/0/0/40 STATE:NT]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:2 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTi
me:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT snip ifName:HundredGigE0/0/0/45]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTi
<SNIP>
We have tweaked get bulk and fiddled with Max repetitions between 30 and 100, and even disabled it with no changes.
Hi,
Apologies for delay.
This issue is across all our collectors (we have a number), but for this example I will look at syd01 only
Total Devices: 137
NCSs included in total: 15
Device Setting Timeout: 60
Device Setting Retries: 2
Polling Settings Freq: 300
Polling Settings UpdateFitFreq: 60
According to your configuration, the measurement index (mesurement walks + filters) will be each 5 hours, not 6
Anyway, we do not discard any SNMPCollector issue, so, please:
Update to 0.11 version (we have updated the core SNMP lib and other improvements)
Try to create a new instance with only one Cisco NCS and set the UpdateFiltFreq to lower value (i.e -6) and check if the timeout period fits into the "full walk" (i.e - every 30 minutes)