Telemetry

Telemetry Receiver

Python telemetry receiver for collecting Cisco NX-OS and IOS-XR telemetry over UDP, TCP, and gRPC

Sample Python code that demonstrates how to collect telemetry messages via UDP/gRPC and how to parse them.
The goal is to make it easy and fun to build your own telemetry receiver and to handle telemetry messages flexibly.
 
Test bed: N9K and XRv9K
    N9K   x.x.x.x    version: NX-OS 9.2.2    JSON / GPB-kv (self-describing)
    XRv9K x.x.x.x    version: XRv9K 6.4.1    JSON / GPB compact / GPB-kv (self-describing)
 
Telemetry receiver ports:
    UDP: port 57500
    gRPC dial-in: port 57400
    gRPC dial-out: port 50051

The gRPC receiver works in both dial-in and dial-out mode and has been tested with Cisco IOS XR Version 6.4.1.


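NOTE: the maximum UDP protobuf length is 65535 bytes including headers; after the 28 header bytes (20-byte IP header plus 8-byte UDP header), the real content is at most 65507 bytes.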
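
The 65507-byte limit in the note above is simple arithmetic on the header sizes. A minimal receiver sketch, assuming IPv4 without options and the UDP port listed above; `open_udp_receiver` and `receive_datagram` are illustrative names, not functions from this repository:

```python
import socket

MAX_IP_PACKET = 65535   # largest IPv4 packet (16-bit total length field)
IPV4_HEADER = 20        # IPv4 header without options (assumption)
UDP_HEADER = 8
MAX_UDP_PAYLOAD = MAX_IP_PACKET - IPV4_HEADER - UDP_HEADER


def open_udp_receiver(host="0.0.0.0", port=57500):
    """Bind a UDP socket for incoming telemetry datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_datagram(sock):
    """Read one datagram; the buffer is large enough for any UDP payload."""
    data, peer = sock.recvfrom(MAX_UDP_PAYLOAD)
    return data, peer
```

Sizing the receive buffer to the maximum payload avoids silently truncating a large telemetry message.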

The internal MDT header is 12 bytes on IOS-XR and 6 bytes on NX-OS.


 IOS-XR internal header:
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |          MSG TYPE             |           ENCODING_TYPE       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |         MSG_VERSION           |           FLAGS               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                           MSG_LENGTH                          |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ..
 ~                                                               ~
 ~                      PAYLOAD (MSG_LENGTH bytes)               ~
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
 MSG TYPE (2 bytes)  = 1 (for MDT)
 ENCODING_TYPE (2 bytes) = 1 (GPB), 2 (JSON)
 MSG_VERSION (2 bytes) = 1
 FLAGS (2 bytes) = 0
 MSG_LENGTH (4 bytes)
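
The header layout above can be unpacked with the standard `struct` module. This is a minimal sketch, not the repository's parser: big-endian (network) byte order is an assumption, and `parse_xr_header` is a hypothetical helper name.

```python
import struct

# Four 16-bit fields followed by one 32-bit length: 12 bytes in total.
# Big-endian (network byte order) is an assumption of this sketch.
XR_HEADER = struct.Struct(">HHHHI")
ENCODINGS = {1: "GPB", 2: "JSON"}


def parse_xr_header(datagram):
    """Split an IOS-XR datagram into its header fields and payload."""
    if len(datagram) < XR_HEADER.size:
        raise ValueError("datagram shorter than the 12-byte header")
    msg_type, encoding, version, flags, length = XR_HEADER.unpack_from(datagram)
    payload = datagram[XR_HEADER.size:XR_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("payload is shorter than MSG_LENGTH")
    header = {
        "msg_type": msg_type,
        "encoding": ENCODINGS.get(encoding, "unknown"),
        "version": version,
        "flags": flags,
        "length": length,
    }
    return header, payload
```

The `encoding` value tells the receiver whether to hand the payload to a protobuf decoder or to a JSON parser.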

Telemetry proto file for both GPB and GPB-kv

Use the following command to generate the Python protobuf module:

protoc --python_out=. ./telemetry.proto 

For GPB compact, the demo message is system uptime; use the following proto file generated by IOS-XR:
protoc --python_out=. ./uptime.proto 

To generate the gRPC service stubs and the service message classes (pb2), use the following command:

python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. cisco_grpc_dialout.proto

Collected and parsed GPB / GPB-kv telemetry results are shown below. The code also outputs the result as JSON, which is not shown here (see the code).

Telemetry GPB message from NX OS
Node :DUT02
IP Address (source port) :('10.75.58.198', 41532)
Encodig Path :sys/procsys/sysmem
GPB kv format
dn:sys/procsys/sysmem
free:26164608
memstatus:OK
name:sysmem
total:32828228
used:6663620
==================================================================
Telemetry GPB message from NX OS
Node :DUT02
IP Address (source port) :('10.75.58.198', 41532)
Encodig Path :show process cpu
GPB kv format
pid:1
runtime:16411
invoked:306652
usecs:53
onesec:0.00
process:init
=================================================================
Telemetry GPB  message from IOX
Node :XTC
IP Address (source port) :('10.75.58.60', 32898)
Encodig Path :Cisco-IOS-XR-shellutil-oper:system-time/uptime
GPB compact format
Content decoded here :
Host Name :XTC
System Up Time :1660 Seconds
=================================================================
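
The `name:value` lines in the GPB-kv output above come from walking a tree of self-describing fields. A minimal sketch of such a walk, using a plain dataclass as a stand-in for the generated protobuf class: the `Field` class and `flatten` function are hypothetical, and collapsing the typed value into a single `value` attribute is an assumption made to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Stand-in for one GPB-kv node: a name, an optional value, child fields."""
    name: str
    value: Optional[object] = None
    fields: List["Field"] = field(default_factory=list)


def flatten(node):
    """Yield 'name:value' for every leaf below node, depth first."""
    if not node.fields:
        if node.value is not None:
            yield f"{node.name}:{node.value}"
        return
    for child in node.fields:
        yield from flatten(child)


# Hand-built tree using values from the sysmem output above.
sysmem = Field("content", fields=[
    Field("free", 26164608),
    Field("memstatus", "OK"),
    Field("total", 32828228),
])
for line in flatten(sysmem):
    print(line)
```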

Collected and parsed JSON telemetry results are shown below.

=================================================================
('10.75.58.60', 33208)
This is a System Uptime telemetry messgae
Node =XTC
Encoding Path =Cisco-IOS-XR-shellutil-oper:system-time/uptime
Collection ID =5233
Host Name =XTC
Uptime =58169
Recevice Count =1
Timestamp from Start =7.964596271514893
=================================================================
('10.75.58.60', 33208)
This is a System Uptime telemetry messgae
Node =XTC
Encoding Path =Cisco-IOS-XR-shellutil-oper:system-time/uptime
Collection ID =5235
Host Name =XTC
Uptime =58191
Recevice Count =2
Timestamp from Start =29.962813138961792
=================================================================
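
A summary like the one above can be produced from a JSON-encoded message with the standard `json` module. This is a hedged sketch rather than the repository's code: the key names `node_id_str`, `encoding_path`, `collection_id`, `data_json`, `content`, `host-name`, and `uptime` are assumptions about the message layout, and the sample is hand-built from the values printed above, not a captured message.

```python
import json


def summarize_uptime(raw):
    """Return printable 'label =value' lines for one JSON telemetry message."""
    msg = json.loads(raw)
    lines = [
        "Node =%s" % msg.get("node_id_str", ""),
        "Encoding Path =%s" % msg.get("encoding_path", ""),
        "Collection ID =%s" % msg.get("collection_id", ""),
    ]
    for row in msg.get("data_json", []):
        content = row.get("content", {})
        lines.append("Host Name =%s" % content.get("host-name", ""))
        lines.append("Uptime =%s" % content.get("uptime", ""))
    return lines


# Hand-built sample (assumed layout), using values from the output above.
sample = json.dumps({
    "node_id_str": "XTC",
    "encoding_path": "Cisco-IOS-XR-shellutil-oper:system-time/uptime",
    "collection_id": 5233,
    "data_json": [{"content": {"host-name": "XTC", "uptime": 58169}}],
})
print("\n".join(summarize_uptime(sample)))
```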

IOS-XR Generate Telemetry GPB compact proto files

RP/0/RP0/CPU0:XTC#telemetry generate gpb-encoding path RootOper.QOS.Node.PolicyMap.Interface.Input.Statistics file disk0:/qos.proto
Tue Jan 22 06:54:59.603 UTC
Created /disk0:/qos.proto

* NOTE
To generate a telemetry proto file, the path must be a YANG-to-XML schema path.
How do you find the YANG-to-XML schema path?

RP/0/RP0/CPU0:XTC#run
Tue Jan 22 06:58:56.512 UTC
[xr-vm_node0_RP0_CPU0:~]$cd /pkg/telemetry/mdt/protogen
[xr-vm_node0_RP0_CPU0:/pkg/telemetry/mdt/protogen]$ls
yang_to_schema.txt
[xr-vm_node0_RP0_CPU0:/pkg/telemetry/mdt/protogen]$

yang_to_schema.txt maps every YANG path to its XML schema path.

Cisco MDS 9000 Fabric Telemetry for SAN Analytics

The MDS SAN switch 32G line card pushes telemetry streams via gRPC with GPB/GPB-kv encoding.
The encoding path should be the NX-OS analytics:your_query_name.
Only dial-out is supported.
See the sample code telemetry_grpc_dial_out_no_tls.py.

So far, NX-OS 8.4.1 supports only gRPC with GPB/GPB-kv encoding.
For GPB-kv encoding, use the fabrc_telemetry.proto file.

MDS 9710 sample configuration:
 
telemetry
    sensor-group 1
        path analytics:test_query
        path show_stats_fc2/1
        path show_stats_fc2/2
    sensor-group 2
        path analytics:dcnminitITL
    destination-group 1
        ip address 10.79.98.77 port 50051 protocol gRPC encoding GPB-compact
    destination-group 2
        ip address 10.124.2.116 port 57500 protocol gRPC encoding GPB-compact
    subscription 1
        snsr-grp 1 sample-interval 30000
        dst-grp 1
    subscription 2
        snsr-grp 2 sample-interval 30000
        dst-grp 2
    
sw-core1-9710# sh analytics query all
Total queries:2
============================
Query Name      :test_query
Query String    :select all from fc-scsi.port
Query Type      :periodic, interval 30

Query Name      :dcnminitITL
Query String    :select port, vsan, app_id, initiator_id, target_id, lun, active_io_read_count, active_io_write_count, total_read_io_count, total_write
_io_count, total_time_metric_based_read_io_count, total_time_metric_based_write_io_count,total_read_io_time, total_write_io_time, total_read_io_initiat
ion_time, total_write_io_initiation_time,total_read_io_bytes, total_write_io_bytes, total_time_metric_based_read_io_bytes, total_time_metric_based_writ
e_io_bytes, read_io_rate, write_io_rate, read_io_bandwidth, write_io_bandwidth,read_io_size_min, read_io_size_max, write_io_size_min, write_io_size_max
,read_io_completion_time_min, read_io_completion_time_max, write_io_completion_time_min, write_io_completion_time_max,read_io_initiation_time_max, writ
e_io_initiation_time_max, read_io_aborts, write_io_aborts,read_io_failures, write_io_failures, read_io_timeouts, write_io_timeouts from fc-scsi.scsi_in
itiator_itl_flow
Query Type      :periodic, interval 30
Query Options   :differential

For a gRPC decode sample, see the san_analytics subfolder.

For details on the SAN Analytics metrics, see the san_analytics subfolder.

Telemetry Machine Learning Engine

Sample code for a telemetry machine learning engine:
LSTM prediction and multivariate Gaussian distribution anomaly detection.
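
To illustrate the multivariate Gaussian idea, here is a small pure-Python sketch; it is not the repository's engine and does not cover the LSTM part. It fits a mean and covariance to normal samples and flags a point when its squared Mahalanobis distance exceeds a threshold, which is equivalent to thresholding the Gaussian density. The function names, the threshold of 9.0, and the sample data are all hypothetical.

```python
def fit_gaussian(samples):
    """Return the mean vector and covariance matrix of the samples."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in samples) / n
            for j in range(d)] for i in range(d)]
    return mean, cov


def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with pivoting."""
    d = len(vector)
    aug = [list(matrix[i]) + [vector[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, d))
        x[i] = (aug[i][d] - tail) / aug[i][i]
    return x


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of point from the fitted Gaussian."""
    diff = [p - m for p, m in zip(point, mean)]
    return sum(a * b for a, b in zip(diff, solve(cov, diff)))


def is_anomaly(point, mean, cov, threshold=9.0):
    """Flag points that lie far from the fitted distribution."""
    return mahalanobis_sq(point, mean, cov) > threshold


# Hypothetical training data: two correlated metrics sampled over time.
normal = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2),
          (5.0, 9.8), (2.5, 5.2), (3.5, 6.8)]
mean, cov = fit_gaussian(normal)
```

Because the full covariance matrix is used, a point such as `(3.0, 1.0)` is flagged even though each value on its own falls inside the observed range; it breaks the correlation between the two metrics.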

Unsupervised learning with affinity propagation clustering, for root cause detection and anomaly detection.

Telemetry + prometheus lab log

Telemetry + InfluxDB lab log

Telemetry + Kafka lab log