zbx_redis_template
Redis 3.0 and Zabbix monitoring
We're using the recently released Redis 3.0 with clustering enabled.
I have Zabbix monitoring configured via cron, pushing data to our Zabbix server.
Every so often, the Zabbix Python script fails on a key when it is run against the localhost Redis node:
# /etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py localhost -p 6379
Traceback (most recent call last):
  File "/etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py", line 145, in <module>
    main()
  File "/etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py", line 137, in main
    if client.type(key) == 'list':
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 1112, in type
    return self.execute_command('TYPE', name)
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 565, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 577, in parse_response
    response = connection.read_response()
  File "/usr/lib/python2.6/site-packages/redis/connection.py", line 574, in read_response
    raise response
redis.exceptions.ResponseError: MOVED 8833 10.139.103.247:6379
This happens on a cluster slave. The IP in the MOVED error is the cluster master's.
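For reference, a MOVED reply has the form `MOVED <slot> <host>:<port>`. Below is a minimal sketch of pulling those fields out of the error text, assuming the message looks exactly like the one in the traceback above. `parse_moved` is a hypothetical helper for illustration, not part of redis-py.

```python
def parse_moved(message):
    """Split 'MOVED <slot> <host>:<port>' into (slot, host, port).

    Returns None when the message is not a MOVED redirect.
    Hypothetical helper; not a redis-py function.
    """
    parts = message.split()
    if len(parts) != 3 or parts[0] != 'MOVED':
        return None
    # rpartition keeps any colons in the host part intact
    host, _, port = parts[2].rpartition(':')
    return int(parts[1]), host, int(port)


if __name__ == '__main__':
    print(parse_moved('MOVED 8833 10.139.103.247:6379'))
```

Here the slot is 8833 and the node that owns it is 10.139.103.247:6379, the master.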
To fix this, I had to perform a FLUSHDB on the cluster master. This is less than ideal.
I'll look through the code to find what keys this Zabbix python script is using and see if I can narrow it down.
After some additional use, I found that the error occurs when a cluster slave fails over and becomes the new master. The error then occurs on both slaves. Even after performing a 'cluster failover' back to the original master, the two slaves continue to produce this error while the master is fine.
We are no longer able to monitor the clustered slaves as no new data is able to make it back to the Zabbix server.
OK I may have found the culprit.
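The client.keys('*') loop appears to be the issue. In a clustered (sharded) setup, the slot a key hashes to can be served by another node, so the follow-up TYPE call on that key gets a MOVED redirect instead of an answer.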
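To illustrate where the slot number in the MOVED error comes from: Redis Cluster maps each key to one of 16384 slots using CRC16 (the XMODEM variant) of the key, or of the `{...}` hash tag if the key has one. This is a rough sketch of that mapping. The function names are made up for this example, and it is not what the Zabbix script does.

```python
def crc16_xmodem(data):
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM)."""
    crc = 0
    for byte in bytearray(data):
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key):
    """Return the cluster slot (0..16383) for a key given as bytes."""
    start = key.find(b'{')
    if start != -1:
        end = key.find(b'}', start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) % 16384


if __name__ == '__main__':
    print(key_hash_slot(b'foo'))
```

Each slot is owned by one master, so a node that is not serving a key's slot answers commands on that key with MOVED.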
I commented out the following:
134 #keys = client.keys('*')
135 #llensum = 0
136 #for key in keys:
137 #    if client.type(key) == 'list':
138 #        llensum += client.llen(key)
139 #a.append(Metric(redis_hostname, 'redis[llenall]', llensum))
With that, I'm no longer getting these errors, and data is making its way back to the Zabbix server.
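If the llenall metric is still wanted, a possible alternative to commenting the block out is to skip keys that answer with a redirect. This is only a sketch. It assumes the error text starts with `MOVED` or `ASK`, as in the traceback above (other redis-py versions may format it differently), and `sum_list_lengths` is a hypothetical name, not something in the script.

```python
try:
    from redis.exceptions import ResponseError
except ImportError:
    # Stand-in so the sketch can be read and run without redis-py installed
    class ResponseError(Exception):
        pass


def sum_list_lengths(client):
    """Sum LLEN over all list keys, skipping keys this node redirects.

    Returns (total_length, skipped_key_count).
    """
    total = 0
    skipped = 0
    for key in client.keys('*'):
        try:
            if client.type(key) in ('list', b'list'):
                total += client.llen(key)
        except ResponseError as exc:
            # Assumption: redirect errors carry the raw reply text
            if str(exc).startswith(('MOVED', 'ASK')):
                skipped += 1
                continue
            raise
    return total, skipped
```

The resulting sum only covers keys the local node will answer for, so on a slave it may well be zero, but the script should then get as far as sending the rest of its metrics.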