Calamari see hosts but says no cluster
Hi there,
I'm having a lot of trouble setting up Calamari on a two-node Ceph cluster (this is for lab purposes). I have two physical servers running CentOS 7 that form the Ceph cluster (node 1 is admin, mon + OSD, and node 2 is just OSD).
On top of this I'm trying to add Calamari, so I created a CentOS 7 virtual machine on my laptop. After installing Calamari, I connect the hosts manually and the keys are accepted by the master, but the graphical interface keeps showing me this, time and time again:
Can anyone help me with this? I have already tried initializing, restarting cthulhu, restarting the server, etc., and I'm now struggling to go any further in debugging.
Thanks for any help, P.Chevallier
On your Calamari server, run this command and check whether it reports any errors: salt '*' ceph.get_heartbeats
I have the same problem. I executed the command and got this
an_dev-cp1.aeronet.dev:
The minion function caused an exception: Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/salt/minion.py", line 1020, in _thread_return
return_data = func(*args, **kwargs)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 467, in get_heartbeats
service_data = service_status(filename)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 526, in service_status
fsid = json.loads(admin_socket(socket_path, ['status'], 'json'))['cluster_fsid']
KeyError: 'cluster_fsid'
I installed Salt 2014.7 and Calamari 1.3.1 on Ubuntu 16.04.
Hello,
Any updates?
Thanks, -Ali
First, run this command on your Calamari server, and then paste the result here.
I have the same problem on CentOS 7. How did you solve the error? I only found a brief reply in these links:
Update:
Please try the following change: in https://github.com/ceph/calamari/blob/1.3/salt/srv/salt/_modules/ceph.py, at L594, change AdminSocketError to Exception. Your local copy of the file may be at /opt/calamari/salt/salt/_modules/ceph.py (L594).
Then run salt "*" saltutil.sync_all, or restart the salt-minion service on every Ceph node.
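Why widening the except clause helps: the traceback ends in KeyError, and a clause that names only AdminSocketError lets a KeyError propagate. The sketch below reproduces this in isolation. AdminSocketError and admin_socket here are stand-in stubs, the reply content is made up, and read_fsid is a hypothetical helper, so this is an illustration under those assumptions rather than Calamari's actual code.

```python
import json


class AdminSocketError(Exception):
    """Stand-in for the exception class used in Calamari's ceph.py."""


def admin_socket(socket_path, command, fmt):
    # Hypothetical stub: a 'status' reply that has no 'cluster_fsid' key.
    return json.dumps({"state": "active"})


def read_fsid(socket_path, handled):
    """Return the fsid, or None if one of the `handled` exceptions is raised."""
    try:
        reply = json.loads(admin_socket(socket_path, ['status'], 'json'))
        return reply['cluster_fsid']
    except handled:
        # The real module falls back to other ways of finding the fsid here.
        return None


# Catching both exception types lets the fallback path run.
fsid = read_fsid("example.asok", (AdminSocketError, KeyError))

# Catching only AdminSocketError lets the KeyError escape, as in the traceback.
try:
    read_fsid("example.asok", AdminSocketError)
    key_error_escaped = False
except KeyError:
    key_error_escaped = True
```

Catching the tuple (AdminSocketError, KeyError) is narrower than catching Exception, so unrelated failures still surface.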
I'm very grateful for your reply; after modifying the code, I got one step further. I will record my own process below to help anyone who runs into this problem:
- I ran salt '*' ceph.get_heartbeats and got this:
node2:
The minion function caused an exception: Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1200, in _thread_return
return_data = func(*args, **kwargs)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 534, in get_heartbeats
service_data = service_status(filename)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 593, in service_status
fsid = json.loads(admin_socket(socket_path, ['status'], 'json'))['cluster_fsid']
KeyError: 'cluster_fsid'
…… # The output is the same:
The minion function caused an exception: Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1200, in _thread_return
return_data = func(*args, **kwargs)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 534, in get_heartbeats
service_data = service_status(filename)
File "/var/cache/salt/minion/extmods/modules/ceph.py", line 593, in service_status
fsid = json.loads(admin_socket(socket_path, ['status'], 'json'))['cluster_fsid']
KeyError: 'cluster_fsid'
- Change AdminSocketError to (AdminSocketError, KeyError):
    try:
        fsid = json.loads(admin_socket(socket_path, ['status'], 'json'))['cluster_fsid']
    except (AdminSocketError, KeyError):  # catching Exception also works
        # older osd/mds daemons don't support 'status'; try our best
        pass  # placeholder: leave the existing body of the except block unchanged
Hint: the file may be at /opt/calamari/salt/salt/_modules/ceph.py on the admin node and at /var/cache/salt/minion/extmods/modules/ceph.py on the other nodes.
- I ran salt "*" saltutil.sync_all on the admin node and got this:
node3:
----------
beacons:
grains:
modules:
output:
renderers:
returners:
sdb:
states:
utils:
……
node2:
----------
beacons:
grains:
modules:
output:
renderers:
returners:
sdb:
states:
utils:
- Then I ran salt '*' ceph.get_heartbeats again and got this:
node2:
|_
----------
boot_time:
1573005001
ceph_version:
2:13.2.6-0.el7
services:
----------
ceph-mgr.node2:
----------
cluster:
ceph
fsid:
47071b01-394e-4a62-bb2d-cfe3c19637f7
id:
node2
status:
None
type:
mgr
version:
13.2.6
ceph-osd.0:
----------
cluster:
ceph
fsid:
47071b01-394e-4a62-bb2d-cfe3c19637f7
id:
0
status:
None
type:
osd
version:
13.2.6
|_
----------
……
- I visited the home page, and it barely worked:
I will keep tracking the issue, and if I make progress later I will leave a record here. 😊
I found some more issues here:
tailf /var/log/calamari/calamari.log
I got this:
2019-11-05 21:02:19,605 - metric_access - django.request No graphite data for ceph.cluster.47071b01-394e-4a62-bb2d-cfe3c19637f7.df.total_used_bytes
2019-11-05 21:02:19,606 - metric_access - django.request No graphite data for ceph.cluster.47071b01-394e-4a62-bb2d-cfe3c19637f7.df.total_used
2019-11-05 21:02:19,606 - metric_access - django.request No graphite data for ceph.cluster.47071b01-394e-4a62-bb2d-cfe3c19637f7.df.total_space
2019-11-05 21:02:19,607 - metric_access - django.request No graphite data for ceph.cluster.47071b01-394e-4a62-bb2d-cfe3c19637f7.df.total_avail
2019-11-05 21:02:19,608 - ERROR - django.request Internal Server Error: /api/v1/cluster/47071b01-394e-4a62-bb2d-cfe3c19637f7/space
Traceback (most recent call last):
File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/handlers/base.py", line 115, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/viewsets.py", line 78, in view
return self.dispatch(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/rpc_view.py", line 94, in dispatch
self.client.close()
File "/opt/calamari/venv/lib/python2.7/site-packages/zerorpc/core.py", line 293, in close
SocketBase.close(self)
File "/opt/calamari/venv/lib/python2.7/site-packages/zerorpc/socket.py", line 37, in close
self._events.close()
File "/opt/calamari/venv/lib/python2.7/site-packages/zerorpc/events.py", line 198, in close
self._send.close()
File "/opt/calamari/venv/lib/python2.7/site-packages/zerorpc/events.py", line 50, in close
self._send_task.kill()
File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/greenlet.py", line 235, in kill
waiter.get()
File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/hub.py", line 575, in get
return self.hub.switch()
File "/opt/calamari/venv/lib/python2.7/site-packages/gevent/hub.py", line 338, in switch
return greenlet.switch(self)
LostRemote: Lost remote after 10s heartbeat
------
2019-11-05 23:59:43,586 - ERROR - django.request Internal Server Error: /api/v1/cluster/47071b01-394e-4a62-bb2d-cfe3c19637f7/health_counters
Traceback (most recent call last):
File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/handlers/base.py", line 115, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/viewsets.py", line 78, in view
return self.dispatch(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/rpc_view.py", line 91, in dispatch
return super(RPCViewSet, self).dispatch(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
return view_func(*args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/views.py", line 399, in dispatch
response = self.handle_exception(exc)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/rpc_view.py", line 108, in handle_exception
return super(RPCViewSet, self).handle_exception(exc)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/views.py", line 396, in dispatch
response = handler(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py", line 315, in get
counters = self.generate(osd_data, mds_data, mon_status, pg_summary)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py", line 167, in generate
'mds': cls._calculate_mds_counters(mds_map),
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py", line 295, in _calculate_mds_counters
up = len(mds_map['up'])
TypeError: 'NoneType' object has no attribute '__getitem__'
--------
2019-11-06 00:00:53,567 - ERROR - django.request Internal Server Error: /api/v1/cluster/47071b01-394e-4a62-bb2d-cfe3c19637f7/osd
Traceback (most recent call last):
File "/opt/calamari/venv/lib/python2.7/site-packages/django/core/handlers/base.py", line 115, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/viewsets.py", line 78, in view
return self.dispatch(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/rpc_view.py", line 91, in dispatch
return super(RPCViewSet, self).dispatch(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
return view_func(*args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/views.py", line 399, in dispatch
response = self.handle_exception(exc)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/rpc_view.py", line 108, in handle_exception
return super(RPCViewSet, self).handle_exception(exc)
File "/opt/calamari/venv/lib/python2.7/site-packages/rest_framework/views.py", line 396, in dispatch
response = handler(request, *args, **kwargs)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py", line 417, in get
osds, osds_by_pg_state = self.generate(pg_summary, osd_map, server_info, servers)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py", line 365, in generate
for pool_id, osds in osd_map.osds_by_pool.items():
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_common-0.1-py2.7.egg/calamari_common/util.py", line 8, in wrapper
rv = function(*args)
File "/opt/calamari/venv/lib/python2.7/site-packages/calamari_common-0.1-py2.7.egg/calamari_common/types.py", line 206, in osds_by_pool
for rule in [r for r in self.data['crush']['rules'] if r['ruleset'] == pool['crush_ruleset']]:
KeyError: 'crush_ruleset'
2019-11-06 00:00:54,566 - metric_access - django.request No graphite data for ceph.cluster.47071b01-394e-4a62-bb2d-cfe3c19637f7.pool.1.num_objects
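The log above mixes "No graphite data" messages with several distinct server errors. A small helper along the following lines can list each failing API path together with the last line of its traceback. It is a sketch only: failing_endpoints and both regular expressions are hypothetical, written against the log format shown in this thread, and are not part of Calamari.

```python
import re

# A log record starts with a timestamp such as "2019-11-05 21:02:19,608 - ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ - ")
# The failing URL path follows "Internal Server Error: ".
ERROR_PATH = re.compile(r"Internal Server Error: (\S+)")


def failing_endpoints(log_lines):
    """Map each failing API path to the final line of its traceback."""
    failures = {}
    path = None
    for raw in log_lines:
        line = raw.strip()
        if not line or set(line) == {"-"}:
            continue  # skip blank lines and '------' separators
        if TIMESTAMP.match(line):
            # A new record: remember its path only if it is a server error.
            found = ERROR_PATH.search(line)
            path = found.group(1) if found else None
        elif path is not None:
            # Traceback continuation: keep overwriting so the last line wins.
            failures[path] = line
    return failures
```

Typical use would be to pass it an open file object, for example failing_endpoints(open('/var/log/calamari/calamari.log')).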
Then I changed the code here: vi /opt/calamari/venv/lib/python2.7/site-packages/calamari_rest_api-0.1-py2.7.egg/calamari_rest/views/v1.py +295
    @classmethod
    def _calculate_mds_counters(cls, mds_map):
        log.debug("_calculate_mds_counters %s" % mds_map)
        if mds_map is not None:
            up = len(mds_map['up'])
            inn = len(mds_map['in'])
            total = len(mds_map['info'])
        else:  # informal workaround: report zeros when there is no MDS map
            total = 0
            inn = 0
            up = 0
        return {
            'total': total,
            'up_in': inn,
            'up_not_in': up - inn,
            'not_up_not_in': total - up,
        }
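The guard can be checked outside Calamari with a standalone copy of the function. In the sketch below, calculate_mds_counters is a hypothetical module-level version of the method, and the sample map is made-up data that only assumes the three keys the method reads ('up', 'in', 'info').

```python
def calculate_mds_counters(mds_map):
    """Summarise MDS state, treating a missing map as zero daemons."""
    if mds_map is None:
        # Subscripting None is what raised the TypeError in the log.
        up = inn = total = 0
    else:
        up = len(mds_map['up'])
        inn = len(mds_map['in'])
        total = len(mds_map['info'])
    return {
        'total': total,
        'up_in': inn,
        'up_not_in': up - inn,
        'not_up_not_in': total - up,
    }


# No MDS map at all: every counter is zero instead of an HTTP 500.
empty = calculate_mds_counters(None)

# Made-up map with two known daemons, both up, one of them in.
sample = calculate_mds_counters({
    'up': {'mds_0': 1, 'mds_1': 2},
    'in': [0],
    'info': {'gid_1': {}, 'gid_2': {}},
})
```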
I also changed the code here: vi /opt/calamari/venv/lib/python2.7/site-packages/calamari_common-0.1-py2.7.egg/calamari_common/types.py +206
    @property
    @memoize
    def osds_by_pool(self):
        """
        Get the OSDS which may be used in this pool
        :return dict of pool ID to OSD IDs in the pool
        """
        result = {}
        for pool_id, pool in self.pools_by_id.items():
            osds = None
            # only look up rules when the pool actually has a 'crush_ruleset' entry
            if pool and pool.get('crush_ruleset', None):
                for rule in [r for r in self.data['crush']['rules'] if r['ruleset'] == pool['crush_ruleset']]:
                    if rule['min_size'] <= pool['size'] <= rule['max_size']:
                        osds = self.osds_by_rule_id[rule['rule_id']]
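One detail of the guard above: pool.get('crush_ruleset', None) is falsy when the ruleset id is 0, so a pool on ruleset 0 is skipped just like a pool that lacks the key. A membership test avoids that. The sketch below uses hypothetical helper names (pool_ruleset, matching_rules) and made-up sample data; it is not Calamari's code.

```python
def pool_ruleset(pool):
    """Return the pool's CRUSH ruleset id, or None if the key is missing."""
    if pool and 'crush_ruleset' in pool:
        return pool['crush_ruleset']
    return None


def matching_rules(pool, rules):
    """Return the CRUSH rules whose 'ruleset' matches the pool's, if any."""
    ruleset = pool_ruleset(pool)
    if ruleset is None:
        return []  # no ruleset recorded for this pool; nothing to match
    return [r for r in rules if r['ruleset'] == ruleset]


# Made-up rules and pools for illustration.
rules = [{'ruleset': 0, 'rule_id': 0}, {'ruleset': 1, 'rule_id': 1}]
on_ruleset_zero = matching_rules({'crush_ruleset': 0, 'size': 3}, rules)
without_key = matching_rules({'size': 3}, rules)
```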
After that, I restarted the salt-minion service:
There may still be things that do not work properly; I will keep working on them.