ceph-csi
CephFS CSI error: MDS status shows "Start request repeated too quickly. Failed with result 'signal'."
Describe the bug
Environment details
- Image/version of Ceph CSI driver : v3.10.0 / latest
- Helm chart version :
- Kernel version :
- Mounter used for mounting PVC (for cephFS its `fuse` or `kernel`, for rbd its `krbd` or `rbd-nbd`) :
- Kubernetes cluster version : 1.27
- Ceph cluster version : 18.2.2
MDS logs
(MDSRank::ProgressThread::entry()+0xbe) [0x56ddc480489e]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 20: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x7906280a8134]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 21: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x7906281287dc]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 0> 2024-07-02T13:23:53.435+0800 79061b6006c0 -1 *** Caught signal (Aborted) **
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: in thread 79061b6006c0 thread_name:mds_rank_progr
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x79062805b050]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x7906280a9e2c]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 3: gsignal()
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 4: abort()
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0x9d919) [0x790627e9d919]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa8e1a) [0x790627ea8e1a]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 7: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa8e85) [0x790627ea8e85]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa90d8) [0x790627ea90d8]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 9: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, char*)+0xb4) [0x7906288483d4]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 10: (void decode_noshare<mempool::mds_co::pool_allocator>(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, mempool::mds_co::pool_allocator<char> >, ceph::buffer::v15_2_0::ptr, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, mempool::mds_co::pool_allocator<char> > >, mempool::mds_co::pool_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, mempool::mds_co::pool_allocator<char> > const, ceph::buffer::v15_2_0::ptr> > >&, ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xd9) [0x56ddc48d64a9]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 11: (Server::prepare_new_inode(boost::intrusive_ptr<MDRequestImpl>&, CDir*, inodeno_t, unsigned int, file_layout_t const*)+0x12b3) [0x56ddc48695e3]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 12: (Server::handle_client_mkdir(boost::intrusive_ptr<MDRequestImpl>&)+0x1dc) [0x56ddc48a0dbc]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 13: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x61f) [0x56ddc48abebf]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 14: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x2d3) [0x56ddc48b0a13]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 15: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x59b) [0x56ddc48051eb]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 16: (MDSRank::retry_dispatch(boost::intrusive_ptr<Message const> const&)+0x12) [0x56ddc4805992]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 17: (MDSContext::complete(int)+0x5b) [0x56ddc4b2c45b]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 18: (MDSRank::_advance_queues()+0x78) [0x56ddc4804328]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 19: (MDSRank::ProgressThread::entry()+0xbe) [0x56ddc480489e]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 20: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x7906280a8134]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: 21: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x7906281287dc]
Jul 02 13:23:53 pve24052603 ceph-mds[3604713]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Jul 02 13:23:53 pve24052603 systemd[1]: ceph-mds@pve24052603.service: Main process exited, code=killed, status=6/ABRT
Jul 02 13:23:53 pve24052603 systemd[1]: ceph-mds@pve24052603.service: Failed with result 'signal'.
Jul 02 13:23:53 pve24052603 systemd[1]: ceph-mds@pve24052603.service: Scheduled restart job, restart counter is at 3.
Jul 02 13:23:53 pve24052603 systemd[1]: Stopped ceph-mds@pve24052603.service - Ceph metadata server daemon.
Jul 02 13:23:53 pve24052603 systemd[1]: ceph-mds@pve24052603.service: Start request repeated too quickly.
Jul 02 13:23:53 pve24052603 systemd[1]: ceph-mds@pve24052603.service: Failed with result 'signal'.
Jul 02 13:23:53 pve24052603 systemd[1]: Failed to start ceph-mds@pve24052603.service - Ceph metadata server daemon.
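The backtrace places the abort in xattr decoding inside Server::prepare_new_inode() while handling a client mkdir request, which is the kind of metadata operation both subvolume provisioning and a mounted workload end up sending to the MDS. A minimal sketch of commands for gathering more detail from the Ceph side, assuming admin CLI access on a cluster node; the crash ID is a placeholder taken from `ceph crash ls`:

# Overall cluster and filesystem health, including which MDS ranks are up or failed
ceph -s
ceph fs status cephfs-hdd-local

# The crash module records the abort above; list the entries and inspect the matching one
ceph crash ls
ceph crash info <crash-id>

# Optionally raise MDS debug logging before reproducing the mkdir that triggers the abort
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1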
systemctl status ceph-mds@pve24060802
1]: ceph-mds@pve24060802.service: Scheduled restart job, restart counter is at 3.
Jul 02 13:24:24 pve24060802 systemd[1]: Stopped ceph-mds@pve24060802.service - Ceph metadata server daemon.
Jul 02 13:24:24 pve24060802 systemd[1]: ceph-mds@pve24060802.service: Start request repeated too quickly.
Jul 02 13:24:24 pve24060802 systemd[1]: ceph-mds@pve24060802.service: Failed with result 'signal'.
Jul 02 13:24:24 pve24060802 systemd[1]: Failed to start ceph-mds@pve24060802.service - Ceph metadata server daemon.
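Once systemd prints "Start request repeated too quickly", the unit sits in its start-limit state and further start requests are refused until that state is cleared. A minimal recovery sketch, assuming the unit names shown above; this only brings the daemon back, it does not address whatever makes it abort:

# Clear the failed/start-limit state recorded by systemd, then start the MDS again
systemctl reset-failed ceph-mds@pve24060802.service
systemctl start ceph-mds@pve24060802.service

# Follow the daemon log to see whether it aborts again on the next client request
journalctl -fu ceph-mds@pve24060802.service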
unhandled message 0x55b3c1551b80 client_metrics [client_metric_type: CAP_INFO cap_hits: 22 cap_misses: 1 num_caps: 3][client_metric_type: READ_LATENCY latency: 0.000000, avg_latency: 0.000000, sq_sum: 0, count=0][client_metric_type: WRITE_LATENCY latency: 0.000000, avg_latency: 0.000000, sq_sum: 0, count=0][client_metric_type: METADATA_LATENCY latency: 0.042123, avg_latency: 0.010530, sq_sum: 1234407162999340, count=4][client_metric_type: DENTRY_LEASE dlease_hits: 0 dlease_misses: 0 num_dentries: 2][client_metric_type: OPENED_FILES opened_files: 0 total_inodes: 3][client_metric_type: PINNED_ICAPS pinned_icaps: 3 total_inodes: 3][client_metric_type: OPENED_INODES opened_inodes: 0 total_inodes: 3][client_metric_type: READ_IO_SIZES total_ops: 0 total_size: 0][client_metric_type: WRITE_IO_SIZES total_ops: 0 total_size: 0] v1 from client.128861978 v1:10.18.10.104:0/2284126948
nginx.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-test-pvc
  namespace: elk
spec:
  storageClassName: csi-cephfs-sc
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-rbd-demo-pod
spec:
  volumes:
    - name: nginxtestpvc
      persistentVolumeClaim:
        claimName: nginx-test-pvc
        readOnly: false
  containers:
    - name: web-server
      image: docker.io/library/nginx:latest
      volumeMounts:
        # - name: es-snap-pvc
        #   mountPath: /var/lib/www/html/plugins
        - name: nginxtestpvc
          mountPath: /var/lib/www/html
        # - name: es-plugin-pvc
        #   mountPath: /var/lib/www/html/config
        # - name: es-config-pvc
        #   mountPath: /var/lib/www/html/data
  # volumes:
  #   - name: es-snap-pvc
  #     persistentVolumeClaim:
  #       claimName: es-snap-pvc
  #       readOnly: false
  #
  #   - name: es-plugin-pvc
  #     persistentVolumeClaim:
  #       claimName: es-snap-pvc
  #       readOnly: false
  #
  #   - name: es-config-pvc
  #     persistentVolumeClaim:
  #       claimName: es-snap-pvc
  #       readOnly: false
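For reference, a sketch of how this manifest can be exercised and the provisioning path checked, assuming kubectl access. Note that the PVC sits in the elk namespace while the Pod carries no namespace, and a Pod can only reference a claim in its own namespace, so both need to land in elk:

# Create both objects in the PVC's namespace so the Pod can bind the claim
kubectl apply -n elk -f nginx.yaml

# The claim should reach Bound once the CephFS provisioner creates the subvolume
kubectl get pvc nginx-test-pvc -n elk -w

# If it stays Pending, the PVC events and the provisioner logs usually name the failing step
kubectl describe pvc nginx-test-pvc -n elk
# deployment, container, and namespace below are assumptions; adjust to the actual ceph-csi deployment
kubectl logs -n ceph-csi deploy/csi-cephfsplugin-provisioner -c csi-cephfsplugin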
sc.yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  # (required) String representing a Ceph cluster to provision storage from.
  # Should be unique across all Ceph clusters in use for provisioning,
  # cannot be greater than 36 bytes in length, and should remain immutable for
  # the lifetime of the StorageClass in use.
  # Ensure to create an entry in the configmap named ceph-csi-config, based on
  # csi-config-map-sample.yaml, to accompany the string chosen to
  # represent the Ceph cluster in clusterID below
  clusterID: "6254728d-2db0-4fe1-bab6-e415d0c697f6"
  # (required) CephFS filesystem name into which the volume shall be created
  # eg: fsName: myfs
  fsName: cephfs-hdd-local
  # (optional) Ceph pool into which volume data shall be stored
  # pool: <cephfs-data-pool>
  # (optional) Comma separated string of Ceph-fuse mount options.
  # For eg:
  # fuseMountOptions: debug
  # (optional) Comma separated string of Cephfs kernel mount options.
  # Check man mount.ceph for mount options. For eg:
  # kernelMountOptions: readdir_max_bytes=1048576,norbytes
  # The secrets have to contain user and/or Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
  # (optional) The driver can use either ceph-fuse (fuse) or
  # ceph kernelclient (kernel).
  # If omitted, default volume mounter will be used - this is
  # determined by probing for ceph-fuse and mount.ceph
  # mounter: kernel
  # (optional) Prefix to use for naming subvolumes.
  # If omitted, defaults to "csi-vol-".
  # volumeNamePrefix: "foo-bar-"
  # (optional) Boolean value. The PVC shall be backed by the CephFS snapshot
  # specified in its data source. `pool` parameter must not be specified.
  # (defaults to `true`)
  # backingSnapshot: "false"
  # (optional) Instruct the plugin it has to encrypt the volume
  # By default it is disabled. Valid values are "true" or "false".
  # A string is expected here, i.e. "true", not true.
  # encrypted: "true"
  # (optional) Use external key management system for encryption passphrases by
  # specifying a unique ID matching KMS ConfigMap. The ID is only used for
  # correlation to configmap entry.
  # encryptionKMSID: <kms-config-id>
reclaimPolicy: Delete
allowVolumeExpansion: true
# mountOptions:
#   - context="system_u:object_r:container_file_t:s0:c0,c1"
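Since CephFS provisioning ends in MDS requests like the mkdir seen in the backtrace, it is worth confirming that this StorageClass points at a healthy filesystem and that its clusterID matches the entry in the ceph-csi-config ConfigMap. A minimal check sketch; the ceph-csi namespace used below is an assumption:

# The filesystem named in fsName must exist and report an active MDS
ceph fs ls
ceph fs status cephfs-hdd-local

# clusterID is usually the cluster fsid; either way it must match a cluster entry
# in the ceph-csi-config ConfigMap together with the monitor addresses
ceph fsid
kubectl get configmap ceph-csi-config -n ceph-csi -o yaml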
@gaocheng001 Could you please explain why you think the error is caused by cephcsi?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.