ceph-csi

Explore capturing librbd logs when using go-ceph APIs

Open Nikhil-Ladha opened this issue 7 months ago • 6 comments

Describe the feature you'd like to have

We need to explore how to capture librbd logs when ceph-csi uses the go-ceph APIs to run Ceph RBD operations. This would give us more detail about the RBD calls and make debugging smoother when issues come up.

What is the value to the end user? (why is it a priority?)

With the librbd logs available, we could pinpoint Ceph RBD issues more easily by tracing back through the logs, and help the Ceph RBD team identify the root cause.

Nikhil-Ladha avatar May 22 '25 07:05 Nikhil-Ladha

It's possible today if we create the ConfigMap like below:

ceph.conf: |
    [global]
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    rbd_validate_pool = false
    log_to_stderr = true
    debug_rbd = 20
    debug_rados = 20

With this in place, librbd and librados debug messages show up in the plugin logs:

2025-05-22T07:11:09.140+0000 7f606d7fa640 20 librbd::io::FlushTracker: 0x7f608c0587f0 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640 20 librbd::io::AsyncOperation: 0x7f604c001d70 finish_op
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 handle_shut_down_image_dispatcher: r=0
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 send_shut_down_object_dispatcher
2025-05-22T07:11:09.140+0000 7f606d7fa640  5 librbd::io::Dispatcher: 0x7f608c051b70 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640  5 librbd::io::ObjectDispatch: 0x7f608c057930 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640  5 librbd::io::SimpleSchedulerObjectDispatch: 0x7f608c054960 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640 20 librbd::io::FlushTracker: 0x7f6050019320 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640  5 librbd::cache::WriteAroundObjectDispatch: 0x7f60480033e0 shut_down: 
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 handle_shut_down_object_dispatcher: r=0
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 send_flush_op_work_queue
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 handle_flush_op_work_queue: r=0
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::image::CloseRequest: 0x7f608c059700 handle_flush_image_watcher: r=0
2025-05-22T07:11:09.140+0000 7f606d7fa640 10 librbd::ImageState: 0x7f608c051af0 0x7f608c051af0 handle_close: r=0
2025-05-22T07:11:09.140+0000 7f6058ff9640 10 librbd::ImageCtx: 0x7f608c0089a0 ~ImageCtx
2025-05-22T07:11:09.140+0000 7f6058ff9640 20 librados: flush_aio_writes
2025-05-22T07:11:09.140+0000 7f6058ff9640 20 librados: flush_aio_writes
2025-05-22T07:11:09.140+0000 7f6058ff9640 20 librbd::AsioEngine: 0x7f608c02e240 ~AsioEngine: 
2025-05-22T07:11:09.140+0000 7f6058ff9640 20 librbd::asio::ContextWQ: 0x7f608c02f4e0 ~ContextWQ: 
2025-05-22T07:11:09.140+0000 7f6058ff9640 20 librbd::asio::ContextWQ: 0x7f608c02f4e0 drain: 
2025-05-22T07:11:09.140+0000 7f60a4b95640 10 librados: omap-set-vals oid=csi.volume.f525be9e-41b3-4bc6-bce9-4c23779eca25 nspace=
2025-05-22T07:11:09.147+0000 7f60a4b95640 10 librados: Objecter returned from omap-set-vals r=0
I0522 07:11:09.148996       1 omap.go:159] ID: 18 Req-ID: pvc-e374f529-3756-4f9e-845a-8b1da311e572 set omap keys (pool="replicapool", namespace="", name="csi.volume.f525be9e-41b3-4bc6-bce9-4c23779eca25"): map[csi.imageid:6c70af4dc7ef])
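
At `debug_rbd = 20` the output gets noisy quickly, so it can help to filter it. Below is a minimal sketch, assuming the line layout shown above (timestamp, thread ID, debug level, subsystem). The helper name `filter_ceph_debug` is hypothetical and is not part of ceph-csi or Ceph.

```shell
# Hypothetical helper: keep only librbd/librados lines at or below a
# maximum debug level. Assumes the fields are: timestamp, thread id,
# level, subsystem, as in the log excerpt above.
# Usage: filter_ceph_debug MAX_LEVEL < plugin.log
filter_ceph_debug() {
    awk -v max="${1:-20}" '
        $3 ~ /^[0-9]+$/ && $4 ~ /^(librbd|librados)/ && ($3 + 0) <= (max + 0)
    '
}

# Example (replace <pod> with the actual plugin pod name):
#   kubectl logs <pod> -c csi-rbdplugin | filter_ceph_debug 10
```

The klog lines written by ceph-csi itself (such as the `omap.go` line above) are dropped by this filter because their fourth field is not a librbd or librados subsystem.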

@Nikhil-Ladha sorry, I forgot about it. Let's close this one.

Madhu-1 avatar May 22 '25 07:05 Madhu-1

I think this is important enough to document somewhere. Maybe in a troubleshooting guide, or else in the developers guide.

nixpanic avatar May 22 '25 07:05 nixpanic

@Madhu-1 what would be the steps if ceph-csi-operator is managing the deployment? Is it the same?

Nikhil-Ladha avatar May 22 '25 07:05 Nikhil-Ladha

@Nikhil-Ladha csi-operator doesn't support it directly, but it can still be achieved. It isn't supported directly because we have an open item in ceph-csi to support a ceph.conf per Ceph cluster instead of a single ceph.conf for all the Ceph clusters.

Madhu-1 avatar May 22 '25 07:05 Madhu-1

We can also enable the librbd logs in the csi-rbdplugin container by running the following commands in the toolbox. This works for both rook+csi and ceph-csi-op+csi setups:

ceph config set global debug_rbd 30
ceph config set global log_to_stderr true

P.S.: It doesn't require a restart either 😉
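
Since these are global overrides, it is probably worth removing them once debugging is done. Below is a minimal sketch to run from the toolbox. The helper name `unset_global_opts` is hypothetical, and it assumes the options were set with `ceph config set global ...` as above.

```shell
# Hypothetical helper: remove global overrides from the cluster's central
# config so the options fall back to their defaults.
# Only pass options that you set yourself.
unset_global_opts() {
    for opt in "$@"; do
        ceph config rm global "$opt"
    done
}

# Example:
#   unset_global_opts debug_rbd log_to_stderr
```

`ceph config dump` lists the overrides currently stored, which is a quick way to confirm what is set before and after.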

Nikhil-Ladha avatar May 27 '25 06:05 Nikhil-Ladha

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 29 '25 21:06 github-actions[bot]