
Golang panic in case an open image is not closed

Open · nixpanic opened this issue on Dec 10, 2019 · 16 comments

While writing a test for some RBD functions, I managed to get the following panic:

/build/ceph-12.2.12/src/log/SubsystemMap.h: In function 'bool ceph::logging::SubsystemMap::should_gather(unsigned int, int)' thread 7eff5cff9700 time 2019-12-10 09:25:21.278405
/build/ceph-12.2.12/src/log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
 ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7eff7a0dbef2]
 2: (()+0x225e1b) [0x7eff83a15e1b]
 3: (()+0x23298d) [0x7eff83a2298d]
 4: (()+0x76ba) [0x7eff82f566ba]
 5: (clone()+0x6d) [0x7eff82c8c41d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

(I am not sure where this crashes; I need to get debug symbols included in the make test-docker workflow.)

This seems to be easy to reproduce. Take one of the tests in rbd/rbd_test.go, create an image, open it, remove it and do further cleanup without calling Close(). This is a pretty horrible experience, and should be prevented.

Example:

        conn, _ := rados.NewConn()
        conn.ReadDefaultConfigFile()
        conn.Connect()

        poolname := GetUUID()
        err := conn.MakePool(poolname)
        assert.NoError(t, err)

        ioctx, err := conn.OpenIOContext(poolname)
        require.NoError(t, err)

        name := GetUUID()
        image, err := rbd.Create(ioctx, name, 1<<22, 22)
        assert.NoError(t, err)

        err = image.Open()
        assert.NoError(t, err)

        // Close() is deliberately skipped here, leaving the image open:
        // err = image.Close()
        // assert.NoError(t, err)
        err = image.Remove()
        assert.NoError(t, err)

        ioctx.Destroy()
        conn.DeletePool(poolname)
        conn.Shutdown()
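
For comparison, here is a sketch of the tail end of the same test with the cleanup order that avoids the crash, using only the calls already shown above:

        // Close the image before removing it and before tearing down the
        // IOContext and the connection.
        err = image.Open()
        assert.NoError(t, err)

        err = image.Close()
        assert.NoError(t, err)

        err = image.Remove()
        assert.NoError(t, err)

        ioctx.Destroy()
        conn.DeletePool(poolname)
        conn.Shutdown()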

nixpanic avatar Dec 10 '19 09:12 nixpanic

@nixpanic, @ansiwen I think this is the same crash I was getting the other day too ... but I have not managed to clean up my experiments to get a proper reproducer.

phlogistonjohn avatar Dec 11 '19 13:12 phlogistonjohn

@phlogistonjohn you can assign this to @ansiwen as he is looking into this.

nixpanic avatar Dec 11 '19 15:12 nixpanic

Hmm. I can't seem to at the moment. @ansiwen seems to be missing from the list. I could assign it to you though, so there's got to be some sort of groups/membership thing that's different.

phlogistonjohn avatar Dec 11 '19 18:12 phlogistonjohn

@nixpanic @phlogistonjohn So, apparently this only crashes when the error returned by image.Remove() is ignored, which many of the tests in rbd_test.go do (like here), and which is why I didn't notice it for a long time. That's why I'm not sure this is a bug in go-ceph, or even a bug in general. I have a simple fix that closes the image if it's still open before removing it. I can push a PR if that's the way we want to go.

Nevertheless it's weird: the crash happens in the next test, while creating a new pool, inside some logging routine. I'm currently writing a C++ reproducer to ask about this in the general ceph community.

ansiwen avatar Jan 02 '20 17:01 ansiwen

> @nixpanic @phlogistonjohn So, apparently this only crashes when the error returned by image.Remove() is ignored, which many of the tests in rbd_test.go do (like here), and which is why I didn't notice it for a long time.

That's pretty interesting and I can see why that didn't get caught quickly.

> That's why I'm not sure this is a bug in go-ceph, or even a bug in general. I have a simple fix that closes the image if it's still open before removing it. I can push a PR if that's the way we want to go.

I would say that we don't want to do anything extra that the api does not do, so we should avoid doing it internally, if that's what you were thinking. We can make sure our test code behaves properly with respect to error checking and handling, as we ought to provide good examples for consumers.

> Nevertheless it's weird: the crash happens in the next test, while creating a new pool, inside some logging routine. I'm currently writing a C++ reproducer to ask about this in the general ceph community.

That's great. It strikes me that if we crash in that path it could be impacting C/C++ users too, so this seems like a good path to pursue. If we have to, we can provide additional checks and return errors for some extra conditions, but only if we're "abusing" the api in some way.

phlogistonjohn avatar Jan 02 '20 17:01 phlogistonjohn

Explicitly closing the image in case it is still open would be hiding the issue. I prefer to have an additional check like imageIsClosed:

        if err := image.validate(imageNeedsIOContext | imageNeedsName | imageIsClosed); err != nil {
                return err
        }
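
To make the proposal concrete, here is a rough sketch of how such a flag-based check could look. Only the flag names come from the snippet above; the Image fields (ioctx, name, and image for the librbd handle), the error values, and the use of the standard errors package are assumptions for illustration, not actual go-ceph internals:

        // Sketch only: struct fields and error values are assumed, not real
        // go-ceph code. Each flag selects one precondition to verify.
        const (
                imageNeedsIOContext uint32 = 1 << iota
                imageNeedsName
                imageIsClosed
        )

        var (
                errNoIOContext = errors.New("rbd: no IOContext set")
                errNoName      = errors.New("rbd: no image name set")
                errImageIsOpen = errors.New("rbd: image is still open")
        )

        func (image *Image) validate(required uint32) error {
                if required&imageNeedsIOContext != 0 && image.ioctx == nil {
                        return errNoIOContext
                }
                if required&imageNeedsName != 0 && image.name == "" {
                        return errNoName
                }
                // The proposed check: refuse to run operations like Remove()
                // while the underlying librbd handle is still open.
                if required&imageIsClosed != 0 && image.image != nil {
                        return errImageIsOpen
                }
                return nil
        }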

Reporting the issue against librbd would be good, of course! Thanks for the update :+1:

nixpanic avatar Jan 03 '20 10:01 nixpanic

Ok, here is an update of my findings so far:

I can reproduce something very similar with C++ code whenever the destructor of the ImageCtx object is called after the rados connection handle is shut down, because the ImageCtx contains a pointer to a CephContext object, which has already been destructed by that time. It might be that this is still "by design", but it appears pretty dangerous to me.

With the pure C API, however, I was not yet able to reproduce that same issue, although to me it looks like I'm calling exactly the same functions in the same order. So probably using cgo changes the timing or memory layout enough that the crash doesn't happen. With go-ceph it happens when rados_pool_create is called in the next test, and it happens for the same reason as in the C++ case: an ObjectCacher object (which is part of an ImageCtx object) dereferences a dangling pointer to a CephContext object.

What I'm doing right now is to follow the code paths of the rados_pool_create calls for both the go-ceph and pure C cases and see if I can find any differences. Given that this code is heavily multithreaded, that is kind of a PITA. I saw that the code uses LTTng for tracing, but I have never used that and don't know if librados in Ubuntu is compiled with it. Also, the version of valgrind in that Ubuntu release is buggy and doesn't work with the go test executable. :roll_eyes:

ansiwen avatar Jan 13 '20 15:01 ansiwen

In the meantime I was able to reproduce it with pure C API code a couple of times as well. So here is my conclusion so far:

At least the librbd objects ImageCtx and ObjectCacher, which are both created when opening an image with rbd_open(), keep pointers to a CephContext object here and here, which is destroyed together with the RadosClient object during the rados_shutdown() call here. This is especially problematic since the ObjectCacher object creates a FlusherThread that reacts to certain events, which eventually causes the use of the dangling pointer to CephContext. Since there is no bookkeeping of what is referencing the CephContext object (for example with a signal/slot or other publish/subscribe mechanism), the invalid objects can't be cleaned up automatically when CephContext is destroyed. So this bug is "by design", and all images must be closed manually before calling rados_shutdown().

From conversations on #ceph-devel it seems like that is "expected". The logic stated there was: if you use an ioctx with a call like rbd_open(), then you also have to make sure that you clean up everything that has been created with that ioctx before destroying it. I can still file a bug report there to start a discussion about that, but it certainly seems to be expected by now.

As a side effect of this I will push PRs for a debugging container image that I used and for a better error report, very similar to what @nixpanic also proposed. That, however, will not prevent consumers from ignoring the error and shutting down the rados handle with open images.

ansiwen avatar Jan 21 '20 17:01 ansiwen

See also #153 for similar problems when calling librbd functions on an image that is closed.

nixpanic avatar Jan 22 '20 08:01 nixpanic

There's a lot to unpack in this so please bear with me if I miss anything:

> In the meantime I was able to reproduce it with pure C API code a couple of times as well. So here is my conclusion so far:

> At least the librbd objects ImageCtx and ObjectCacher, which are both created when opening an image with rbd_open(), keep pointers to a CephContext object here and here, which is destroyed together with the RadosClient object during the rados_shutdown() call here. This is especially problematic since the ObjectCacher object creates a FlusherThread that reacts to certain events, which eventually causes the use of the dangling pointer to CephContext. Since there is no bookkeeping of what is referencing the CephContext object (for example with a signal/slot or other publish/subscribe mechanism), the invalid objects can't be cleaned up automatically when CephContext is destroyed.

OK, unless I'm misunderstanding (please correct me), it sounds like the claim is that if you use the api "correctly" there should be no crash. If all open rbd images are closed prior to closing/freeing the other resources, a consumer of the api should encounter no problems. This is good news from the point of view of a regular consumer of the api, but it does make it more difficult to comprehensively test error conditions in the api itself.

> So this bug is "by design", and all images must be closed manually before calling rados_shutdown().

I find libraries that "crash" as an error handling mechanism annoying to work with, but I do understand why some others think it's a reasonable approach when working in the native language. But when you switch to a more error-focused language (like Go) it's even more annoying. :-)

> From conversations on #ceph-devel it seems like that is "expected". The logic stated there was: if you use an ioctx with a call like rbd_open(), then you also have to make sure that you clean up everything that has been created with that ioctx before destroying it.

Sounds like we can either document these issues or try to do some (optional?) bookkeeping ourselves if we want to avoid these sorts of crashes.
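
For illustration only, one shape such optional bookkeeping could take is a thin wrapper that remembers open images and closes them before shutting down the connection; the trackedConn type here is hypothetical and not part of go-ceph, only the Open/Close/Shutdown calls come from the thread above:

        // Hypothetical wrapper: remember every image opened through it so that
        // shutdown() can close stragglers before the cluster handle goes away.
        type trackedConn struct {
                conn   *rados.Conn
                images []*rbd.Image
        }

        func (tc *trackedConn) trackOpen(image *rbd.Image) error {
                if err := image.Open(); err != nil {
                        return err
                }
                tc.images = append(tc.images, image)
                return nil
        }

        func (tc *trackedConn) shutdown() {
                // Close anything the caller forgot; errors are ignored in this sketch.
                for _, image := range tc.images {
                        _ = image.Close()
                }
                tc.conn.Shutdown()
        }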

> I can still file a bug report there to start a discussion about that, but it certainly seems to be expected by now.

I think we should file something; then even if upstream ceph decides not to change anything, our concerns have been logged and we can reference the issue in our code and documentation.

> As a side effect of this I will push PRs for a debugging container image that I used and for a better error report, very similar to what @nixpanic also proposed. That, however, will not prevent consumers from ignoring the error and shutting down the rados handle with open images.

Sounds great.

phlogistonjohn avatar Jan 22 '20 13:01 phlogistonjohn

After some conversation with @nixpanic while iterating over this issue, I can confirm that the crash also happens when the image is opened read-only and is then removed without closing it. In this case the remove call succeeds (that is, it doesn't return an error), but the following shutdown of the cluster handle still segfaults. I would consider this clearly a bug now, because a successful removal of the image should indicate that it's safe to shut down the handle.

ansiwen avatar May 04 '20 19:05 ansiwen

Created https://tracker.ceph.com/issues/45437 upstream.

ansiwen avatar May 07 '20 21:05 ansiwen

> At least the librbd objects ImageCtx and ObjectCacher, which are both created when opening an image with rbd_open(), keep pointers to a CephContext object here and here, which is destroyed together with the RadosClient object during the rados_shutdown() call here.

You cannot close the rados connection before closing the RBD images -- I would expect all kinds of crashes are possible. If the RBD image is properly closed, there would be no remaining threads.

dillaman avatar Jun 05 '20 17:06 dillaman

> I would consider this clearly a bug now, because a successful removal of the image should indicate that it's safe to shut down the handle.

I agree that it's a bug in go-ceph -- if the expectation is that Image.Remove() will close the image, then close the image. You cannot leave a dangling RBD image alive and then close the rados connection. That RemoveImage API shows that the C API doesn't even care that you have the image open -- you are just passing the IoCtx and image name.

dillaman avatar Jun 05 '20 17:06 dillaman

@dillaman I would still say it's misleading that it's possible to "remove" an opened image, if it doesn't actually destroy it. Maybe there are some terminology nuances in Ceph that I don't get. Also see my comment in https://tracker.ceph.com/issues/45437

ansiwen avatar Jun 08 '20 16:06 ansiwen

> I would still say it's misleading that it's possible to "remove" an opened image, if it doesn't actually destroy it.

"Remove" is about removing the backing data from the Ceph cluster -- not removing memory references. That would be a "close" operation. Again, this is a bad design in the go-ceph API if it's causing you this much confusion. The new "RemoveImage" API that was added actually matches the C/C++/Python APIs in design and since it doesn't take an image pointer/reference, it's clearer that there is 100% no possible why for it to magically close/cleanup your opened references.

dillaman avatar Jun 08 '20 16:06 dillaman