glusterfs icon indicating copy to clipboard operation
glusterfs copied to clipboard

dht: break out of recursive loop for lookups.

Open itisravi opened this issue 2 years ago • 1 comments

Problem: When two clients simultaneously create and unlink the same file in a loop (stress testing), the client doing the unlink was hung and unresponsive to CTRL-C. On examining, it was observed that when dht_lookup_cbk() failed with ENODATA (since the other client had created the file but not yet set the gfid), it was triggering a recursive loop of lookups (on the client doing the unlink):

dht_lookup_cbk ───────────► dht_lookup_directory ──►dht_lookup_dir_cbk───►dht_lookup_everywhere ───► dht_lookup_everywhere_done ▲ │ │ │ │ │ │ │ │ │ └─────────────────────────────────────────────────────────────────────────▼

Fix: Set unset local->gfid_missing before "hacking" dht_lookup_everywhere_done() to again call dht_lookup_directory().

From my limited understanding of dht code, this fix seems to not break commit 2e4f6f24e4e2656cbb431e9ea34150acb6896a33 and also does not break gluster behaviour of how a fuse mount will assign a gfid to a file that is directly created on the brick.

Note: Even with this fix, there are ESTALE/ENODATA errors seen on the mount but there is no hang whatsoever. CTRL-c promptly returns.

Fixes: #3688 Change-Id: I60e2a3c83542dae59c90e4e295dbc2ea2848537f Signed-off-by: Ravishankar N [email protected]

itisravi avatar Aug 10 '22 11:08 itisravi

/run regression

itisravi avatar Aug 10 '22 11:08 itisravi

@itisravi Did you face an issue on a specific release? I have tried to reproduce it on the latest devel branch but I am not able to reproduce it.

mohit84 avatar Aug 11 '22 02:08 mohit84

@mohit84 I was able to reproduce this only in our downstream code which is based on glusterfs 9.5 plus some custom changes.

But the problem what I saw was this: When doing dht_lookup_everywhere(), in the callback dht_lookup_everywhere_cbk(), if op_ret=-1 and op_errni=ENODATA, we set local->gfid_missing = _gf_true; Then in dht_lookup_everywhere_done(), since we call dht_lookup_directory() with the same frame, and by this time if the gfid is assigned, dht_lookup_dir_cbk() will be a success. But since this is a file, it will set local->need_lookup_everywhere = 1; and in dht_lookup_everywhere_cbk(), even though its a success, local->gfid_missing is still true from the previous assignment. Then in dht_lookup_everywhere_done() it again enters the dht_lookup_directory() loop.

So basically, once local->gfid_missing is set, I don't see it being unset either in the subsequent call or callback using the same frame and local.

itisravi avatar Aug 11 '22 03:08 itisravi

If we look at dht_lookup_cbk() we see its doing this:

        } else {
            /* posix returns ENODATA if the gfid is not set but the client and
             * server protocol layers do not send the stbuf. We need to
             * heal this so check if this is a directory on the other subvols.
             */
            if ((op_errno == ENOTCONN) || (op_errno == ENODATA)) {
                dht_lookup_directory(frame, this, &local->loc);
                return 0;
            }
        }

I think this assumption is a problem. Even though posix has the correct stbuf with the ia_type details, server and client xlators mask it. So the stbuf we get at dht layer does not contain the file type and we end up calling dht_lookup_directory() irrespective of whether its a file or dir.

itisravi avatar Aug 11 '22 03:08 itisravi

@xhernandez Do you have any concerns about this?

mohit84 avatar Sep 16 '22 09:09 mohit84