ntirpc icon indicating copy to clipboard operation
ntirpc copied to clipboard

shrink buf changed behavior

Open ofirvin opened this issue 10 months ago • 5 comments

Hi, We have been going over a few recent commits in libntirpc. and we saw https://github.com/nfs-ganesha/ntirpc/commit/1352b7140fc4747d02501c7e92065400b5decb41 and it kind of looks like chunk_unref_locked can do a double unlock. we are not very familiar with that layer, but do you think that can happen? and if so is that ok?

ofirvin avatar Jan 29 '25 07:01 ofirvin

I'm confused, there appear to be no unlocks in chunk_unref_locked(). Did you mean a different function?

dang avatar Jan 29 '25 15:01 dang

chunk_unref_locked calls get_lru_chunk_with_lock. hich sometimes leaves ioq unlocked.

and then do_shrink can unlock ioq again if !is_same_buflist(io_buf, new_io_buf, rdma_xprt)

ofirvin avatar Feb 02 '25 07:02 ofirvin

I think it's safe. get_lru_chunk_with_lock() only unlocks ioqh->qmutex if pthread_mutex_trylock() succeeded, which means that ioq was not already locked, and we succeeded in taking the lock. The correct thing to do then is unlock it. If trylock() fails (we didn't get the lock, it was already locked, potentially by us), then we skip to the next buffer.

dang avatar Feb 03 '25 18:02 dang

I agree. actually the case I thought about is that check_data_io_buf_free in get_lru_chunk_with_lock returns false and then we break with a non null io_buf but ioqh->qmutex seems unlocked. and later the logic in chunk_unref_locked gets the io_buf as new_io_buf and calls do_shrink which may unlock it again

ofirvin avatar Feb 10 '25 09:02 ofirvin

` 353 if (pthread_mutex_trylock(&ioqh->qmutex)) { 354 /* Lock could be already taken, 355 try next io_buf. / 356 __warnx(TIRPC_DEBUG_FLAG_XDR, 357 "%s: current io_buf %p " 358 "io_buf %p xprt %p can't " 359 "shrink io_buf %prefs %d " 360 "io_bufs count %d %d", 361 func, cur_io_buf, 362 io_buf, rdma_xprt, io_buf, 363 io_buf->refs, 364 rdma_xprt->io_bufs_count, 365 rdma_xprt->io_bufs.qcount); 366 } else { 367 / Lock to shrink this io_buf 368 * caller will unlock */ 369 if (check_data_io_buf_free(====>If this returns false, we will unlock at line 372 and io_buf should be set to NULL at line 381 370 io_buf, rdma_xprt)) 371 break; 372 pthread_mutex_unlock( 373 &ioqh->qmutex); 374 } 375 } else { 376 break; 377 } 378 } 379 struct poolq_entry *next = TAILQ_NEXT(have, q); 380 have = next; 381 io_buf = NULL; 382 }

`

If is io_buf is not NULL and not from same buff list of curr_io_buff then ionly do_shrink will do unlock the try lock taken at line 353. If io_buff is part of same buff list as curr_io_buff the lock is managed by xdr_rdma_ioq_uv_recycle_io_buf.

Gaurav-Gangalwar avatar Apr 14 '25 12:04 Gaurav-Gangalwar