pmdk icon indicating copy to clipboard operation
pmdk copied to clipboard

obj: handle ENOMEM during vector allocations

Open bfaccini opened this issue 2 years ago • 12 comments

When missing, Handle ENOMEM during vector allocations to avoid later crashing.

Ref: pmem/issues #5515 Signed-off-by: Bruno Faccini [email protected]


This change is Reviewable

bfaccini avatar Feb 13 '23 13:02 bfaccini

you can squash all the commits into a single one, please

Ok

Since you're using an Intel e-mail address for your commits, you have to update all dates to -2023. E.g., in this file: 2016-2023

Ok

  if (operation_add_user_buffer(ctx, &userbuf) == -1);

is this ; expected here?

Oops, sorry about the typo :-(

All these changes will be in next commit.

bfaccini avatar Mar 01 '23 08:03 bfaccini

just wondering, should we clean something up in this case? I can see we're persisting something above 😉

Well, I thought this kind of concern could araise when I started to work on how to fix this, as usual for error paths... The problem is that I don't know what exactly does/provisions pmemops_persist() nor I have been able to identify which method/operation can revert its effects... Also, and strangely (!!), I only found one error path after pmemops_persist() has been called (other place do not show any error path at all !!...), in src/libpmemobj/obj.c:obj_runtime_init(), and there is no clean-up there too !!

bfaccini avatar Mar 01 '23 20:03 bfaccini

@bfaccini are you going to fix this PR to pass tests?

grom72 avatar Apr 04 '23 17:04 grom72

@bfaccini are you going to fix this PR to pass tests?

Looks like embeding VEC_PUSH_BACK() in ASSERT() affects VEC_PUSH_BACK() execution, surprisingly for non-debug build only, preventing next vector allocation...

bfaccini avatar Apr 06 '23 14:04 bfaccini

Well, now that I have fixed the original errors/SEGVs during "./RUNTESTS -b nondebug obj_action -s TEST0", I now get new ones/asserts/aborts, but unfortunately I am unable to reproduce any of them locally :-( @grom72 , any help/idea would be appreciated.

bfaccini avatar Apr 07 '23 14:04 bfaccini

Well, now that I have fixed the original errors/SEGVs during "./RUNTESTS -b nondebug obj_action -s TEST0", I now get new ones/asserts/aborts, but unfortunately I am unable to reproduce any of them locally :-(

Now that I have found that these asserts/aborts were due to the "wrong/reverse test" in my previous changes, some workflows are canceled and one has failed, but I am not able to find any reason in the logs... Can somebody help ??

bfaccini avatar Apr 14 '23 11:04 bfaccini

Can somebody help ??

???

bfaccini avatar Apr 19 '23 21:04 bfaccini

Codecov Report

Merging #5537 (e717f1a) into master (3c623ae) will increase coverage by 0.56%. The diff coverage is 20.00%.

@@            Coverage Diff             @@
##           master    #5537      +/-   ##
==========================================
+ Coverage   71.62%   72.18%   +0.56%     
==========================================
  Files         162      146      -16     
  Lines       24256    22655    -1601     
  Branches        0     3778    +3778     
==========================================
- Hits        17373    16354    -1019     
+ Misses       6883     6301     -582     

codecov[bot] avatar Apr 20 '23 07:04 codecov[bot]

I restarted the CI and most things pass, and the things that fail seem unrelated:


2023-04-13T08:00:40.9198249Z obj_basic_integration/TEST7: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:40.9205687Z RUNTESTS: stopping: obj_basic_integration/TEST7 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.0181676Z obj_basic_integration/TEST8: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.0189683Z RUNTESTS: stopping: obj_basic_integration/TEST8 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.1141737Z obj_basic_integration/TEST9: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.1149866Z RUNTESTS: stopping: obj_basic_integration/TEST9 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.2133923Z obj_basic_integration/TEST10: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.2141758Z RUNTESTS: stopping: obj_basic_integration/TEST10 failed, TEST=check FS=none BUILD=debug

Can you rebase to see if this goes away?

pbalcer avatar Apr 20 '23 07:04 pbalcer

I restarted the CI and most things pass, and the things that fail seem unrelated:


2023-04-13T08:00:40.9198249Z obj_basic_integration/TEST7: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:40.9205687Z RUNTESTS: stopping: obj_basic_integration/TEST7 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.0181676Z obj_basic_integration/TEST8: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.0189683Z RUNTESTS: stopping: obj_basic_integration/TEST8 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.1141737Z obj_basic_integration/TEST9: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.1149866Z RUNTESTS: stopping: obj_basic_integration/TEST9 failed, TEST=check FS=none BUILD=debug
2023-04-13T08:00:41.2133923Z obj_basic_integration/TEST10: pmemdetect: /dev/dax0.2 -- permission denied
2023-04-13T08:00:41.2141758Z RUNTESTS: stopping: obj_basic_integration/TEST10 failed, TEST=check FS=none BUILD=debug

Can you rebase to see if this goes away?

Unfortunately, this is a known issue (platform setup issue), it is not related to your PR. We are working on it.

wlemkows avatar Apr 20 '23 07:04 wlemkows

Can you rebase to see if this goes away?

Unfortunately, this is a known issue (platform setup issue), it is not related to your PR. We are working on it.

But should I rebase finally ?

bfaccini avatar Apr 20 '23 12:04 bfaccini

No, everything seems fine.

LGTM overall.

pbalcer avatar Apr 21 '23 08:04 pbalcer