otp icon indicating copy to clipboard operation
otp copied to clipboard

integer and floats as keys may cause mnesia_locker to crash the node on < OTP-24.3.3

Open onno-vos-dev opened this issue 3 years ago • 4 comments

Describe the bug When trying to insert two keys into a table in a single transaction where one key is 1 and the other key is 1.0 mnesia_locker explodes and causes the node to go down.

To Reproduce

  {ok, _} = application:ensure_all_started(mnesia),
  Tab = testing_set_table,
  StoreType = disc_copies,
  TabDef =
    [{type, set},
     {attributes, [key, non_key, data1, data2]},
     {load_order, 0},
     {index, []},
     {storage_properties, []},
     {record_name, test_record}],
  {atomic, ok} = mnesia:create_table(Tab, [{StoreType, [node()]} | TabDef]),
  mnesia:transaction(fun() ->
                        mnesia:write(Tab, {test_record, 1, 1, 1, 1}, write),
                        mnesia:write(Tab, {test_record, 1.0, 2, 2, 2}, write)
                     end).

Causes the following stacktrace:

 ** FATAL ** mnesia_locker crashed: {function_clause,
                                     [{mnesia_locker,set_lock,
                                       [{tid,234,<0.107.0>},
                                        {testing_set_table,1.0},
                                        write,
                                        [{{testing_set_table,1},
                                          write,
                                          [{write,{tid,234,<0.107.0>}}]}]],
                                       [{file,"mnesia_locker.erl"},
                                        {line,262}]},
                                      {mnesia_locker,grant_lock,5,
                                       [{file,"mnesia_locker.erl"},
                                        {line,353}]},
                                      {mnesia_locker,try_lock,6,
                                       [{file,"mnesia_locker.erl"},
                                        {line,297}]},
                                      {mnesia_locker,loop,1,
                                       [{file,"mnesia_locker.erl"},
                                        {line,162}]},
                                      {mnesia_sp,init_proc,4,
                                       [{file,"mnesia_sp.erl"},{line,34}]},
                                      {proc_lib,init_p_do_apply,3,
                                       [{file,"proc_lib.erl"},{line,226}]}]} state: [<0.525.0>]

This is unexpected since the following code snippet works as expected and produces two distinct records in the DB. The key difference here being that the two records are written in separate transactions.

  {ok, _} = application:ensure_all_started(mnesia),
  Tab = testing_set_table,
  StoreType = disc_copies,
  TabDef =
    [{type, set},
     {attributes, [key, non_key, data1, data2]},
     {load_order, 0},
     {index, []},
     {storage_properties, []},
     {record_name, test_record}],
  {atomic, ok} = mnesia:create_table(Tab, [{StoreType, [node()]} | TabDef]),
  {atomic, ok} = mnesia:transaction(fun() -> mnesia:write(Tab, {test_record, 1, 1, 1, 1}, write) end),
  {atomic, ok} = mnesia:transaction(fun() -> mnesia:write(Tab, {test_record, 1.0, 2, 2, 2}, write) end),
  {atomic, [{test_record, 1, 1, 1, 1}]} = mnesia:transaction(fun() -> mnesia:read(Tab, 1) end),
  {atomic, [{test_record, 1.0, 2, 2, 2}]} = mnesia:transaction(fun() -> mnesia:read(Tab, 1.0) end).

Expected behavior The node doesn't go down

Affected versions The OTP versions that are affected by this bug:

  • 23.3.4.9
  • 24.2.1

Additional context It seems it was fixed as part of: https://github.com/erlang/otp/commit/8988ef4e091b2ba5aa72a1dd951020f6e19ea290 as this issue no longer presents itself since that commit (Tested with: 24.3.4.5 and 25.1).

Perhaps a test could be added at least which prevents this sort of bug from re-occurring if the mnesia locker is changed in the future as it's not clear that this bug was intentionally fixed?

onno-vos-dev avatar Sep 26 '22 13:09 onno-vos-dev

This was a known bug see #3109. If this was fixed it was unintentionally :-) I'll investigate and add a testcase to confirm if that is the case.

dgud avatar Sep 28 '22 08:09 dgud

Thank you @dgud for confirming it was unintentionally fixed :smile: I didn't find the issue that you linked but great that you found it! The module to replicate it seems to have been lost and the linked issue to https://bugs.erlang.org/browse/ERL-50 links back to Github so can't view it there either. Hopefully the snippet I provide helps. :+1:

onno-vos-dev avatar Sep 28 '22 08:09 onno-vos-dev

It still is an bad idea to mix integers and floats as keys in erlang, which is why the issue have been lingering for so long. So you should probably fix that in your code.

Erlangs problematic float, int issue example:

4> lists:sort([0,0.0]).
[0,0.0]
5> lists:sort([0.0,0]).
[0.0,0]

dgud avatar Sep 28 '22 09:09 dgud

I agree for sure that it's a bad idea and to be clear we're not mixing these as keys :-) It however appeared on a property based test where we compare Postgres vs Mnesia results hence the report :-)

onno-vos-dev avatar Sep 28 '22 09:09 onno-vos-dev

Nope this still is not solved, it just doesn't crash as before. We need an option to ordered_set ets tables to have the same sort order as maps. Ping @KennethL (erts task).

dgud avatar Nov 21 '22 12:11 dgud