otp icon indicating copy to clipboard operation
otp copied to clipboard

gen_tcp_socket:shutdown crash when the gen_statem shutdown normally

Open zzydxm opened this issue 2 years ago • 4 comments

Describe the bug gen_tcp_socket:shutdown crash when the gen_statem shutdown normally (e.g. when the other end closed the socket, or gen_tcp_socket:close(Socket) is called).

This especially introduced a crash in tls_connection_1_3:init:

<0.17792.7401> proc_lib:539 === =CRASH REPORT==== 27-Sep-2022::14:04:25 ===
  crasher:
    initial call: ssl_gen_statem:init/1
    pid: <0.17792.7401>
    registered_name: []
    exception exit: {normal,
                        {gen_statem,call,
                            [<0.202.0>,{shutdown,write},infinity]}}
      in function  gen:do_call/4 (gen.erl, line 243)
      in call from gen_statem:call_dirty/4 (gen_statem.erl, line 890)
      in call from gen_tcp_socket:call/2 (gen_tcp_socket.erl, line 1134)
      in call from gen_tcp_socket:shutdown/2 (gen_tcp_socket.erl, line 462)
      in call from tls_gen_connection:close/4 (tls_gen_connection.erl, line 498)
      in call from gen_statem:terminate/7 (gen_statem.erl, line 2588)
      in call from tls_connection_1_3:init/1 (tls_connection_1_3.erl, line 194)
    ancestors: [<0.7225.7095>,tls_dist_connection_sup,tls_dist_sup,
                  ssl_dist_sup,net_sup,kernel_sup,<0.47.0>]
    message_queue_len: 1
    messages: [{'DOWN',#Ref<0.3687079794.327680041.76039>,process,
                          <0.200.0>,normal}]
    links: [<0.7225.7095>]
    dictionary: [{ssl_pem_cache,ssl_pem_cache_dist},
                  {ssl_manager,ssl_manager_dist}]
    trap_exit: true
    status: running
    heap_size: 610
    stack_size: 35
    reductions: 128881
  neighbours:

To Reproduce Terminal 1:

{ok,L}=gen_tcp_socket:listen(33333,[]),
{ok,S}=gen_tcp_socket:accept(L,infinity),
{_,_,{P,_}}=S,
timer:sleep(10),
P!{system,{self(),make_ref()},{terminate,normal}},
gen_tcp_socket:shutdown(S,write).

Terminal 2:

gen_tcp_socket:connect({127,0,0,1},33333,[],infinity).

Output:

=ERROR REPORT==== 27-Sep-2022::17:41:14.546506 ===
gen_tcp_socket call failed:
      Call:  {shutdown,write}
      Class: exit
      Error: {normal,{gen_statem,call,[<0.88.0>,{shutdown,write},infinity]}}
      Stack: [{gen,do_call,4,[{file,"gen.erl"},{line,243}]},
              {gen_statem,call_dirty,4,[{file,"gen_statem.erl"},{line,890}]},
              {gen_tcp_socket,call,2,
                              [{file,"gen_tcp_socket.erl"},{line,1134}]},
              {gen_tcp_socket,shutdown,2,
                              [{file,"gen_tcp_socket.erl"},{line,462}]},
              {erl_eval,do_apply,7,[{file,"erl_eval.erl"},{line,748}]},
              {shell,exprs,7,[{file,"shell.erl"},{line,691}]},
              {shell,eval_exprs,7,[{file,"shell.erl"},{line,647}]},
              {shell,eval_loop,3,[{file,"shell.erl"},{line,632}]}]

Expected behavior gen_tcp_socket:shutdown(S,write) should return {error, closed} in this case instead of crash

Affected versions 25.0.2

Additional context adding case of {normal,{gen_statem,call,[,{shutdown,},_]}} here should solve this problem https://github.com/erlang/otp/blob/master/lib/kernel/src/gen_tcp_socket.erl#L1137

zzydxm avatar Sep 28 '22 00:09 zzydxm

An alternative could be to change the call to gen_statem:stop(Server) on line 1148 to gen_statem:stop(Server, {shutdown,closed}, infinity) which I think is more correct (but I have not tried it). The server should never exit normal.

RaimoNiskanen avatar Sep 30 '22 16:09 RaimoNiskanen

Yes I have tested, doing gen_statem:stop(Server, {shutdown, closed}, infinity) should solve the problem.

zzydxm avatar Oct 03 '22 19:10 zzydxm

Do you have a simple, not "hacky", way to test this? I could not reproduce by closing the remote side, before shutdown.

bmk avatar Oct 07 '22 14:10 bmk

Do you have a simple, not "hacky", way to test this? I could not reproduce by closing the remote side, before shutdown.

Only closing the remote side won't gen_statem:stop the process. Only if send/receive sees some error, it will crash, so P ! {system, ..} can be replaced by a spawned recv, but not very reliably reproducible.

zzydxm avatar Oct 10 '22 17:10 zzydxm

ok

bmk avatar Nov 16 '22 15:11 bmk

Solution merged to maint (and master). Will be part of the maint patch (maybe 25.1.3)

bmk avatar Nov 30 '22 09:11 bmk