SSH: missing entry in channel cache

Open etnt opened this issue 4 months ago • 6 comments

https://github.com/erlang/otp/blob/6d4731b9b5c70686bfaea67f2cd6f4f912e6da06/lib/ssh/src/ssh_connection.erl#L1029

Look at the line linked above, where the code tries to access Channel#channel.remote_id. I seem to have a case where Channel turns up as undefined, which causes a crash here.

I instrumented the ssh_connection.erl code like this:

handle_msg(#ssh_msg_channel_request{recipient_channel = ChannelId,
                                    request_type = "exit-signal",
                                    want_reply = false,
                                    data = Data},
           #connection{channel_cache = Cache} = Connection0, _) ->
    <<?DEC_BIN(SigName, _SigLen),
      ?BOOLEAN(_Core),
      ?DEC_BIN(Err, _ErrLen),
      ?DEC_BIN(Lang, _LangLen)>> = Data,
    Channel = ssh_client_channel:cache_lookup(Cache, ChannelId),
    io:format(">>> ~p(~p): ChannelId=~p , Channel=~p~n",[?MODULE,?LINE,ChannelId,Channel]),
    io:format(">>> ~p(~p): CHANNEL-CACHE:~n~p~n",[?MODULE,?LINE,ets:tab2list(Cache)]),
    RemoteId =  Channel#channel.remote_id,
    io:format(">>> ~p(~p): RemoteId=~p~n",[?MODULE,?LINE,RemoteId]),
    {Reply, Connection} =  reply_msg(Channel, Connection0,
                                     {exit_signal, ChannelId,
                                      binary_to_list(SigName),
                                      binary_to_list(Err),
                                      binary_to_list(Lang)}),
    CloseMsg = channel_close_msg(RemoteId),
    {[{connection_reply, CloseMsg}|Reply], Connection};

When running, I see this:

>>> ssh_connection(605): ChannelId=1 , Channel=undefined
>>> ssh_connection(606): CHANNEL-CACHE:
[{channel,"session","subsystem",<0.852.0>,undefined,0,647044,8316,32768,false,
         0,2096622,32768,false,
         {[],[]}}]

Note: Channel is undefined, so the code appears to crash at the record access of remote_id, since we never get the last printout of RemoteId.

A little about my setup, which is a bit complicated...

  1. I'm setting up an SSH connection from one Erlang node to another (well, not quite; see below).
  2. The connection is not "direct": it is set up toward an external OpenSSH server, where we run a NETCONF subsystem module (which in turn talks TCP to the other Erlang node).
  3. We make a Long Running request over the connection on Channel(0) and then a Short Running request over the same connection using Channel(1). The problem happens when the Short Running request finishes: Channel(0) is closed as well, which interrupts the Long Running request.

It's perhaps a poor description, but either way: the fact that the code can crash when nothing is found in the channel cache doesn't seem good, does it?

I can add that the body of that exit-signal message contains "PIPE", which corresponds to one of the signal names in RFC 4254.
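To illustrate the failure mode described above: accessing a record field on the atom undefined raises a badrecord error, so one possible defence is to pattern-match on the lookup result before touching remote_id. The following is a minimal standalone sketch, not the actual ssh_connection.erl code. The module name, the cut-down #channel{} record, lookup/2 (a hypothetical stand-in for ssh_client_channel:cache_lookup/2) and the return values {close, RemoteId} and ignore are all assumptions made for illustration. Whether a missing entry should be ignored, logged, or treated as a protocol error is left open here.

```erlang
-module(cache_guard_sketch).
-export([demo/0, handle_exit_signal/2]).

%% Cut-down, hypothetical version of the channel record; the real record
%% in the ssh application has many more fields.
-record(channel, {local_id, remote_id}).

%% Hypothetical stand-in for ssh_client_channel:cache_lookup/2:
%% returns the record on a hit and the atom 'undefined' on a miss.
lookup(Cache, ChannelId) ->
    case ets:lookup(Cache, ChannelId) of
        [Channel] -> Channel;
        []        -> undefined
    end.

%% Match on the lookup result instead of assuming that a record came back.
handle_exit_signal(Cache, ChannelId) ->
    case lookup(Cache, ChannelId) of
        #channel{remote_id = RemoteId} ->
            %% Entry found: it is safe to use the remote id.
            {close, RemoteId};
        undefined ->
            %% No entry for this channel id: do not crash.
            ignore
    end.

demo() ->
    Cache = ets:new(channel_cache_sketch, [set, {keypos, #channel.local_id}]),
    true = ets:insert(Cache, #channel{local_id = 0, remote_id = 7}),
    {close, 7} = handle_exit_signal(Cache, 0),  %% channel 0 is cached
    ignore = handle_exit_signal(Cache, 1),      %% channel 1 is missing
    true = ets:delete(Cache),
    ok.
```

Calling cache_guard_sketch:demo() exercises both branches: the cached channel 0 and the missing channel 1, which mirrors the situation in the printout above where only one channel is present in the cache.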

etnt, Oct 11 '24 10:10