
The mdsvalue() function of the IDL API fails if the most recent socket is disconnected

Open mwinkel-dev opened this issue 1 year ago • 1 comments

Affiliation MIT PSFC

Version(s) Affected all versions and for many years

Platform Found on Rocky 9 and Ubuntu 20, but surely exists on all platforms

Describe the bug If the most recent socket is disconnected, the mdsvalue("some_expr") function of the IDL API errors when attempting to evaluate the expression. That is because mdsvalue attempts to use a released socket.

It is unlikely that customers will encounter this bug in normal workflows. That is because it is a "distributed" bug involving several portions of the IDL API: mdsconnect, mdsdisconnect, mdsisclient, mdsvalue and !MDS_SOCKET. The statements have to be executed in a particular sequence to trigger the bug.

The root cause is that the underlying C code in mdsipshr maintains a set of connection ids (sockets), but the IDL API only has two IDL system variables: !MDS_SOCKET and !MDSDB_SOCKET. So even though the underlying C code knows which sockets in the set are still active, the IDL API presently has no idea which sockets are active and which ones have been released.

Stated another way, the IDL API assumes that users will have at most two concurrent sockets: one for a database connection and one for an mdsip server. Thus the IDL API is far more restrictive than the underlying C code in mdsipshr. (However, the IDL API's assumption likely does describe practical workflows, otherwise customers would have reported this bug years ago.)
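The mismatch described above can be illustrated with a toy model. This is only a sketch in Python, not the actual IDL or mdsipshr code: `ConnectionTable`, `SingleVariableClient`, `NO_SOCKET` and the returned strings are all hypothetical names invented for the illustration, and the assumption that a disconnect with an explicit socket leaves the tracked variable untouched is inferred from the behavior reported in this issue.

```python
NO_SOCKET = -1  # hypothetical sentinel: "no remote connection tracked"


class ConnectionTable:
    """Toy stand-in for the set of connection ids kept by the C layer."""

    def __init__(self):
        self._next_id = 0
        self.active = set()

    def connect(self):
        conn_id = self._next_id
        self._next_id += 1
        self.active.add(conn_id)
        return conn_id

    def disconnect(self, conn_id):
        self.active.discard(conn_id)


class SingleVariableClient:
    """Toy client that remembers only the most recent socket."""

    def __init__(self, table):
        self.table = table
        self.last_socket = NO_SOCKET  # plays the role of a single system variable

    def connect(self):
        self.last_socket = self.table.connect()
        return self.last_socket

    def disconnect(self, socket=None):
        if socket is None:
            # No keyword: release the tracked socket and forget it.
            if self.last_socket != NO_SOCKET:
                self.table.disconnect(self.last_socket)
            self.last_socket = NO_SOCKET
        else:
            # Assumed behavior: the socket is released, but the tracked
            # variable is not updated, so it may now point at a dead socket.
            self.table.disconnect(socket)

    def value(self, expr):
        if self.last_socket == NO_SOCKET:
            return f"local:{expr}"
        if self.last_socket not in self.table.active:
            raise ConnectionError("Error evaluating expression")
        return f"remote[{self.last_socket}]:{expr}"


if __name__ == "__main__":
    client = SingleVariableClient(ConnectionTable())
    client.connect()             # id 0
    client.connect()             # id 1, now the tracked socket
    client.disconnect(socket=1)  # table forgets 1, client still tracks 1
    try:
        client.value("77")
    except ConnectionError as err:
        print("failed:", err)
```

In this model, disconnecting socket 0 instead of socket 1 leaves the tracked socket alive, and calling `disconnect()` without a socket resets the client to local evaluation, which matches the pattern of failures and successes reported here.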

To Reproduce Steps to reproduce the behavior:

The steps to reproduce the bug have varied over the years (i.e., as fixes have been made to the IDL code). But the central concept is the same: disconnecting the most recent socket causes a subsequent mdsvalue() call to fail.

But first, note that mdsvalue does not require a socket. If none is given, it evaluates the expression with the local MDSplus.

$ idl
IDL 8.6.0 (linux x86_64 m64).
(c) 2016, Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation.

IDL> print, mdsvalue('77')
% Compiled module: MDSVALUE.
% Compiled module: MDSCHECKARG.
% Compiled module: MDSISCLIENT.
% Compiled module: MDS_KEYWORD_SET.
% Compiled module: MDSIDLIMAGE.
          77
IDL> 

The following example is from the 20-Nov-2020 alpha (commit 335dcc8).

### FAILURE ###
$ idl
IDL 8.6.0 (linux x86_64 m64).
(c) 2016, Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation.
 
IDL> ; Using MDSplus alpha from 20-Nov-2020 (commit 335dcc8)
IDL> mdsconnect, 'some_server'                          
% Compiled module: MDSCONNECT.
% Compiled module: MDSDISCONNECT.
IDL> print, !MDS_SOCKET                                      
           1
IDL> mdsdisconnect, socket=1                                 
IDL> mdsvalue('77')                                          
% Compiled module: MDSVALUE.
% Compiled module: MDSCHECKARG.
% Compiled module: MDSISCLIENT.
% MDSVALUE: Error evaluating expression
       0
IDL> 


### SUCCESS ###
$ idl
IDL 8.6.0 (linux x86_64 m64).
(c) 2016, Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation.
 
IDL> ; Using MDSplus alpha from 20-Nov-2020 (commit 335dcc8)
IDL> mdsconnect, 'some_server'                          
% Compiled module: MDSCONNECT.
% Compiled module: MDSDISCONNECT.
IDL> mdsdisconnect                                           
IDL> mdsvalue('77')                                          
% Compiled module: MDSVALUE.
% Compiled module: MDSCHECKARG.
% Compiled module: MDSISCLIENT.
% Compiled module: MDSIDLIMAGE.
          77
IDL> 

This example shows how to trigger the bug with the 9-Oct-2023 alpha (commit 138a468).

IDL> mdsconnect, 'some_server'
% Compiled module: MDSCONNECT.
% Compiled module: MDS_KEYWORD_SET.
% Compiled module: MDSDISCONNECT.
IDL> mdsconnect, 'some_server'
IDL> ; Deleting last socket (socket=1 in this case) causes subsequent mdsvalue to fail
IDL> mdsdisconnect, socket=1  ; no failure if socket=0
IDL> mdsvalue('77')
% Compiled module: MDSVALUE.
% Compiled module: MDSCHECKARG.
% Compiled module: MDSISCLIENT.
% MDSVALUE: Error evaluating expression
       0
IDL> 

Expected behavior Even if mdsvalue must fail when it is using a disconnected socket, it should display an error message explaining that the dead socket is the cause.

Ideally, the entire IDL API will be revamped to mirror the capabilities of the underlying C code in mdsipshr. Because the C code already maintains the set of currently active sockets, the IDL API should query the C code instead of relying on !MDS_SOCKET and !MDSDB_SOCKET. An added complexity is that whatever is done with the sockets must not break the set_database feature of the IDL API. Rearchitecting the IDL API will be a complicated task, so it should not be rushed.
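As a rough illustration of the expected behavior, the check could look something like the following Python sketch. It is not a proposed implementation: `check_socket`, `StaleSocketError`, `NO_SOCKET` and the `active_sockets` argument are hypothetical, and how the IDL layer would actually obtain the active set from mdsipshr is left open.

```python
NO_SOCKET = -1  # hypothetical sentinel meaning "no remote connection"


class StaleSocketError(RuntimeError):
    """Raised when the tracked socket has already been released."""


def check_socket(socket, active_sockets):
    """Decide where an expression would be evaluated.

    Returns "local" when no socket is tracked and "remote" when the
    tracked socket is still active. Raises StaleSocketError, naming the
    dead socket, instead of a generic evaluation error.
    """
    if socket == NO_SOCKET:
        return "local"
    if socket not in active_sockets:
        raise StaleSocketError(
            f"socket {socket} has been disconnected; "
            f"active sockets: {sorted(active_sockets)}"
        )
    return "remote"
```

The point of the sketch is only that the liveness question is answered by the authoritative set of active sockets, so a stale value in a client-side variable produces a message that names the cause.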

Screenshots n/a

Additional context n/a

Workarounds Until this bug is fixed, these are the possible workarounds:

  • Don't execute any mdsvalue statements after disconnecting the sockets, or
  • Don't disconnect the most recently created socket, or
  • Create an extraneous connection prior to disconnecting any of the prior sockets, or
  • In some scenarios, omitting the optional socket keyword from mdsdisconnect will avoid the issue.

mwinkel-dev, Oct 11 '23 20:10

When this bug is fixed, uncomment all IDL-2639-* tests in the IDL test harness (idl/testing/run_tests.py).

mwinkel-dev, Oct 19 '23 21:10