mdsplus
mdsplus copied to clipboard
TreeGetRecord() returns a misleading error code
Affiliation MIT PSFC
Version(s) Affected Found in recent "alpha", but surely exists in all other versions too
Platform Found on Ubuntu 20, but is undoubtedly on all platforms
Describe the bug
The TreeGetRecord()
function in TreeGetRecord.c
of the treeshr
library returns a zero error code in some instances. When the zero error code is displayed as a text message it appears as %SS-W-SUCCESS, Success
. Note that the "W" denotes a "Warning", which contradicts the status of "Success". And because the severity level is not set to success (i.e., low order bit is not set), the calling code will instead treat this as a failure. (The STATUS_OK
macro checks the low-order bit.)
The questionable construct is as follows:
status = TreeCallHook( <args> )
if (status && STATUS_NOT_OK)
return 0;
And it appears at these locations: line 100 line 114
To Reproduce
Create a bug.py
file with the following Python code.
import MDSplus as mds
import concurrent.futures as cf
def get_data(path):
return tree.getNode(path).getData().data()
# Main program -- bug not specific to this shot and signals
shot_num = 1090909009
paths = [
"\CMOD::TOP.ENGINEERING.POWER_SYSTEM:TEST",
"\CMOD::TOP.ENGINEERING.RGA_1:AFTER",
"\CMOD::TOP.ENGINEERING.RGA_1:BEFORE",
"\CMOD::TOP.SPECTROSCOPY.BOLOMETER:TWOPI_DIODE",
"\CMOD::TOP.SPECTROSCOPY.BOLOMETER:TWOPI_FOIL"
]
tree = mds.Tree("cmod", shot_num, mode="readonly")
# Works with one thread, fails with multiple threads
with cf.ThreadPoolExecutor(max_workers=2) as executor:
futures = [executor.submit(get_data, node_path) for node_path in paths]
for future in cf.as_completed(futures):
print(future.result())
Running the above bug.py
program produces the following output. The XMW
message is from a debug print statement that was added to the python/MDSplus/tree.py
file in the getRecord()
function prior to raising the exception. Notice the %SS-W-SUCCESS, Success
message at the bottom of the output.
python3 bug.py
Error in SendArg: mode = 6, status = 65554
XMW tree.py, getRecord() before raise exception, status = 0
Error in SendArg: mode = 6, status = 65554
XMW tree.py, getRecord() before raise exception, status = 0
Error in SendArg: mode = 6, status = 65554
XMW tree.py, getRecord() before raise exception, status = 0
D, 1832925:1832943, 1698942711.128151250: <path>/mdsplus/mdstcpip/mdsipshr/MdsIpThreadStatic.c:48 buffer_free() Connection(id=2, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
D, 1832925:1832944, 1698942711.133187987: <path>/mdsplus/mdstcpip/mdsipshr/MdsIpThreadStatic.c:48 buffer_free() Connection(id=3, state=0x80, protocol='tcp', info_name='tcp', version=3, user='(null)')
Traceback (most recent call last):
File "bug.py", line 25, in <module>
print(future.result())
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "bug.py", line 7, in get_data
return tree.getNode(path).getData().data()
File "/home/mwinkel/x_markw/xmw_build/python/MDSplus/tree.py", line 1976, in getRecord
raise _exc.MDSplusException(status)
MDSplus.mdsExceptions.SsSUCCESS: %SS-W-SUCCESS, Success
Exception ignored in: <function DescriptorXD.__del__ at 0x7fa3f4539a60>
Traceback (most recent call last):
Expected behavior
Because "tree hooks" are optional user-provided routines, the intent of the code is probably to ignore a missing "tree hook" and just return success. In which case it should instead return the official code for success, SsSUCCESS
. This is important because SsSUCCESS
has the low order bit set (i.e., severity level = 1), which is what the STATUS_OK
macro checks.
However, if the intent was instead to signal failure, then the code should be written as return status
or return MDSplusERROR
.
Screenshots n/a
Additional context Thanks to Josh L. of the MIT PSFC Disruptions Team for reporting this bug.
It would likely be useful to add a new error code for zero. This would have to be added to the Ss
facility section of the mdsshr_messages.xml
file. Perhaps it should be as follows:
<status name="ZERO" value="0" severity="Warning" text="Zero is an invalid error code"/>
Note that the resulting error code, SsZERO
(= 0) differs from SsSUCCESS
(= 1) only in the severity field.
However, this would likely be problematic -- because then both "SsSUCCESS" and "SsZERO" would have the same "value" field (the message system likely doesn't allow duplicates). And furthermore, the MdsGetStdMsg.c
file of the mdsshr
library would need to be modified to handle the special SsZERO error code.
The reported bug is probably related to the broken (especially python) thread safety (especially mdsip)
Regardless:
Off the top of my head, it should not return zero. The bits are encoded, with the last two bits holding the severity. The only reason Success has no other bits set is that we did not want to call out a particular kind of success.
It may or may not do what you need but I think that: ProcessPoolExecutor will be more likely to succeed.
looks like it should simply returrn status instead of 0.