jwql
jwql copied to clipboard
fixes for no data found on server
After lots of investigation, the apparent cause behind the TA monitors reporting that they are not finding any data is a problem with the log_info decorator function. When I comment that out, the TA monitor successfully produces plots. Within log_info, it looks like the troublesome line is the environment = subprocess.check_output
line. When I run the line manually on my local machine and on the server, the results seem fine. But when I run the TA monitor by opening the webpage, log_info() fails in such a way that the TA monitor's run() function is skipped. When I set environment
to a dummy string within log_info(), everything works and the TA monitor produces plots. And when I wrap the environment line in a try/except, everything also works fine, because it falls into the except clause.
I've also noticed that the MSATA monitor is not creating a new log file at each run, but seems to be overwriting the most recent log file. In my last run via the webpage, the log was placed in a log file with a timestamp that was an hour old.
So it seems like it comes down to some issue with running conda env export
when called from the webpage?
So it seems like it comes down to some issue with running conda env export when called from the webpage?
Thank you SO much for helping with this Bryan! If I understand correctly, this is a subprocess that is started from the java code to update the MSATA/WATA webpage from the magnifying glass to the plots. So this environment is expected to be the same as the original one but this sub environment does not have something that the code needs?
The subprocess is called by log_info(), which is called when the msata run() function is called. My expectation is that this should just be running conda env export
on the environment on the server. It's not at all clear to me what the problem is, since I can make that call successfully on the server in an ipython session.
I'm also curious about why we haven't seen this before. It is true that the other monitors are not completing successfully at the moment due to the memory errors, but I think they are still being called, and they should reach their log_info-wrapped functions before crashing. Or maybe we're not actually calling them at the moment?
@mfixstsci do you have any thoughts about this? Depending on how much effort we want to put into solving this, maybe we should wrap that environment line in a try/except in this PR, and worry about solving the issue in a separate PR?
Another piece to add to the puzzle: I checked the log file for yesterday's dark monitor run on the test server. The environment was successfully retrieved there. So what's different about the dark monitor compared to the TA monitors in this case?
This was in a recent log file, just prior to the environment problem:
WARNING: [Errno 2] No such file or directory: 'conda': 'conda'
It's not clear to me why conda would not be present.
Here are the lines that I added to log_info during testing to get around the error:
try:
environment = subprocess.check_output('conda env export', universal_newlines=True, shell=True) # nosec
except Exception as e:
logging.warning(e)
environment = 'Unable to get environment info\n'
The only difference I can think of between the TA monitors and the other monitors is that the others are all called on the command line in our crontab, while the TA monitors are called as part of the webpage creation process. Could that be important? @mfixstsci @BradleySappington
- [x] Line 246 of monitor_views should be module = 'msata_monitor' rather than module = 'msata_monitor.py'
I just did a new commit to remove the ".py" from the module name in the script monitor_views.py.
- [x] Line 246 of monitor_views should be module = 'msata_monitor' rather than module = 'msata_monitor.py'
Nailed it, good catch @bhilbert4
I think there's just one more change this needs and then it'll be ready to go. I think the best solution at the moment to get around the logging error is to add the try/except clause in the comment above to the logging function. That should allow us to continue to get environment info for the monitors that are run via the command line, and will allow us to gracefully skip collecting that info for the TA monitors that are being called via clicks on the webpage. @penaguerrero
@bhilbert4 I added the try/except block at line 242 of utils/logging_functions.py
@mfixstsci @bhilbert4 I just added some tests for the monitors :)
@penaguerrero THANK YOU FOR WRITING TESTS!