Falk Amelung

98 comments by Falk Amelung

We have sort of solved this using the `process.log`. All we need is some shell commands/functions that look at the process*.logs. There are some functions already in `accounts/alias.bash` (`lista` is sophisticated)....
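A minimal sketch of such a shell helper, in the spirit of the functions in `accounts/alias.bash` (the `check_logs` name and the error patterns are assumptions, not the actual `lista` implementation):

```shell
# Hypothetical helper: scan process*.log files in a run directory and print
# each failing log together with its first error line.
check_logs() {
    local dir="${1:-.}"
    # -l lists only files that contain a match; -E enables the alternation.
    grep -l -E 'Error|Traceback' "$dir"/process*.log 2>/dev/null |
    while read -r f; do
        echo "$f: $(grep -m1 -E 'Error|Traceback' "$f")"
    done
}
```

Calling `check_logs rundir` would then list each failed log with its first error line, which covers the common "did anything fail?" question without opening every log by hand.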

As far as I can tell, everything works fine (there are still ISCE issues, but that is a different matter). It would be nice to add a `submit_jobs.cfg` where all parameters are set...

MakranBigSenAT13: another case with the same problem. The error shows up in a run_10 job.
```
grep FileNotFoundError out_run_10_filter_coherence.e
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/merged/interferograms/20180705_20180810/fine.int.xml'
FileNotFoundError:...
```

The function that fails is [/TopsProc/runBurstIfg.py](https://github.com/isce-framework/isce2/blob/7292d56a496b07dda9c0d152bbb43978ba1e9979/components/isceobj/TopsProc/runBurstIfg.py#L106-L134). After copying `/reference` to `/tmp` it fails in [slc2.createImage()](https://github.com/isce-framework/isce2/blob/7292d56a496b07dda9c0d152bbb43978ba1e9979/components/isceobj/TopsProc/runBurstIfg.py#L115) and, more precisely, in [DataAccessor/DataAccessorPy.py](https://github.com/isce-framework/isce2/blob/7292d56a496b07dda9c0d152bbb43978ba1e9979/components/iscesys/ImageApi/DataAccessor/DataAccessorPy.py#L167-L170). I get the message
```
GDAL open (R):...
```
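A quick way to see how widespread the missing-XML problem is would be a scan like the following. This is a hedged Python sketch, not ISCE API; the directory layout and file name are taken from the error message above, and `find_missing_xml` is a hypothetical helper:

```python
import glob
import os


def find_missing_xml(root="merged/interferograms", name="fine.int.xml"):
    """Report interferogram pair directories under `root` that lack the
    given XML metadata file (the file the FileNotFoundError complains
    about), so one can tell whether a single pair or many lost it."""
    missing = []
    for pair_dir in sorted(glob.glob(os.path.join(root, "*"))):
        if os.path.isdir(pair_dir) and not os.path.exists(
            os.path.join(pair_dir, name)
        ):
            missing.append(pair_dir)
    return missing
```

Running this against `/tmp/merged/interferograms` right after the copy step would show whether the `.xml` files were dropped for one date pair or for all of them.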

January 5
Hi Falk,
1) Yes, you need to change the stripe count before you create/copy the file. That is how the Lustre file system works.
2) If the files...
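The order of operations the admin describes can be sketched as shell commands. The stripe count and paths are examples only, and `run` is a dry-run wrapper so the sketch can be read (and checked) without a Lustre system; on a real system one would drop the wrapper:

```shell
# On Lustre, striping is fixed when a file is created, so the target
# directory must be striped *before* the data is copied in; files created
# inside it then inherit the stripe settings.
run() { echo "+ $*"; }             # dry-run wrapper; replace with "$@" on Lustre

run lfs setstripe -c 8 merged/     # example: stripe new files across 8 OSTs
run cp -r reference/ merged/       # files created by cp now get 8 stripes
```

`lfs setstripe` applied after the copy would only affect files created later, which is exactly the pitfall point 1) warns about.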

January 11, 2021
Hi Falk,
We spent some time going through the system log to track down what caused the problem. Here is a list of potential jobs that probably caused...

January 20, 2021
Hi Falk,
There are two typical IO workload issues.
1) Heavy MDS workload. It is normally caused by high-frequency IO requests (like thousands of file open/close/stat...
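To see why thousands of open/close calls hammer the metadata server, here is a hedged Python sketch contrasting per-access opens with a single cached read. This is illustrative only (it is not python_cacher, which caches Python imports at the system level); `read_cached` is a hypothetical helper:

```python
import functools


@functools.lru_cache(maxsize=None)
def read_cached(path):
    """Open each file at most once per process; later lookups are served
    from the in-memory cache instead of generating new open/close/stat
    requests against the shared file system."""
    with open(path) as f:
        return f.read()
```

If a job touches the same small file N times, this pattern turns N open/close pairs into one, which is the kind of reduction in metadata traffic the admin is asking for.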

Jan 21
1. I have asked our system administrators to reactivate your account.
---
But please make sure that you have the setup we discussed before (ooops, python_cacher, striped large...

Initial queue access to Stampede2 and Frontera was blocked because of too many simultaneous `wget` processes and because we did not use python_cacher. More recent blocks were because of too...

Feb 1
Hi Falk,
1) For your $WORK usage: the problem is that some of your jobs were issuing too many requests to /work from Python. Our system...