setup gets stuck
On two very different systems I'm getting stuck at the same place in the setup.
```
mantis setup
...
Merging profiles in /lizardfs/erikg/miniconda3/lib/python3.8/site-packages/References/NCBI/986/to_merge/
Concatenating files into /lizardfs/erikg/miniconda3/lib/python3.8/site-packages/References/NCBI/986/986_merged.hmm
```
The setup process simply hangs. On my laptop I had to kill it, but on a remote server I'll wait to see if it progresses. According to htop nothing is running, and no data is being written.
Hello @ekg, I'm not entirely sure what the issue could be, as this is only a concatenation method.
- What OS are you using in your systems?
- Is `986_merged.hmm` not being written at all?
- Could you perhaps have some permission issues?
- When you killed the process on your laptop, what was the traceback?
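If it hangs again, a full stack dump is often more informative than the `KeyboardInterrupt` traceback. One way to get one without terminating the process (a sketch, assuming you can wrap the setup entry point yourself) is Python's `faulthandler` module:

```python
import faulthandler
import signal

# Register a handler so that `kill -USR1 <pid>` makes the running
# process print every thread's stack to stderr without terminating it.
faulthandler.register(signal.SIGUSR1)

# The same machinery can also dump stacks on demand to a file:
with open('stacks.txt', 'w') as out:
    faulthandler.dump_traceback(file=out, all_threads=True)
```

That would show exactly which line the hang is sitting on.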
Regards, Pedro
The remote system is a Debian one in octopus, while the local one is a Ryzen laptop, so what they have in common is that both are recent AMD systems. Otherwise they couldn't be less similar, apart from both being Debian-based (the laptop runs Ubuntu 22.04 (Linux 5.18); the server is probably on a recent Debian stable (Linux 4.19)).
The remote one has progressed, but it seems very strange that nothing appeared to be happening for so long. One filesystem is a local SSD, the other is networked storage. The common behavior on both suggests it should be possible to reproduce.
On both, I'm installing mantis using conda: `conda install -c bioconda -c conda-forge mantis_pfa`
What do you mean by "The remote one has progressed, but it seems very strange that nothing appeared to be happening"?
The HMM for taxa id 986 is also pretty small, so I'm not sure why it would take that long.
The code for concatenation is pretty simple as well:
```python
import os
import shutil

def concat_files(output_file, list_file_paths, stdout_file=None):
    print('Concatenating files into ', output_file, flush=True, file=stdout_file)
    with open(output_file, 'wb') as wfd:
        for f in list_file_paths:
            with open(f, 'rb') as fd:
                shutil.copyfileobj(fd, wfd)
            # forcing disk write
            wfd.flush()
            os.fsync(wfd.fileno())
```
It might be that the flush or fsync call is causing the hang, but I'm not sure why that would be the case. At the moment I don't have time to dive into this, but I'll try to reproduce it by the end of the month and see whether there's a better solution for the concatenation.
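On a networked filesystem like the lizardfs mount in the log above, the `fsync` itself can block for a long time. A quick, self-contained probe of that hypothesis (an assumption about the cause, not a confirmed diagnosis) is to time a flush plus fsync on the same mount:

```python
import os
import tempfile
import time

# Time a flush + fsync of a small file; run this from the same
# filesystem the setup writes to (here, the current directory).
fd, path = tempfile.mkstemp(dir='.')
with os.fdopen(fd, 'wb') as f:
    f.write(b'x' * 1024)
    f.flush()
    t0 = time.monotonic()
    os.fsync(f.fileno())
    elapsed = time.monotonic() - t0
os.remove(path)
print(f'fsync took {elapsed:.4f}s')
```

If this takes seconds rather than milliseconds on the networked mount, the per-file fsync in the concatenation loop would explain the apparent hang.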
It might also be that there's write contention between the multiple cores in your system. Could you try running the setup with a single core for NCBI?