PopPUNK
Can't use absolute paths in --create-db (poppunk==2.5.0)
Dear developers, I'm having a little trouble using absolute paths to run --create-db.
When I run with --output ABSPATH I get an error, but the same does not happen when I use a RELATIVEPATH as in the tutorial.
I worked around it by moving the db after its creation, but I thought it would be worth reporting the issue here.
Using abs path (error):
$ poppunk --create-db --output /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db --r-files /home/hugo/projects/reparoma/data/rlist.txt --plot-fit 10 --threads 3 --min-k 14 --max-k 29
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v2.0.0
sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)
Graph-tools OpenMP parallelisation enabled: with 3 threads
Mode: Building new database from input sequences
Sketching 10 genomes using 3 thread(s)
Progress (CPU): 10 / 10
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 3 thread(s)
Progress (CPU): 100.0%
Traceback (most recent call last):
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/bin/poppunk", line 11, in <module>
sys.exit(main())
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/__main__.py", line 317, in main
distMat = queryDatabase(rNames = seq_names,
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/sketchlib.py", line 559, in queryDatabase
plot_fit(klist,
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/PopPUNK/plot.py", line 126, in plot_fit
plt.savefig(out_prefix + ".pdf", bbox_inches='tight')
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/pyplot.py", line 944, in savefig
res = fig.savefig(*args, **kwargs)
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/figure.py", line 3277, in savefig
self.canvas.print_figure(fname, **kwargs)
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 2338, in print_figure
result = print_method(
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 2204, in <lambda>
print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backends/backend_pdf.py", line 2808, in print_pdf
file = PdfFile(filename, metadata=metadata)
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/backends/backend_pdf.py", line 713, in __init__
fh, opened = cbook.to_filehandle(filename, "wb", return_opened=True)
File "/home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/matplotlib/cbook/__init__.py", line 492, in to_filehandle
fh = open(fname, flag, encoding=encoding)
FileNotFoundError: [Errno 2] No such file or directory: '/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db//home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db_fit_example_1.pdf'
Using relative path (works):
$ poppunk --create-db --output $(basename /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db) --r-files /home/hugo/projects/reparoma/data/rlist.txt --plot-fit 10 --threads 3 --min-k 14 --max-k 29 && mv -v $(basename /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db) /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v2.0.0
sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)
Graph-tools OpenMP parallelisation enabled: with 3 threads
Mode: Building new database from input sequences
Sketching 10 genomes using 3 thread(s)
Progress (CPU): 10 / 10
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 3 thread(s)
Progress (CPU): 100.0%
Done
renamed 's_genus_poppunk_db' -> '/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db'
Note: I'm using basename to get relative paths because these command lines are generated by Snakemake.
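For anyone generating these command lines from a script, the workaround above can be sketched roughly as follows. This is only an illustration: `workaround_commands` is a hypothetical helper, not part of PopPUNK or Snakemake, and it assumes the command runs from a directory where a folder named after the basename can be created and then moved.

```python
import os
import shlex


def workaround_commands(abs_output, extra_args):
    """Build the two shell commands used in the workaround (hypothetical helper).

    The first command runs poppunk with only the basename of the desired
    output path, so the output is relative to the current directory.
    The second command moves the finished database to its absolute location.
    """
    # normpath strips a trailing slash so basename is never empty
    name = os.path.basename(os.path.normpath(abs_output))
    create = ["poppunk", "--create-db", "--output", name, *extra_args]
    move = ["mv", "-v", name, abs_output]
    # shlex.join quotes each argument safely for the shell (Python 3.8+)
    return shlex.join(create), shlex.join(move)


if __name__ == "__main__":
    create_cmd, move_cmd = workaround_commands(
        "/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db",
        ["--r-files", "/home/hugo/projects/reparoma/data/rlist.txt",
         "--plot-fit", "10"],
    )
    print(create_cmd + " && " + move_cmd)
```

The sketch only builds the command strings; it does not run poppunk itself.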
Thank you for the detailed report.
This is due to an error here: https://github.com/bacpop/PopPUNK/blob/master/PopPUNK/sketchlib.py#L564
It should be ref_db and not dbPrefix.
I can fix this in the next release.
If you are able, I would appreciate it if you could re-run the version with absolute paths but omitting the --plot-fit argument, just to double-check this is the only issue.
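To illustrate why the failing path in the traceback contains the output directory twice, here is a minimal sketch. It is not the actual PopPUNK code: `buggy_plot_prefix` and `fixed_plot_prefix` are hypothetical functions, and the assumption is that the plot prefix is built by concatenating the output directory with the full database prefix, where only the basename should have been used.

```python
import os


def buggy_plot_prefix(out_dir, db_prefix):
    # Assumed bug pattern: the full prefix (possibly absolute) is appended
    # to the output directory, so an absolute path appears twice.
    return out_dir + "/" + db_prefix + "_fit_example_1"


def fixed_plot_prefix(out_dir, db_prefix):
    # Assumed fix: use only the final path component of the prefix.
    name = os.path.basename(db_prefix)
    return os.path.join(out_dir, name + "_fit_example_1")


if __name__ == "__main__":
    abs_db = "/home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db"
    # Absolute path: reproduces the doubled path from the FileNotFoundError
    print(buggy_plot_prefix(abs_db, abs_db) + ".pdf")
    # Relative path: the same concatenation happens to give a valid path
    print(buggy_plot_prefix("s_genus_poppunk_db", "s_genus_poppunk_db") + ".pdf")
    # Basename-based variant gives a valid path in the absolute case
    print(fixed_plot_prefix(abs_db, abs_db) + ".pdf")
```

With a relative --output the directory and the prefix are the same short name, which would explain why the tutorial-style invocation does not hit the error.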
Hello, thank you for the quick reply!
> I can fix this in the next release.
Thanks!
> If you are able, I would appreciate it if you could re-run the version with absolute paths but omitting the --plot-fit argument, just to double-check this is the only issue.
Yes, the problem was indeed with the --plot-fit argument. Without it, the run completes even with an absolute path:
$ poppunk --create-db --output /home/hugo/projects/reparoma/results/poppunk/s_genus_poppunk_db --r-files /home/hugo/projects/reparoma/data/rlist.txt --threads 3 --min-k 14 --max-k 29
Activating conda environment: .snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
(with backend: sketchlib v2.0.0
sketchlib: /home/hugo/projects/reparoma/.snakemake/conda/6b1d5b960a33e4a9c375ef19f5bc27fe/lib/python3.10/site-packages/pp_sketchlib.cpython-310-x86_64-linux-gnu.so)
Graph-tools OpenMP parallelisation enabled: with 3 threads
Mode: Building new database from input sequences
Sketching 10 genomes using 3 thread(s)
Progress (CPU): 10 / 10
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 3 thread(s)
Progress (CPU): 100.0%
Done