
`TypeError: DataFrame.merge() missing 1 required positional argument: 'right'` in prepare-depth-of-coverage

Open Svenvdm opened this issue 9 months ago • 8 comments

Hi Steven

Thank you for the great work on this tool. I'm currently trying to set it up, but I'm encountering some issues running the NGS pipeline (v0.24.0). Following the tutorial, I first tried to generate the input files, but I get the following error message:

Traceback (most recent call last):
  File "/home/inst/user/.conda/envs/pypgx/bin/pypgx", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/inst/user/.conda/envs/pypgx/lib/python3.12/site-packages/pypgx/__main__.py", line 33, in main
    commands[args.command].main(args)
  File "/home/inst/user/.conda/envs/pypgx/lib/python3.12/site-packages/pypgx/cli/prepare_depth_of_coverage.py", line 90, in main
    archive = utils.prepare_depth_of_coverage(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/inst/user/.conda/envs/pypgx/lib/python3.12/site-packages/pypgx/api/utils.py", line 1297, in prepare_depth_of_coverage
    regions = create_regions_bed(
              ^^^^^^^^^^^^^^^^^^^
  File "/home/inst/user/.conda/envs/pypgx/lib/python3.12/site-packages/pypgx/api/utils.py", line 779, in create_regions_bed
    bf = bf.merge()
         ^^^^^^^^^^
  File "/home/inst/user/.conda/envs/pypgx/lib/python3.12/site-packages/fuc/api/pybed.py", line 472, in merge
    return self.__class__(self.copy_meta(), self.gr.merge())
                                            ^^^^^^^^^^^^^^^
TypeError: DataFrame.merge() missing 1 required positional argument: 'right'

This happens when I run the following command: pypgx prepare-depth-of-coverage sample-depth-of-coverage.zip /path/to/my/bamfile

I get a similar error message for the create-input-vcf command: 'pypgx create-input-vcf --assembly GRCh38 variants.vcf.gz /path/to/my/reference /path/to/my/bamfile'

Am I making a mistake in the commands, or is there another issue? Thanks in advance for your input!

Kind regards Sven

Svenvdm, May 02 '24 17:05
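
The final frame of the traceback above shows `self.gr.merge()` ending up in pandas' `DataFrame.merge()`, which requires a `right` argument. One way this kind of error can arise is when an object's class does not define its own zero-argument `merge` and the call falls through to an inherited method with a different signature. The sketch below reproduces that pattern with toy classes. `Frame` and `Ranges` are hypothetical stand-ins, not the real pandas or pyranges classes, and this is not a confirmed account of what changed in any library.

```python
class Frame:
    """Toy stand-in for a table type whose merge() needs a second table."""

    def merge(self, right):
        return (self, right)


class Ranges(Frame):
    """Toy stand-in for an interval type that inherits merge() unchanged."""


def call_merge_without_arguments():
    """Call merge() with no arguments, as the failing frame does.

    Returns the error text, or None if no TypeError was raised.
    """
    try:
        Ranges().merge()
    except TypeError as exc:
        return str(exc)
    return None


message = call_merge_without_arguments()
print(message)
```

The printed message has the same shape as the one in the traceback: a `merge()` call that is missing its required `right` argument.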

@Svenvdm,

Thanks for your interest in PyPGx!

I just ran the prepare-depth-of-coverage command using data from the GeT-RM tutorial and it worked fine:

(fuc) sbslee@Seung-beens-MacBook-Air getrm-wgs-tutorial % pypgx prepare-depth-of-coverage \
grch37-depth-of-coverage.zip \
grch37-bam/*.bam
[W::hts_idx_load3] The index file is older than the data file: grch37-bam/HG00276_PyPGx.sorted.markdup.recal.bai
...
[W::hts_idx_load3] The index file is older than the data file: grch37-bam/NA12003_PyPGx.sorted.markdup.recal.bai
Saved CovFrame[DepthOfCoverage] to: grch37-depth-of-coverage.zip
  1. Please provide the exact command lines you used.
  2. Please provide the entire error messages.
  3. What's the output of $ conda list? The errors may be related to incorrect package versions.

sbslee, May 02 '24 22:05

Hi @sbslee

Thank you for your quick response! One of my suspicions is indeed that there is an incorrect package version. I was unable to install pypgx using bioconda (I can make a separate issue for this if necessary). I resolved this by first installing python, then "pip install pypgx". The stack trace makes me think that it is an issue with the pandas version, but I'm not sure.

  1. Exact command lines:

pypgx prepare-depth-of-coverage depth-of-coverage.zip /home/vito/vdmaass/projects/PrecisionHealth/PGx/benchmark/data/get-rm/bamfiles/readyforanalysis/ERR1955323_HG00276_sorted_RG_added_marked_dup.bam

pypgx compute-control-statistics VDR control-statistics-VDR.zip /home/vito/vdmaass/projects/PrecisionHealth/PGx/benchmark/data/get-rm/bamfiles/readyforanalysis/ERR1955323_HG00276_sorted_RG_added_marked_dup.bam

pypgx create-input-vcf --assembly GRCh38 variants.vcf.gz ${reference} /home/vito/vdmaass/projects/PrecisionHealth/PGx/benchmark/data/get-rm/bamfiles/readyforanalysis/ERR1955323_HG00276_sorted_RG_added_marked_dup.bam

  2. The error message is attached: error.txt

  3. I've attached the output of conda list here as well: conda_list.txt

Svenvdm, May 03 '24 08:05

@Svenvdm,

Thanks for providing the requested information.

> I was unable to install pypgx using bioconda (I can make a separate issue for this if necessary).

Could you describe what you tried and what went wrong? I just successfully installed pypgx using bioconda:

$ conda create -n test bioconda::pypgx

I strongly recommend that you stick to conda when it comes to installing pypgx.

> I resolved this by first installing python, then "pip install pypgx". The stack trace makes me think that it is an issue with the pandas version, but I'm not sure.

I think the most likely culprit is pyranges. Your version is 0.1.0 while mine is 0.0.129. I think the package uses a different version numbering system on PyPI than on conda. If, for whatever reason, you can't install pypgx via conda, please try to install as many of the other packages as possible using conda. I'll attach the output of conda list for the pypgx environment I just created.

test.txt

In short, first, please try to install pypgx via conda again. If it still doesn't work, at least try to install pyranges via conda.

sbslee, May 03 '24 09:05
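
The version comparison above can also be done from inside the environment itself rather than by reading through a full conda list. The following is a minimal sketch using only the standard library; the distribution names are taken from this thread and are assumed to match what pip or conda registered, and `installed_versions` is a hypothetical helper, not part of pypgx.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names assumed from this thread; adjust as needed.
report = installed_versions(["pypgx", "fuc", "pyranges", "pandas"])
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Note that `importlib.metadata` is only available in Python 3.8 and newer, so this sketch will not run on a Python 3.7 environment.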

Hi @sbslee

I've tried again to install pypgx using bioconda. When I try to verify the installation by running the pypgx -v command, I get the following stack trace:

Traceback (most recent call last):
  File "/home/vito/vdmaass/.conda/envs/pypgx2/bin/pypgx", line 6, in <module>
    from pypgx.__main__ import main
  File "/home/vito/vdmaass/.conda/envs/pypgx2/lib/python3.7/site-packages/pypgx/__main__.py", line 4, in <module>
    from .cli import commands
  File "/home/vito/vdmaass/.conda/envs/pypgx2/lib/python3.7/site-packages/pypgx/cli/__init__.py", line 9, in <module>
    commands[f.stem.replace('_', '-')] = import_module(f'.{f.stem}', __package__)
  File "/home/vito/vdmaass/.conda/envs/pypgx2/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/vito/vdmaass/.conda/envs/pypgx2/lib/python3.7/site-packages/pypgx/cli/compute_control_statistics.py", line 29, in <module>
    """
AttributeError: module 'fuc.api.common' has no attribute '_script_name'

Would you be able to provide a yaml file of your conda environment so I can replicate it with your exact environment? Thanks in advance.

Kind regards Sven

Svenvdm, May 03 '24 15:05

@Svenvdm,

How are you installing pypgx via conda? Are you installing it within an existing environment? I strongly suggest that you create a fresh environment just for pypgx:

$ conda create -n pypgx-env bioconda::pypgx

sbslee, May 03 '24 21:05

Hi @sbslee

I am indeed creating a fresh environment dedicated solely to the pypgx package. I've managed to install it correctly on my local machine via WSL (Ubuntu 22.04). However, I still get the same error mentioned above when creating this environment on my local Linux computing cluster.

Kind regards Sven

Svenvdm, May 06 '24 08:05

@Svenvdm,

> However, I still get the same error mentioned above when creating this environment on my local Linux computing cluster.

This may be because of hardware limitations on your cluster. For example, I notice that the Python version installed on your cluster is 3.7, which is fairly old. Also, as you can see from #60, the error message AttributeError: module 'fuc.api.common' has no attribute '_script_name' is the result of having an old version of the fuc package. At this point, my advice is to try installing the latest versions of the dependencies listed here.

sbslee, May 07 '24 04:05
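
Both points raised above, the interpreter version and the missing `_script_name` attribute, can be checked directly in the affected environment before reinstalling anything. The sketch below is one way to do that with the standard library only; `module_has_attribute` is a hypothetical helper written for this illustration, and the module and attribute names are copied from the error message in this thread.

```python
import importlib
import sys


def module_has_attribute(module_name, attribute):
    """Return True or False, or None if the module cannot be imported at all."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any import-time failure is treated as "module unavailable".
        return None
    return hasattr(module, attribute)


# Report the interpreter version that this environment actually runs.
print("Python:", ".".join(str(part) for part in sys.version_info[:3]))

# Names taken from the AttributeError quoted in this thread.
print(
    "fuc.api.common._script_name present:",
    module_has_attribute("fuc.api.common", "_script_name"),
)
```

A result of `False` would be consistent with the old-fuc explanation, while `None` only says that the module could not be imported in that environment.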

Hi @sbslee

Thanks for your suggestion. I indeed think the lower python version might be the issue. I will further test this and let you know if this works.

Kind regards Sven

Svenvdm, May 07 '24 08:05

Closed due to inactivity. Please feel free to re-open it if necessary.

sbslee, May 14 '24 08:05