TypeError: __init__() missing 1 required positional argument: 'dtype'

Open samd1993 opened this issue 1 year ago • 3 comments

Hi,

Trying to use gemelli as a qiime2 plugin and I get the following error:

/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/preprocessing.py:423: RuntimeWarning: divide by zero encountered in log
  mat = np.log(matrix_closure(matrix_closure(mat) * branch_lengths))
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 2.57 TiB for an array with shape (594002, 594002) and data type float64

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/commands.py", line 478, in __call__
    results = self._execute_action(
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/commands.py", line 539, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-76>", line 2, in phylogenetic_rpca_with_taxonomy
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
    outputs = self._callable_executor_(
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/action.py", line 566, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 88, in phylogenetic_rpca_with_taxonomy
    output = phylogenetic_rpca(table=table,
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 267, in phylogenetic_rpca
    ord_res, dist_res = optspace_helper(rclr_table, fids, table.ids(),
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 514, in optspace_helper
    u, s, v = svd(X)
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/scipy/linalg/_decomp_svd.py", line 127, in svd
    u, s, v, info = gesXd(a1, compute_uv=compute_uv, lwork=lwork,
TypeError: __init__() missing 1 required positional argument: 'dtype'

Plugin error from gemelli:

  __init__() missing 1 required positional argument: 'dtype'

See above for debug info.

Are my input files too big? I am running this on a large microbiome table with thousands of samples, and the job only uses about 50% of the memory I requested on my server. Any help would be appreciated!
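For reference, the 2.57 TiB figure in the error message follows directly from the array shape it reports. A quick check in plain Python (nothing here is gemelli-specific; the only inputs are the shape and dtype quoted in the error):

```python
# Back-of-the-envelope check of the allocation reported in the error message.
n = 594_002            # side length from the error: shape (594002, 594002)
bytes_per_value = 8    # float64 uses 8 bytes per element

total_bytes = n * n * bytes_per_value
total_tib = total_bytes / 2**40   # 1 TiB = 2**40 bytes

print(f"{total_tib:.2f} TiB")     # prints "2.57 TiB"
```

So a dense square float64 array of that size needs far more memory than a typical server allocation, regardless of how much of the requested memory was in use before the call.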

Also here is the original code I ran:

pip install gemelli
qiime dev refresh-cache

qiime gemelli phylogenetic-rpca-with-taxonomy \
    --i-table ~/TOL/minich/GMTOLsong_table2024_N20_f2all_V4.qza \
    --i-phylogeny ~/TOL/minich/GMTOLsong_rooted_tree2024f2.qza \
    --m-taxonomy-file ~/TOL/minich/merged_GMTOL_taxonomy2024_N20all_f2_V4.qza \
    --p-min-feature-count 10 \
    --p-min-sample-count 500 \
    --o-biplot ~/TOL/minich/GMTOLsong_rpca_biplot.qza \
    --o-distance-matrix ~/TOL/minich/GMTOLsong_rpca_distance_matrix.qza \
    --o-counts-by-node-tree ~/TOL/minich/GMTOLsong_rpca_counts_by_node_tree.qza \
    --o-counts-by-node ~/TOL/minich/GMTOLsong_rpca_counts_by_node_phylotable.qza \
    --o-t2t-taxonomy ~/TOL/minich/GMTOLsong_rpca_taxonomy.qza \
    --verbose

Cheers, Sam

samd1993, Sep 19 '24 20:09

Hi @samd1993,

Is your tree large? It may be using too much memory to expand the table. You could try setting --p-min-depth to 1 or larger.

cameronmartino, Sep 26 '24 14:09
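A note on where the square array may come from: the traceback ends in svd(X) from scipy.linalg, called without extra arguments. With the default full_matrices=True, an M x N input yields a U of shape (M, M) and a Vh of shape (N, N), so one dimension of X being 594002 would be enough to request a (594002, 594002) array. The sketch below shows the shape difference on a small stand-in matrix. It uses numpy.linalg.svd, which has the same full_matrices parameter; the matrix sizes are made up for illustration, and whether gemelli's downstream code would work with the reduced form is an assumption that has not been checked.

```python
import numpy as np

# Small stand-in matrix; these sizes are illustrative, not taken from the issue.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

# Default (full_matrices=True): U is square in the number of rows of X.
u_full, s_full, vh_full = np.linalg.svd(X)

# Reduced form: U keeps only min(M, N) columns.
u_thin, s_thin, vh_thin = np.linalg.svd(X, full_matrices=False)

print(u_full.shape)  # (50, 50)
print(u_thin.shape)  # (50, 5)
```

The reduced factors still reconstruct X exactly (up to floating-point error), which is why the full square U is often unnecessary when only the leading components are used.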

Hi @cameronmartino

Sorry for the late reply (I was moving across the country).

I tried --p-min-depth 1 like so:

qiime gemelli phylogenetic-rpca-without-taxonomy \
    --i-table ~/TOL/FINAL_FILES/GMTOL_table2.qza \
    --i-phylogeny ~/TOL/FINAL_FILES/GMTOL_rooted_tree2.qza \
    --p-min-feature-count 10 \
    --p-min-sample-count 500 \
    --o-biplot ~/TOL/FINAL_FILES/GMTOLoldJun12_23_rpca_biplot.qza \
    --o-distance-matrix ~/TOL/FINAL_FILES/GMTOLoldJun12_23_rpca_distance_matrix.qza \
    --o-counts-by-node-tree ~/TOL/FINAL_FILES/GMTOLoldJun12_23_rpca_counts_by_node_tree.qza \
    --o-counts-by-node ~/TOL/FINAL_FILES/GMTOLoldJun12_23_rpca_counts_by_node_phylotable.qza \
    --p-min-depth 1 \
    --verbose

and got the same response (leaving it here in case there are any differences in the output, although it looks identical):

-bash-4.2$ cat output.log
/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/preprocessing.py:423: RuntimeWarning: divide by zero encountered in log
  mat = np.log(matrix_closure(matrix_closure(mat) * branch_lengths))
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 2.57 TiB for an array with shape (594002, 594002) and data type float64

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/commands.py", line 478, in __call__
    results = self._execute_action(
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/q2cli/commands.py", line 539, in _execute_action
    results = action(**arguments)
  File "<decorator-gen-76>", line 2, in phylogenetic_rpca_with_taxonomy
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
    outputs = self._callable_executor_(
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/qiime2/sdk/action.py", line 566, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 88, in phylogenetic_rpca_with_taxonomy
    output = phylogenetic_rpca(table=table,
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 267, in phylogenetic_rpca
    ord_res, dist_res = optspace_helper(rclr_table, fids, table.ids(),
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/rpca.py", line 514, in optspace_helper
    u, s, v = svd(X)
  File "/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/scipy/linalg/_decomp_svd.py", line 127, in svd
    u, s, v, info = gesXd(a1, compute_uv=compute_uv, lwork=lwork,
TypeError: __init__() missing 1 required positional argument: 'dtype'

Plugin error from gemelli:

  __init__() missing 1 required positional argument: 'dtype'

See above for debug info.
Finishing gemelli  job

Should I try larger numbers?

samd1993, Oct 17 '24 01:10

I also ran gemelli on its own and got a similar error with --min-depth 100:

gemelli phylogenetic-rpca \
>   --in-biom export/feature-table.biom \
>   --taxonomy export/taxonomy.tsv \
>   --in-phylogeny export/tree.nwk \
>   --output-dir export \
>   --min-feature-count 100 \
>   --min-sample-count 500 \
>   --min-depth 100
/home/sdegregori/miniconda3/envs/qiime2-2023.7/lib/python3.8/site-packages/gemelli/preprocessing.py:423: RuntimeWarning: divide by zero encountered in log
  mat = np.log(matrix_closure(matrix_closure(mat) * branch_lengths))
zsh: killed     gemelli phylogenetic-rpca --in-biom export/feature-table.biom --taxonomy   

samd1993, Oct 17 '24 23:10