Fail to parse sequences

Open Xiaojun928 opened this issue 3 years ago • 2 comments

Hi,

I ran into trouble when running python get_alignment_and_length_bias.py on about 180 genomes. This is the error:

.Parsing sequences for R2MyF9PMoHcjJAH9
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/site-packages/joblib/parallel.py", line 130, in __call__
    return self.func(*args, **kwargs)
  File "/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/site-packages/joblib/parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/site-packages/joblib/parallel.py", line 72, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/mnt/home-user/software/PopCOGenT/src/PopCOGenT/length_bias_functions.py", line 26, in align_and_calculate_length_bias
    length_bias_file)
  File "/mnt/home-user/software/PopCOGenT/src/PopCOGenT/length_bias_functions.py", line 110, in calculate_length_bias
    g2size)
  File "/mnt/home-user/software/PopCOGenT/src/PopCOGenT/length_bias_functions.py", line 131, in get_transfer_measurement
    s1temp, s2temp = zip(*filtered_blocks)
ValueError: not enough values to unpack (expected 2, got 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/site-packages/joblib/parallel.py", line 140, in __call__
    raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
ValueError                                         Wed Jul  7 14:32:25 2021
PID: 302719    Python 3.6.13: /home-user/miniconda3/envs/PopCOGenT/bin/python
...........................................................................
/home-user/miniconda3/envs/PopCOGenT/lib/python3.6/site-packages/joblib/parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)

I requested 16 threads for this job. The script works when I apply it to other datasets with fewer than 100 genomes, so I am not sure whether the number of genomes matters. Could you please give me some suggestions?

Thanks in advance! Xiaojun

Xiaojun928, Jul 07 '21 08:07

Hello, thanks for opening the issue!

It looks like the filtered_blocks variable may have been empty. Looking at the code, that might mean the alignment was empty. I'm wondering if it's something simple: do you have any genomes in this set that are actually empty or really different from the rest?
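To illustrate the failure mode, here is a minimal sketch that reproduces the same ValueError outside PopCOGenT. It assumes filtered_blocks is a list of two-element pairs; the helper split_blocks and the example strings are hypothetical and are not part of the PopCOGenT code.

```python
def split_blocks(filtered_blocks):
    """Unzip (s1, s2) pairs; return None when there is nothing to unzip.

    Hypothetical helper for illustration only, not PopCOGenT's own code.
    """
    if not filtered_blocks:
        # zip(*[]) yields nothing, so unpacking into two names would fail.
        return None
    s1temp, s2temp = zip(*filtered_blocks)
    return list(s1temp), list(s2temp)


# With at least one pair, the unpacking works as intended.
print(split_blocks([("ACGT", "ACGA"), ("TTGA", "TTGA")]))

# With an empty list, the bare unpacking raises the error seen in the traceback.
try:
    s1temp, s2temp = zip(*[])
except ValueError as err:
    print("ValueError:", err)
```

If the list really is empty for one genome pair, a guard like this would only turn the crash into a skipped pair; it would not explain why that alignment produced no blocks.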

elsherbini, Jul 07 '21 12:07

Hi,

Thanks for the quick response! No genome in my dataset is empty. All of them are simulated SAGs derived from isolate genomes: I randomly introduced variants (~1 per 10 kbp), fragmented some contigs, and randomly removed some sequences to reach a completeness of ~70%. I don't think these modifications would produce really different genomes, since the same method worked well for other simulated species.

As I understand it, divergent genomes would simply be placed in a separate cluster by PopCOGenT, though I may be wrong. Do you think the low completeness or the fragmentation may be the cause?

Thanks and regards, Xiaojun

Xiaojun928, Jul 07 '21 14:07
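One way to check whether fragmentation or low completeness singles out particular genomes is to tabulate contig count and total length per input file and look for outliers. The sketch below uses only the Python standard library; the function names, the "genomes" directory, and the ".fasta" extension are assumptions for illustration and are not taken from PopCOGenT.

```python
from pathlib import Path


def fasta_summary(path):
    """Return (contig_count, total_bases) for one FASTA file."""
    contigs = 0
    bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                contigs += 1  # each header line starts a new contig
            else:
                bases += len(line)  # sequence lines contribute their length
    return contigs, bases


def summarize_directory(genome_dir, extension=".fasta"):
    """Map each genome file name to its (contig_count, total_bases)."""
    files = sorted(Path(genome_dir).glob("*" + extension))
    return {p.name: fasta_summary(p) for p in files}


# Example usage (directory name is hypothetical):
# for name, (contigs, bases) in summarize_directory("genomes").items():
#     print(name, contigs, bases, sep="\t")
```

Genomes with far more contigs or far fewer bases than the rest would be natural candidates to leave out of a test run, to see whether the error disappears.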