tally_Cogent_contigs_per_family.py error

Open jodithea opened this issue 4 years ago • 0 comments

I am trialling Cogent on the test_human.fa data as shown on the Wiki. I did not use the reference genome hg38 for the coding genome reconstruction, because I do not have a reference genome for my own data and wanted to check that I could run everything without one.

python /path/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py test_human/ hg38 human_output

The above works well. I used hg38 only as a placeholder (I deliberately do not have the hg38 reference genome in my working directory), because I received errors when I left out the reference genome argument. This produced the expected human_output.family_summary.txt:

gene_family input_size num_Cogent_contigs num_genome_contig genome_cov genome_acc genome_chimeric genome_contigs
test_human/1_0 9 4 0 0.00 0.00 False
test_human/0_0 10 1 0 0.00 0.00 False

However, when I include blastn in this step:

python /path/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py test_human/ hg38 human_output --blastn in.fa.blastn

I receive the error:

Traceback (most recent call last):
  File "/apps/unit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py", line 62, in <module>
    main(args.cogent_dir, args.genome, args.output_prefix, args.genome2, args.blastn_filename)
  File "/apps/unit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py", line 41, in main
    sp.tally_for_a_Cogent_dir(d, writer1, writer2, genome1, genome2, blastn_filename)
  File "/hpcshare/appsunit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_results.py", line 191, in tally_for_a_Cogent_dir
    best_of = read_blastn(os.path.join(dirname, blastn_filename), qlen_dict)
  File "/hpcshare/appsunit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_results.py", line 44, in read_blastn
    if e < best_of[seqid]: best_of[seqid] = (e, name)
TypeError: '<' not supported between instances of 'float' and 'tuple'

And the human_output.family_summary.txt is empty, but does now include the blastn headers:

gene_family input_size num_Cogent_contigs num_genome_contig genome_cov genome_acc genome_chimeric genome_contigs num_blastn blastn_best

I checked, and the in.fa.blastn file is present in each of the Cogent directories. Do you have any suggestions as to what could be going wrong here?
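Reading the traceback, one possible cause is that line 44 of tally_Cogent_results.py compares the e-value `e` (a float) directly against the stored `(e, name)` tuple, which Python 3 rejects. The sketch below reproduces that comparison outside Cogent and shows one way it could be written instead. This is only an assumption based on the single line shown in the traceback: `pick_best_hits` and `reproduces_type_error` are hypothetical names, and the `(seqid, e, name)` input format is made up for illustration rather than taken from Cogent's actual `read_blastn`.

```python
def pick_best_hits(hits):
    """Keep the lowest e-value hit per query sequence.

    hits: iterable of (seqid, evalue, name) tuples (hypothetical format).
    Returns a dict mapping seqid -> (evalue, name).
    """
    best_of = {}
    for seqid, e, name in hits:
        # Compare against the stored e-value (index 0), not the whole tuple.
        if seqid not in best_of or e < best_of[seqid][0]:
            best_of[seqid] = (e, name)
    return best_of


def reproduces_type_error():
    """Return True if comparing a float with a tuple raises TypeError."""
    best_of = {"seq1": (1e-5, "hitA")}
    try:
        # Same shape as the comparison on line 44 of the traceback.
        1e-10 < best_of["seq1"]
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    hits = [("seq1", 1e-5, "hitA"), ("seq1", 1e-10, "hitB"), ("seq2", 0.01, "hitC")]
    print(pick_best_hits(hits))
    print(reproduces_type_error())
```

If this is indeed the cause, the empty family summary would be consistent with the script stopping at the first family, after the header row had already been written.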

Thank you!

jodithea, Jan 12 '21 01:01