proposedGenes file not found

Open MatteoSchiavinato opened this issue 7 years ago • 5 comments

The program seems unable to find a file that it should generate internally: grep: path/to/file.fa.proposedGenes: No such file or directory

(I replaced the original file path with path/to/file.)

Checking through the log is quite tedious because of the huge number of progress-report lines, so I am not sure whether I missed the error line. What happened here? Is the file not being created because, for some other reason, no proposed genes were found?

MatteoSchiavinato avatar May 23 '17 14:05 MatteoSchiavinato

Hi Matteo,

The logging problem might be to do with carriage return characters not getting written out properly by bash. To see the output more clearly, try running

sed -r "s:.*\r::g" log.txt > log_clean.txt
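
If GNU sed is not to hand, the same cleanup can be sketched in Python. This is only an illustration of what the command above does, assuming each progress update is separated from the next by a carriage return within a line; `strip_progress` and `clean_log_file` are made-up helper names, not part of OrthoFiller.

```python
def strip_progress(text):
    """Keep only what follows the last carriage return on each line.

    Mirrors the effect of: sed -r "s:.*\\r::g"
    """
    cleaned = []
    for line in text.split("\n"):
        # rsplit leaves the line untouched when it has no carriage return.
        cleaned.append(line.rsplit("\r", 1)[-1])
    return "\n".join(cleaned)


def clean_log_file(src_path, dst_path):
    """Read src_path, drop overwritten progress text, write dst_path."""
    # newline="" stops Python translating "\r" into "\n" on read or write.
    with open(src_path, newline="") as src:
        cleaned = strip_progress(src.read())
    with open(dst_path, "w", newline="") as dst:
        dst.write(cleaned)
```

Calling `clean_log_file("log.txt", "log_clean.txt")` would then play the role of the sed redirect.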

Sorry that this issue has arisen: could you please send me the cleaned-up log file and an ls -l of the OrthoFiller working directory, and I’ll get to the bottom of it!

All the best,

Michael

mpdunne avatar May 23 '17 15:05 mpdunne

Thanks for the quick answer!

With your sed command I could read the log much more easily. It's not processing the orthogroups correctly for some reason (memory related?). Here is an example:

Running HMMs on species file.aa.fa
0 out of 228 orthogroups processed
Fatal exception (source file ../../easel/esl_hmm.c, line 196):
malloc of size -308088 failed

(Again, file.aa.fa is a placeholder file name.) I am running it on an HPC with 24 cores and 500 GB of memory.
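
A negative allocation size is often the signature of a size calculation that overflowed a signed 32-bit integer rather than of the machine actually running out of memory, though whether that is what happens inside HMMER here is only a guess. The sketch below shows the wrap-around in isolation; the byte count is picked purely because it is one value that wraps to the number in the log, and `as_int32` is a hypothetical helper.

```python
import ctypes


def as_int32(n_bytes):
    """Reinterpret a byte count as a signed 32-bit integer (wraps on overflow)."""
    return ctypes.c_int32(n_bytes).value


# Assumed request of roughly 4.29 GB: just under 2**32 bytes.
requested = 2**32 - 308088
print(as_int32(requested))  # -308088
```

Any request that differs from this one by a multiple of 2**32 bytes would wrap to the same value, so the log alone does not reveal the true size.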

MatteoSchiavinato avatar May 23 '17 15:05 MatteoSchiavinato

Hi Matteo,

This looks like an HMMER issue that we’re currently investigating: it can sometimes arise with particularly large genomes. May I ask how large the genome FASTA file in question is, and how many chromosomes/scaffolds it has? And is the issue occurring for any of the other genomes?
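
Both numbers can be read straight from the FASTA file. A rough sketch, assuming a plain-text (uncompressed) FASTA with one header line per chromosome or scaffold; `fasta_stats` is a hypothetical helper, not an OrthoFiller function.

```python
def fasta_stats(lines):
    """Return (number of sequences, total residues) for an iterable of FASTA lines."""
    n_seqs = 0
    n_residues = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_seqs += 1  # one header per chromosome/scaffold
        else:
            n_residues += len(line)
    return n_seqs, n_residues
```

With a file on disk this could be called as `fasta_stats(open("genome.fa"))`, where `genome.fa` is a placeholder path; the file is read line by line, so it does not need to fit in memory.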

The latest version of OrthoFiller attempts to circumvent this issue by searching each chromosome individually. This aims to reduce memory usage at the possible expense of computation time: currently it is implemented as the default but in later versions there will be an option to choose. In your case it looks like you’ll want to use this splitting version.
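
To illustrate the general idea of searching one chromosome at a time (this is a sketch of the approach, not the actual OrthoFiller code), the genome can be streamed record by record so that only a single sequence is handed to the search step at once. The names `iter_fasta_records` and `search_per_sequence`, and the `search` callable, are placeholders.

```python
def iter_fasta_records(lines):
    """Yield (header, sequence) pairs, one per chromosome or scaffold."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def search_per_sequence(lines, search):
    """Run `search` on each sequence separately and collect the results.

    `search` stands in for whatever is run per chromosome; it is a
    placeholder, not a real OrthoFiller or HMMER call.
    """
    return [(header, search(seq)) for header, seq in iter_fasta_records(lines)]
```

The trade-off is the one described above: each call sees a smaller input, at the cost of invoking the search once per sequence.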

I’m just double checking this version and once I’ve confirmed things are working as they should I’ll upload it for you to try!

All the best,

Michael

mpdunne avatar May 23 '17 15:05 mpdunne

These are the genome sizes!

2.3G genome_1.fa
1.2G genome_2.fa
2.6G genome_3.fa
2.1G genome_4.fa
2.8G genome_5.fa
1.6G genome_6.fa
2.8G genome_7.fa

Thanks for this chance. I'd like to try it with the option you mentioned.

MatteoSchiavinato avatar May 23 '17 16:05 MatteoSchiavinato

Hi Matteo,

The new version of OrthoFiller with individual chromosome searching is live on GitHub. I was wondering if you’d had a chance to have a go with it?

All the best,

Michael

mpdunne avatar Jul 06 '17 09:07 mpdunne