C3POa
Issues with C3POa.py: cannot execute gonk
I'm having issues with the C3POa.py script at the gonk stage. Preprocessing worked out well, but when I run C3POa.py on one of my fastq files, I get this error from bash:
Traceback (most recent call last):
File "/Users/ckim/C3POa/C3POa.py", line 663, in
When I check the gonk_messages to see the error, this is the error reported:
sh: /Users/ckim/C3POa/gonk/gonk: cannot execute binary file
I made sure to get the Go dependency for gonk. I did setup from the instruction at the beginning, but haven't been able to get any farther with the script. Any help is appreciated and happy to give any more information as needed. Thanks!
If you try to execute the gonk binary outside of the script, do you get a message telling you to enter sequences?
I do not, I still get the "cannot execute binary file" error
Try deleting the binary and building manually using go build src/gonk from the base directory.
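A "cannot execute binary file" message commonly means the binary was built for a different operating system or CPU than the one running it. As a rough check before rebuilding, the first four bytes of the file can be compared against well-known executable signatures. This is only a sketch: binary_format is a hypothetical helper, not part of C3POa or gonk, and the table covers just a few common formats.

```python
# Leading bytes ("magic numbers") of a few common executable formats.
# Note: 0xCAFEBABE is also used by Java class files, so that label is a best guess.
MAGIC = {
    b"\x7fELF": "ELF (Linux)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (macOS)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (macOS)",
    b"\xca\xfe\xba\xbe": "Mach-O universal (macOS)",
}


def binary_format(path):
    """Return a label for the executable format at `path`, or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(4)  # only the signature bytes are needed
    return MAGIC.get(head, "unknown")
```

Calling binary_format on the gonk binary and seeing a format that does not match the local machine (for example ELF on a Mac) would point to a binary that has to be rebuilt locally.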
Getting somewhere! That seemed to work out with the gonk issue at least. Now I'm getting a different gonk_messages, but the same bash error as above.
panic: open /Users/ckim/20200213_0143_20200212_ASD_mCh_R2c2test/consensus/: is a directory
goroutine 1 [running]: main.check(...) /Users/ckim/C3POa/gonk/src/gonk.go:32 main.writeScores(0xc00032ef00, 0x404, 0x404) /Users/ckim/C3POa/gonk/src/gonk.go:147 +0x27a main.main() /Users/ckim/C3POa/gonk/src/gonk.go:180 +0x249
The panic makes me think that I'm not assigning the -p flag properly. Not sure if that's the reason for the errors here or something else with any of the inputs going in.
This makes me think that it isn't adding on the filename for the output file correctly. gonk should have been given the path from your C3POa command line arguments (--path or -p). Can I see the C3POa command/bash script that you're using to run this?
In any case, I've updated the gonk source code so that if it's given a directory with no filename (what seems to be happening here), it will automatically add the default output filename ("SW_PARSE.txt"). Please run make clean in your gonk directory and git pull. Then rebuild using make. The original binary didn't work for you because I made the mistake of adding the binary compiled on my computer to the repository. This should hopefully fix the issue.
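The directory fallback described above can be illustrated with a short Python sketch. This is not the actual gonk code (gonk is written in Go); resolve_output is a hypothetical name, and only the default filename "SW_PARSE.txt" comes from the thread.

```python
import os

DEFAULT_NAME = "SW_PARSE.txt"  # default output filename mentioned in the thread


def resolve_output(path, default_name=DEFAULT_NAME):
    """Append the default filename when `path` names a directory.

    A path counts as a directory if it already exists as one or if it
    ends with a path separator (the case that triggered the panic).
    """
    if path.endswith(os.sep) or os.path.isdir(path):
        return os.path.join(path, default_name)
    return path
```

With this logic, a value such as the consensus/ directory from the panic message would resolve to a file inside that directory instead of failing with "is a directory".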
Hi,
I'm having the same issue. I've tried your step of updating gonk.
Attempt with no -o set
python3 C3POa.py -r /Users/tim/C3POa_preprocessing/1/Splint/R2C2_raw_reads.fastq -p ~/ -m NUC.4.4.mat -l 1000 -d 500 -c example_config -t -f
/Users/tim/C3POa_preprocessing/1/Splint/R2C2_raw_reads.fastq
gonk took 0.05568385124206543 seconds to run.
Traceback (most recent call last):
File "C3POa.py", line 663, in
gonk_messages: panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x10ae785]
goroutine 1 [running]: main.writeScores(0xc00105a000, 0x7ac, 0x7ac) /Users/tim/gonk/src/gonk.go:147 +0x55 main.main() /Users/tim/gonk/src/gonk.go:184 +0x249
I've pushed an update to gonk that fixes the error you've been getting. You should be able to give it a directory or a filename. As for how you're running C3POa.py, I would put this into a wrapper bash script (I do this for consistency and parallelization). The example below uses GNU Parallel.
#!/bin/bash
config=$HOME'/config'
c3poa=$HOME'/C3POa/C3POa.py'
matrix=$HOME'/C3POa/NUC.4.4.mat'
# goes to preprocessing directory
path=/path/to/preprocessed/folders/
# what to name the consensus files
cons_out=example_consensus.fasta
# error file
err=$path/err
# number of threads
jobs=32
parallel -j$jobs python3 $c3poa --reads {0}/R2C2_raw_reads.fastq \
--path {0} \
--matrix $matrix \
--config $config \
--output {0}/$cons_out \
2>$err ::: $path/*/*
The asterisks at the end depend on whether you have multiple splints that you demultiplexed with. If each numbered folder contains another folder called Splint or the like, you need both asterisks. If R2C2_raw_reads.fastq sits directly in the numbered folders, change $path/*/* to $path/*. You may need to replace $path with the actual path.
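If it is unclear which glob depth applies, a small Python sketch can list the folders that actually contain a reads file, checking both one and two levels below the preprocessing directory. find_read_dirs is a hypothetical helper written for this thread; the filename R2C2_raw_reads.fastq is taken from the script above.

```python
import glob
import os


def find_read_dirs(base, reads_name="R2C2_raw_reads.fastq"):
    """Return sorted directories one or two levels below `base` holding `reads_name`."""
    hits = []
    # "*" matches base/1/, "*/*" matches base/1/Splint/
    for pattern in ("*", os.path.join("*", "*")):
        for reads in glob.glob(os.path.join(base, pattern, reads_name)):
            hits.append(os.path.dirname(reads))
    return sorted(hits)
```

If every returned directory is two levels deep, $path/*/* is the pattern to use; if they are one level deep, $path/* is.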
Hi,
I would like to use C3POa for my data analysis. In my case preprocessing goes OK. However, when I try to run C3POa I get the following error:
python3 C3POa.py -t -r /home/oscar/Desktop/C3POa/splint1/preprocessed_reads.fastq -p /home/oscar/Desktop/C3POa/splint1/temp -m /home/oscar/Desktop/C3POa/NUC.4.4.mat -l 1000 -d 500 -c configf.txt
Using gonk from your path, not the config file.
/home/oscar/Desktop/C3POa/splint1/preprocessed_reads.fastq
gonk took 0.0009248256683349609 seconds to run.
Traceback (most recent call last):
File "C3POa.py", line 708, in
I have also excluded the -o. Does this error mean that no consensus sequences were identified?
This is a debugging statement that gets run at the very end of the script. Your consensus sequences should be intact. To squelch the message, update C3POa by pulling from the repo.
After the run, the consensus and subreads files are created but are empty. Additionally, I get three extra files: seq1.fasta, seq2.fasta and gonk_messages. The two seq files seem to be partial sequences, and the gonk_messages file says: sh 1: gonk: not found (I am new to bioinformatics, I apologize if it's something simple).
Can you post your config file? Also did you build gonk? This is almost certainly a path issue since everything else is working but gonk isn't running.
Thanks. I have checked the config file, and there was a small mistake in the path. The run is going OK now.
Hi, I am also struggling with the C3POa.py script. I was able to run the preprocessing script without an issue, but now I am having trouble executing gonk. The gonk_messages file contains sh: /gonk/gonk: No such file or directory. I'm running the scripts on a Mac.
There's a good chance you didn't build the package, or the path to it is wrong.
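Both the "gonk: not found" and "No such file or directory" messages in this thread can be checked before launching a run. The sketch below is a hypothetical pre-flight helper, not part of C3POa: it takes the gonk path as a plain string (how that string is read from the config file is left out, since the config format is not shown here) and falls back to searching PATH when no path is given.

```python
import os
import shutil


def check_gonk(configured_path=None):
    """Return (path, problem); exactly one of the two is None."""
    if configured_path:
        path = os.path.expanduser(configured_path)
        if not os.path.isfile(path):
            return None, f"{path} does not exist; build gonk or fix the path"
        if not os.access(path, os.X_OK):
            return None, f"{path} is not executable"
        return path, None
    # No configured path: look for a gonk executable on PATH instead.
    found = shutil.which("gonk")
    if found is None:
        return None, "gonk not found on PATH"
    return found, None
```

A returned problem string corresponds to the two suggestions above: either the binary was never built, or the configured path does not point at it.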