RUFUS
Docker Image
Is there a Docker image available for RUFUS?
Sorry, no, we don’t have a Docker image yet.
Andrew Farrell PhD
Director of Research and Science
Department of Human Genetics
USTAR Center for Genetic Discovery
Eccles Institute of Human Genetics
University of Utah School of Medicine
15 North 2030 East, Room 7140
Salt Lake City, UT 84112-5330
Email: [email protected]
http://marthlab.org/
I haven't heard back about building from source (#18) for two weeks now, so I've tried to build a Docker image.
I'm getting errors related to CMakeLists.txt but I'm not sure how to fix them:
Dockerfile
FROM ubuntu:latest
LABEL \
version="1.0.0" \
description="RUFUS image for use in Workflows"
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && \
apt-get install -y apt-utils \
libz-dev \
zlib1g-dev \
libbz2-dev \
liblzma-dev \
bc \
libncurses5-dev \
git \
build-essential \
g++ \
python \
gcc \
mono-mcs \
wget \
cmake
RUN mkdir -p /opt/tools
WORKDIR /opt/tools
# Download RUFUS (release V${VERSION})
ENV VERSION 1.0
ENV NAME RUFUS
ENV URL "https://github.com/jandrewrfarrell/RUFUS/archive/V${VERSION}.tar.gz"
RUN wget -q -O - $URL | tar -zxv && \
cd ${NAME}-${VERSION} && \
mkdir build && cd build && \
cmake ../ -DCMAKE_C_COMPILER=$(which gcc) -DCMAKE_CXX_COMPILER=$(which g++) && \
make
docker build
sudo docker build -t rufus-v1.0 .
Most of the build seems to go okay, until the bwa part, where I get the following error (truncated for convenience):
Error:
CMake Error: The source directory "/opt/tools/RUFUS-1.0/build/external" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
make[2]: *** [externals/CMakeFiles/rufus_external_project.dir/build.make:111: externals/rufus_external_project-prefix/src/rufus_external_project-stamp/rufus_external_project-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:169: externals/CMakeFiles/rufus_external_project.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
The command '/bin/sh -c wget -q -O - $URL | tar -zxv && cd ${NAME}-${VERSION} && mkdir build && cd build && cmake ../ -DCMAKE_C_COMPILER=$(which gcc) -DCMAKE_CXX_COMPILER=$(which g++) && make' returned a non-zero code: 2
Any advice would be greatly appreciated.
Seeking outside expertise: https://stackoverflow.com/questions/65948810/cmake-error-the-source-directory-does-not-appear-to-contain-cmakelists-txt-for
Sorry I missed that email, things have been crazy here with Covid stuff. The build definitely needs internet access, sorry. We pull a lot of the dependencies straight from GitHub when we’re building. I would recommend just building on a similar system with an internet connection, then zipping the whole RUFUS dir and moving it to the secure system.
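The "build elsewhere, then move the whole directory" advice above could be sketched roughly as follows. The helper name `pack_rufus_build` and the example paths are hypothetical, not part of RUFUS. Absolute paths may be baked in at build time (the ldd output later in this thread resolves libjellyfish to /RUFUS/...), so unpacking to the same path on the target system is the safer assumption.

```shell
# Hypothetical helper: archive an already-built RUFUS tree for offline transfer.
# $1 = path to the built RUFUS directory, $2 = output tarball path.
pack_rufus_build() {
    src_dir=$1
    out_tar=$2
    if [ ! -d "$src_dir" ]; then
        echo "not a directory: $src_dir" >&2
        return 1
    fi
    # -C makes the archive paths relative to the parent directory,
    # so the tarball unpacks as RUFUS/... wherever it is extracted.
    tar -czf "$out_tar" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Usage on the connected machine (placeholder paths):
#   pack_rufus_build "$HOME/RUFUS" "$HOME/rufus-built.tar.gz"
# Then copy the tarball to the offline system and unpack it there:
#   tar -xzf rufus-built.tar.gz
```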
I would recommend just building on a similar system with internet connection, then zipping the whole RUFUS dir and moving it to the secure system.
Can I use a version of gcc > 4.9.2? Or can it only be built on gcc-4.9.2?
Greater than 4.9.2 should be fine but I haven’t tested it extensively.
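To see where a given build machine stands relative to that 4.9.2 baseline, a small version comparison can help. This is only a sketch: `version_ge` is a made-up helper name, and it assumes a `sort` that supports `-V` (GNU coreutils). Note that newer gcc releases may print only the major version for `-dumpversion`, which still compares correctly here.

```shell
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Compare the local compiler, if any, against the 4.9.2 baseline.
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    if version_ge "$gcc_version" 4.9.2; then
        echo "gcc $gcc_version is at least 4.9.2"
    else
        echo "gcc $gcc_version is older than 4.9.2"
    fi
fi
```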
I have created a Dockerfile for RUFUS as pull request https://github.com/jandrewrfarrell/RUFUS/pull/20. I had no issues with the build process when using the dependencies outlined in the README.
Hi @kohrar, I really appreciate the attempt. Could you please provide some instructions for how you would run it (Docker is fine; I can figure out how to run Singularity from that)?
I pulled #20, built a container from your Dockerfile and pushed it to moldach686/rufus-v1.0.
$ git clone https://github.com/kohrar/RUFUS.git
$ cd RUFUS
$ sudo docker build -t rufus-v1.0 .
$ sudo docker tag rufus-v1.0:latest moldach686/rufus-v1.0:latest
$ sudo docker push moldach686/rufus-v1.0:latest
Next, I build a Singularity image within the RUFUS directory:
$ sudo singularity build rufus.sif docker://moldach686/rufus-v1.0:latest
$ ll
total 356576
drwxrwxr-x 8 mtg mtg 4096 Feb 9 20:51 ./
drwxrwxr-x 12 mtg mtg 4096 Feb 9 21:38 ../
-rw-rw-r-- 1 mtg mtg 614 Feb 9 18:26 CMakeLists.txt
-rw-rw-r-- 1 mtg mtg 757 Feb 9 18:26 Dockerfile
drwxrwxr-x 3 mtg mtg 4096 Feb 9 18:26 externals/
drwxrwxr-x 8 mtg mtg 4096 Feb 9 18:26 .git/
-rw-rw-r-- 1 mtg mtg 126 Feb 9 18:26 .gitignore
-rw-rw-r-- 1 mtg mtg 6289 Feb 9 18:26 README.md
drwxrwxr-x 4 mtg mtg 4096 Feb 9 18:26 resources/
-rwxr-xr-x 1 mtg mtg 364978176 Feb 9 20:51 rufus.sif*
-rwxrwxr-x 1 mtg mtg 37610 Feb 9 18:26 runRufus.sh*
drwxrwxr-x 3 mtg mtg 12288 Feb 9 18:26 scripts/
drwxrwxr-x 5 mtg mtg 4096 Feb 9 18:26 src/
drwxrwxr-x 2 mtg mtg 4096 Feb 9 18:26 testRun/
Then I tar this folder and transfer it to the cluster where I need to use Singularity:
$ cd .. && tar czvf rufus.tar.gz RUFUS/
$ scp <command>
Finally, I try running the following command:
[ -d rufus-analysis ] || mkdir rufus-analysis && cd rufus-analysis && \
singularity -s exec \
-B /project/M-mtgraovac182840/matthew/tool-testing/MTG_oldPipeScript/alignment/bwa/:/usr/lib/locale/ \
-B /project/M-mtgraovac182840/indexes/GRCh37/:/usr/lib/locale/index/ \
-B /project/M-mtgraovac182840/tools/RUFUS/:/usr/lib/locale/RUFUS/ \
/project/M-mtgraovac182840/tools/RUFUS/rufus.sif /usr/lib/locale/RUFUS/runRufus.sh \
-s /usr/lib/locale/proband_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/mom_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/dad_bwaMEM_sort_dedupped.bam \
-t 2 \
--kmersize 25 \
--ref=/usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
However, I'm getting the following error:
/usr/lib/locale/RUFUS/scripts/RunJellyForRUFUS.sh: line 29: /usr/lib/locale/RUFUS/scripts/..//bin/externals/jellyfish/src/jellyfish_project/bin/jellyfish: No such file or directory
Hi Matthew,
I've only done a cursory test with RUFUS after building the image and then running it with docker run --rm -ti rufus bash. I've tried this with Singularity, but since the root filesystem is read-only, I had to copy /RUFUS out to a different location or use bind mounts as you've attempted.
Regarding your issue, it looks like you're running RUFUS from a bind mount at /usr/lib/locale/RUFUS sourced from /project/M-mtgraovac182840/tools/RUFUS/ rather than using the compiled binaries within the container at /RUFUS.
Did you copy the entire /RUFUS directory out of the container to this location? Can you verify that jellyfish does indeed exist in your project directory at /project/M-mtgraovac182840/tools/RUFUS/bin/externals/jellyfish/src/jellyfish_project/bin/jellyfish?
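One way to answer that last question is a quick existence and permission check on the binary. The helper name `check_tool` and the `RUFUS_ROOT` variable below are illustrative only; the relative path to jellyfish is the one quoted in the error message.

```shell
# Report whether a tool exists and is executable at the given path.
check_tool() {
    tool_path=$1
    if [ -x "$tool_path" ] && [ ! -d "$tool_path" ]; then
        echo "ok: $tool_path"
    elif [ -e "$tool_path" ]; then
        echo "present but not executable: $tool_path"
        return 1
    else
        echo "missing: $tool_path"
        return 1
    fi
}

# Usage (RUFUS_ROOT is a placeholder for wherever the RUFUS tree lives):
#   check_tool "$RUFUS_ROOT/bin/externals/jellyfish/src/jellyfish_project/bin/jellyfish"
```

If the file is reported as missing, the tree at that location was most likely never built (for example, a fresh git checkout rather than a copy of /RUFUS from the container).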
it looks like you're running RUFUS from a bind mount at /usr/lib/locale/RUFUS sourced from /project/M-mtgraovac182840/tools/RUFUS/ rather than using the compiled binaries within the container at /RUFUS
So this part of my call was wrong:
/project/M-mtgraovac182840/tools/RUFUS/rufus.sif /usr/lib/locale/RUFUS/runRufus.sh \
However, when I change it to the following I still get an error:
singularity -s exec \
-B /project/M-mtgraovac182840/matthew/tool-testing/MTG_oldPipeScript/alignment/bwa/:/usr/lib/locale/ \
-B /project/M-mtgraovac182840/indexes/GRCh37/:/usr/lib/locale/index/ /project/M-mtgraovac182840/tools/rufus.sif \
./RUFUS/runRufus.sh \
-s /usr/lib/locale/proband_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/mom_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/dad_bwaMEM_sort_dedupped.bam \
-t 2 \
--kmersize 25 \
--ref=/usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
FATAL: stat /home/moldach/RUFUS/runRufus.sh: no such file or directory
It's almost as if I cannot see inside the container.
Let's take a look inside the Docker image:
Can see contents of container in Docker!
$ sudo docker run --rm -it rufus ls
RUFUS boot etc lib media opt root sbin sys usr
bin dev home lib64 mnt proc run srv tmp var
Cannot see contents of container in Singularity?
Here it is showing the contents of my $PWD and not the contents of the container:
$ singularity exec rufus.sif ls
Desktop Public mom_bwaMEM_sort_dedupped.bam.generator.Jhash.temp
Documents Templates mom_bwaMEM_sort_dedupped.bam.generator.fq
Downloads Videos proband_bwaMEM_sort_dedupped.bam.generator
Music dad_bwaMEM_sort_dedupped.bam.generator
Pictures mom_bwaMEM_sort_dedupped.bam.generator
One issue is that I need to supply -B $PWD and prefix the /RUFUS subdirectory, like so: $ singularity exec -B $PWD rufus.sif $PWD/RUFUS/runRufus.sh.
However, I'm getting a libjellyfish-2.0.so.2 error when I try to run the command:
$ singularity -s exec -B /project/M-mtgraovac182840/matthew/tool-testing/MTG_oldPipeScript/alignment/bwa/:/usr/lib/locale/ -B /project/M-mtgraovac182840/indexes/GRCh37/:/usr/lib/locale/index/ -B $PWD /project/M-mtgraovac182840/tools/rufus.sif $PWD/RUFUS/runRufus.sh -s /usr/lib/locale/proband_bwaMEM_sort_dedupped.bam -c /usr/lib/locale/mom_bwaMEM_sort_dedupped.bam -c /usr/lib/locale/dad_bwaMEM_sort_dedupped.bam -t 2 --kmersize 25 --ref=/usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
checking for samtools
/usr/bin/samtools
samtools found
_arg_fastqA =
_arg_fastqB =
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Final reference path being used is /usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
Final bwa reference path being used is /usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
proband extension is bam
you provided the proband cram file /usr/lib/locale/proband_bwaMEM_sort_dedupped.bam
parent file name is mom_bwaMEM_sort_dedupped.bam
parent file extension name is bam
You provided the control bam file /usr/lib/locale/mom_bwaMEM_sort_dedupped.bam
parent file name is dad_bwaMEM_sort_dedupped.bam
parent file extension name is bam
You provided the control bam file /usr/lib/locale/dad_bwaMEM_sort_dedupped.bam
~~~~~~~~~~~~ printing out paramater values used in script ~~~~~~~~~~~~~~~~
value of ProbandGenerator proband_bwaMEM_sort_dedupped.bam.generator
Value of ParentGenerators:
mom_bwaMEM_sort_dedupped.bam.generator
dad_bwaMEM_sort_dedupped.bam.generator
Value of K is: 25
Value of Threads is: 2
value of ref is: /usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
value of min is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did not provide refHash
$_arg_min is empty
_arg_min is
MutantMinCov is
parent is mom_bwaMEM_sort_dedupped.bam.generator
parent is dad_bwaMEM_sort_dedupped.bam.generator
Running jellyfish for mom_bwaMEM_sort_dedupped.bam.generator
/project/M-mtgraovac182840/tools/RUFUS/bin/externals/jellyfish/src/jellyfish_project/bin/jellyfish: error while loading shared libraries: libjellyfish-2.0.so.2: cannot open shared object file: No such file or directory
Hi @kohrar
I've only done a cursory test with RUFUS after building the image and then running it with docker run --rm -ti rufus bash
Did you not try to run /testRun/runTest.sh within the Docker container to verify that it runs?
I've tried the following but there appear to be issues:
(base) mtg@mtg-ThinkPad-P53:~/DOCKER-CONTAINERS/RUFUS$ sudo docker run --rm -it rufus-v1.0
root@8f459380a65e:/# ls
RUFUS boot etc lib media opt root sbin sys usr
bin dev home lib64 mnt proc run srv tmp var
root@8f459380a65e:/# bash RUFUS/testRun/runTest.sh
RUFUS/testRun/runTest.sh: line 1: ./../runRufus.sh: No such file or directory
You need to be in the directory RUFUS/testRun/ to run the test. Sorry about that, I should make that clearer in the documentation.
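Because runTest.sh calls ./../runRufus.sh with a relative path, it only works when the current directory is testRun. A small wrapper along these lines would make that independent of where you start; `run_from_own_dir` is a hypothetical name, and the subshell is there so the caller's working directory is left alone.

```shell
# Run a script from inside its own directory so that relative paths
# such as ./../runRufus.sh resolve. The parentheses start a subshell,
# so the caller's working directory is not changed.
run_from_own_dir() {
    script=$1
    shift
    ( cd "$(dirname "$script")" && sh "./$(basename "$script")" "$@" )
}

# Usage inside the Docker container (path as shown in the thread):
#   run_from_own_dir /RUFUS/testRun/runTest.sh
```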
Hi @moldach,
I ran the runTest script without issue within the Singularity image. I am not mounting an external version of RUFUS into the image as you are doing, which I think is leading to your issues. See my usage below:
% singularity run -H `pwd` /global/software/singularity/images/software/rufus.sif
Singularity> cd /
Singularity> ls
RUFUS bin boot bulk dev environment etc home lib lib64 media mnt opt proc root run sbin singularity srv sys tmp usr var
## Here, I copy RUFUS out to somewhere where I can write to as / in singularity is read-only.
Singularity> cp -r RUFUS /tmp
Singularity> cd /tmp/RUFUS/testRun/
Singularity> sh runTest.sh
checking for samtools
/usr/bin/samtools
samtools found
...
MutantMinCov is
parent is Mother.bam.generator
parent is Father.bam.generator
Running jellyfish for Mother.bam.generator
...
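The copy step in the session above (needed because / is read-only inside the Singularity image) could be wrapped so that repeated runs reuse an existing copy. This is a sketch only; `stage_writable_copy` is a hypothetical helper, not something shipped with RUFUS.

```shell
# Copy a read-only tool tree into a writable parent directory unless a
# copy is already there, then print the path of the writable copy.
stage_writable_copy() {
    src=$1
    dest_parent=$2
    dest="$dest_parent/$(basename "$src")"
    if [ ! -d "$dest" ]; then
        cp -r "$src" "$dest_parent"/ || return 1
    fi
    echo "$dest"
}

# Usage inside the container, mirroring the session above:
#   work_dir=$(stage_writable_copy /RUFUS /tmp) && cd "$work_dir/testRun" && sh runTest.sh
```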
Regarding your library issue, this is what should be loaded. Everything is within the image and not from some external bind mount.
Singularity> ldd /tmp/RUFUS/bin/externals/jellyfish/src/jellyfish_project/bin/jellyfish
linux-vdso.so.1 => (0x00007fff654f9000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f91c17dd000)
libjellyfish-2.0.so.2 => /RUFUS/bin/externals/jellyfish/src/jellyfish_project/lib/libjellyfish-2.0.so.2 (0x00007f91c15b2000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f91c1230000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f91c0f27000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f91c0d11000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f91c0947000)
/lib64/ld-linux-x86-64.so.2 (0x00007f91c19fa000)
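To spot the failing case quickly, the ldd output can be filtered for entries the loader could not resolve. The helper name `unresolved_libs` is made up for this sketch; it only assumes the usual "name => not found" form that ldd prints for missing libraries.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd-style output on stdin so it can sit at the end of a pipeline.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (the path is a placeholder for the jellyfish binary being tested):
#   ldd /path/to/jellyfish | unresolved_libs
```

If libjellyfish-2.0.so.2 shows up in that list, the binary is probably being run from a tree where /RUFUS does not exist. One possible workaround, offered here as an assumption rather than a tested fix, is to point LD_LIBRARY_PATH at the jellyfish_project/lib directory of the copied tree.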
Hello @kohrar thank you very much for providing that detailed explanation and reproducible example!
I am not mounting an external version of RUFUS into the image as you are doing
-H $PWD turned out to be critical for this.
I'm fairly new to using Singularity and in the past, for example with Manta (and DeepVariant), I was using -B $PWD when faced with a function parameter that asked for --runDir:
singularity exec \
-B /project/M-mtgraovac182840/matthew/tool-testing/MTG_human_genomics_pipeline-master/alignment/bwa/:/bams,/project/M-mtgraovac182840/indexes/GRCh37/:/reference \
-B $PWD \
/project/M-mtgraovac182840/tools/manta-1.6.0.img \
/manta/bin/configManta.py \
--bam /bams/proband_bwaMEM_sort_dedupped.bam \
--referenceFasta /reference/Homo_sapiens.GRCh37.dna.toplevel.fa \
--runDir $PWD
So, the fact that RUFUS doesn't ask for an output directory (and instead writes to $PWD) caused the -B $PWD solution to fail.
As a word of caution for others, running the test on the singularity container failed for me; however - and luckily - it does work on my data 🥳
Working solution
singularity -s exec \
-B /project/M-mtgraovac182840/matthew/tool-testing/MTG_oldPipeScript/alignment/bwa/:/usr/lib/locale/ \
-B /project/M-mtgraovac182840/indexes/GRCh37/:/usr/lib/locale/index/ \
-H `pwd` \
/project/M-mtgraovac182840/tools/rufus.sif \
./RUFUS/runRufus.sh \
-s /usr/lib/locale/proband_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/mom_bwaMEM_sort_dedupped.bam \
-c /usr/lib/locale/dad_bwaMEM_sort_dedupped.bam \
-t 2 \
--kmersize 25 \
--ref=/usr/lib/locale/index/Homo_sapiens.GRCh37.dna.toplevel.fa
Error on testRun
## test asks for 40 cores but we will just ask for 30
$ salloc --time=0:30:0 --mem-per-cpu=5000 --cpus-per-task=30
$ singularity run -H `pwd` rufus.sif
Singularity> cd /
Singularity> cp -r RUFUS /tmp
Singularity> cd /tmp/RUFUS/testRun/
Singularity> sh runTest.sh
checking for samtools
/usr/bin/samtools
samtools found
_arg_fastqA =
_arg_fastqB =
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Final reference path being used is /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
Final bwa reference path being used is /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
proband extension is bam
you provided the proband cram file /tmp/RUFUS/testRun/Child.bam
parent file name is Mother.bam
parent file extension name is bam
You provided the control bam file /tmp/RUFUS/testRun/Mother.bam
parent file name is Father.bam
parent file extension name is bam
You provided the control bam file /tmp/RUFUS/testRun/Father.bam
~~~~~~~~~~~~ printing out paramater values used in script ~~~~~~~~~~~~~~~~
value of ProbandGenerator Child.bam.generator
Value of ParentGenerators:
Mother.bam.generator
Father.bam.generator
Value of K is: 25
Value of Threads is: 40
value of ref is: /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
value of min is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Did not provide refHash
$_arg_min is empty
_arg_min is
MutantMinCov is
parent is Mother.bam.generator
parent is Father.bam.generator
Running jellyfish for Mother.bam.generator
Running jellyfish for Father.bam.generator
Running jellyfish for Child.bam.generator
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
min not provided, building model
staring model
Call is histoFile HS ReadLength Threads
Parent File open - Child.bam.generator.Jhash.histo
first line = 0 - 0
getting another
got 1 - 0
getting another
got 2 - 1089
going with 2 - 1089
Number of reads = 3630
I = 0 0
I = 1 1089
I = 2 107
I = 3 31
I = 4 44
I = 5 21
I = 6 27
I = 7 54
I = 8 129
I = 9 98
SC = 25 vlaue = 1135
stdi = 31 stdev = 6
best error is 1/x^3.34726
On 1 pass
best Factor = 0 steps = 12
bestSC = 27.4356 steps = 4
best StdDev = 6.44494 steps = 3
best skew factor = 0 steps = 0
best Power factor = 1 steps = 3
On 2 pass
best Factor = 0 steps = 12
bestSC = 26.9935 steps = 5
best StdDev = 6.36922 steps = 3
best skew factor = 0 steps = 0
best Power factor = 1 steps = 3
On 3 pass
best Factor = 0 steps = 12
bestSC = 27.0027 steps = 5
best StdDev = 6.6012 steps = 3
best skew factor = 0 steps = 0
best Power factor = 1 steps = 3
Best Model is SC = 27.0027 StdDev = 6.6012 F = 0 skew = 0 bestP = 1
GenomeSize = 31411.6
prob not error = 0.0116117
prob not error = 0.136157
prob not error = 0.441105
prob not error = 0.722617
prob not error = 0.871386
prob not error = 0.937632
this one
GenomeSize = 31411.6
Inflection point = 3
Recomended RUFUS cutoff = -6.00331
-1std = 20.4015 -2std = 13.8003 -3std = 7.1991 -4std = 0.597893
done with model
mutant min coverage from generated model is 5
mutant SC coverage from generated model is 25
MaxHashDepth = 125
made it
made it here
starting RUFUS filter
_arg_fastqA =
_arg_fastqB =
running this one
Call is PreBuiltMutHash Mutant.Mate1.fq Mutant.Mate2.fq firstpassfile hashsize MinQ HashCountThreshold threads
VM: 19756; RSS: 2564
Paramaters are:
PreBuiltMutHash = Child.bam.generator.k25_c5.HashList
Mutant.mate1.fq = Child.bam.generator.temp.mate1.fastq
Mutant.mate2.fq = Child.bam.generator.temp.mate2.fastq
out stub = Child.bam.generator
HashSize = 25
MinQ = 13
HashCountThreshold = 1
Threads = 38
Parent File open - Child.bam.generator.k25_c5.HashList
MutFile.mate1 is Child.bam.generator.temp.mate1.fastq
here
##File Opend
MutFile.mate2 is Child.bam.generator.temp.mate2.fastq
##File Opend
Reading in pre-built hash talbe
starting
Reading in MutHashFile
Done Hash Files
Mutations Hash size is 74
I am using 2564
VM: 19756; RSS: 2564; maxVM: 19756; maxRSS: 2564
Starting Search
Read in 2040 lines: Found 20 Reads per sec = 6.16048e-11
Done running RUFUS.Filter.cpp
skipping fastp fix
sort: invalid option -- 'T'
sort: invalid option -- 'O'
samblaster: Version 0.1.26
samblaster: Inputting from stdin
samblaster: Outputting to stdout
open: No such file or directory
[bam_sort_core] fail to open file Child.bam.generator.Mutations.fastq
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 40 sequences (6040 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 20, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (274, 298, 334)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (154, 454)
[M::mem_pestat] mean and std.dev: (297.37, 36.62)
[M::mem_pestat] low and high boundaries for proper pairs: (94, 514)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 40 reads in 0.009 CPU sec, 0.004 real sec
samblaster: Loaded 2 header sequence entries.
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "Child.bam.generator.Mutations.fastq.bam".
ERROR: BWA failed on Child.bam.generator.Mutations.fastq. Either the files are exactly the same of something went wrong in previous step
## Not empty files, just really small....
Singularity> ls -sh
total 11M
368K Child.bam 0 Child.bam.generator.temp.mate2.fastq
4.0K Child.bam.generator 272K Father.bam
4.0K Child.bam.generator.Jelly.chr 4.0K Father.bam.generator
200K Child.bam.generator.Jhash 4.0K Father.bam.generator.Jelly.chr
68K Child.bam.generator.Jhash.histo 188K Father.bam.generator.Jhash
9.1M Child.bam.generator.Jhash.histo.7.7.dist 68K Father.bam.generator.Jhash.histo
12K Child.bam.generator.Jhash.histo.7.7.model 364K Mother.bam
0 Child.bam.generator.Jhash.histo.7.7boom.prob 4.0K Mother.bam.generator
8.0K Child.bam.generator.Mutations.Mate1.fastq 4.0K Mother.bam.generator.Jelly.chr
8.0K Child.bam.generator.Mutations.Mate2.fastq 192K Mother.bam.generator.Jhash
0 Child.bam.generator.Mutations.fastq.bam 68K Mother.bam.generator.Jhash.histo
4.0K Child.bam.generator.filter.chr 4.0K clean.sh
4.0K Child.bam.generator.k25_c5.HashList 4.0K mer_counts_merged.jf
0 Child.bam.generator.temp 4.0K runDevTest.sh
0 Child.bam.generator.temp.mate1.fastq 4.0K runTest.sh
Singularity> exit
Thanks again for all the help!
I see a couple of errors you’ll want to fix. There seems to be a problem with your Perl install; I’ve never seen these warnings and I’m not sure what could be causing them. RUFUS appears to go right past them, though, and the downstream steps seem to be OK. The real problem that’s killing your run is an issue with samtools. RUFUS checks that samtools is there, and it appears to be, but when calling samtools sort you’re getting the errors “sort: invalid option -- 'T'” and “sort: invalid option -- 'O'”. I’ve seen this happen when using a very old version of samtools or when you don’t have samtools installed. Check that samtools runs when you type “samtools” on the command line, and check which version you have installed.
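A quick way to do that version check is sketched below. It assumes that running samtools with no arguments prints a usage banner containing a "Version:" line, which is how both the old 0.1.x releases and the 1.x series behave as far as I know; `samtools_major` is a hypothetical helper name. The -T and -O options to samtools sort are, to my knowledge, only available in the 1.x series.

```shell
# Extract the major version number from a samtools "Version:" banner line
# read on stdin (for example "Version: 0.1.19-96b5f2294a" gives 0).
samtools_major() {
    sed -n 's/^Version:[[:space:]]*\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Inspect the samtools on PATH, if there is one.
if command -v samtools >/dev/null 2>&1; then
    major=$(samtools 2>&1 | samtools_major)
    case "$major" in
        "") echo "could not determine the samtools version" ;;
        0)  echo "samtools 0.x found: likely too old for 'sort -T/-O'" ;;
        *)  echo "samtools major version $major found" ;;
    esac
else
    echo "samtools is not on PATH"
fi
```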
Perl error (printed three times in the log):
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
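A likely explanation for this warning, stated here as an assumption rather than something confirmed in the thread, is that Singularity passes the host's LANG (en_CA.UTF-8 in the log) into a container where that locale is not installed. Forcing the "C" locale for the run usually silences it. The wrapper name `with_c_locale` is hypothetical.

```shell
# Run a command with the locale forced to "C", which is always available,
# so Perl does not complain about a locale missing inside the container.
with_c_locale() {
    env LC_ALL=C "$@"
}

# Usage inside the container:
#   with_c_locale sh runTest.sh
```

Since RUFUS carried on past these warnings in the log above, this is cosmetic and separate from the samtools failure.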
Not empty files, just really small....
Singularity> ls -sh
total 11M
368K Child.bam
4.0K Child.bam.generator
4.0K Child.bam.generator.Jelly.chr
200K Child.bam.generator.Jhash
68K Child.bam.generator.Jhash.histo
9.1M Child.bam.generator.Jhash.histo.7.7.dist
12K Child.bam.generator.Jhash.histo.7.7.model
0 Child.bam.generator.Jhash.histo.7.7boom.prob
8.0K Child.bam.generator.Mutations.Mate1.fastq
8.0K Child.bam.generator.Mutations.Mate2.fastq
0 Child.bam.generator.Mutations.fastq.bam
4.0K Child.bam.generator.filter.chr
4.0K Child.bam.generator.k25_c5.HashList
0 Child.bam.generator.temp
0 Child.bam.generator.temp.mate1.fastq
0 Child.bam.generator.temp.mate2.fastq
272K Father.bam
4.0K Father.bam.generator
4.0K Father.bam.generator.Jelly.chr
188K Father.bam.generator.Jhash
68K Father.bam.generator.Jhash.histo
364K Mother.bam
4.0K Mother.bam.generator
4.0K Mother.bam.generator.Jelly.chr
192K Mother.bam.generator.Jhash
68K Mother.bam.generator.Jhash.histo
4.0K clean.sh
4.0K mer_counts_merged.jf
4.0K runDevTest.sh
4.0K runTest.sh
Singularity> exit
Thanks again for all the help!
So the job ran for 24 hours but failed: Slurm Job_id=2711 Name=rufus_test Ended, Run time 1-00:08:45, COMPLETED, ExitCode 0
sort: invalid option -- 'T'
sort: invalid option -- 'O'
open: No such file or directory
[bam_sort_core] fail to open file proband_bwaMEM_sort_dedupped.bam.generator.Mutations.fastq
samblaster: Version 0.1.26
samblaster: Inputting from stdin
samblaster: Outputting to stdout
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2000000 sequences (300000000 bp)...
[M::process] read 2000000 sequences (300000000 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (22, 416954, 29, 16)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (411, 973, 1348)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 3222)
[M::mem_pestat] mean and std.dev: (830.10, 447.09)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 4159)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (397, 461, 534)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (123, 808)
[M::mem_pestat] mean and std.dev: (467.16, 106.23)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 945)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (405, 1253, 1870)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 4800)
[M::mem_pestat] mean and std.dev: (1221.37, 1039.51)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 6265)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (534, 950, 1892)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 4608)
[M::mem_pestat] mean and std.dev: (1187.62, 956.20)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 5966)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 2000000 reads in 1126.856 CPU sec, 41.206 real sec
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "proband_bwaMEM_sort_dedupped.bam.generator.Mutations.fastq.bam".
There seems to be an error with your perl install, ... The real problem that's killing your run is that you seem to have an issue with samtools.
What isn't clear to me is which perl and samtools versions it is trying to use: the tools inside the RUFUS container, or the versions I installed on my system?
Let's check:
$ singularity run \
-H `pwd` \
/project/M-mtgraovac182840/tools/rufus.sif perl --version
This is perl 5, version 22, subversion 1 (v5.22.1) built for x86_64-linux-gnu-thread-multi
$ singularity run \
-H `pwd` \
/project/M-mtgraovac182840/tools/rufus.sif samtools
Program: samtools (Tools for alignments in the SAM format)
Version: 0.1.19-96b5f2294a
If I check samtools and perl on my local host, I see different versions:
$ samtools
Program: samtools (Tools for alignments in the SAM format)
Version: 1.3.1 (using htslib 1.3.1)
(base) [moldach@marc RUFUS-TEST]$ perl -v
This is perl 5, version 32, subversion 0 (v5.32.0) built for x86_64-linux
So it's clear that this is an issue with the tools installed by the Dockerfile that @kohrar created:
#
# A Dockerfile to get RUFUS running
#
# GCC 4.9 only available up to 16.04
FROM ubuntu:16.04
ARG DEBIAN_FRONTEND=noninteractive
COPY . /RUFUS
RUN set -ex; \
# Dependencies
BUILD_DEPS="cmake build-essential g++-4.9 zlib1g-dev libbz2-dev libbz2-dev liblzma-dev libncurses5-dev"; \
apt-get update; \
apt-get install -y software-properties-common; \
add-apt-repository ppa:ubuntu-toolchain-r/test; \
apt-get install -y python wget git bc $BUILD_DEPS; \
# Build
mkdir -p /RUFUS/bin; \
cd /RUFUS/bin; \
cmake ../ -DCMAKE_C_COMPILER=$(which gcc) -DCMAKE_CXX_COMPILER=$(which g++); \
make; \
# Cleanup
apt-get purge -y --auto-remove $BUILD_DEPS; \
apt-get clean; \
echo done
# Runtime tools
RUN set -ex; \
apt install samtools; \
echo done
I'm surprised that apt install samtools is installing such an old version.
- Which version of samtools will work?
- Could the fact that the base image is Ubuntu 16.04 be why it's installing an older version?
- How do I specify which samtools to install in the container instead?
Yup, your Samtools isn’t working properly: you either don’t have it installed or you have a very old version. I’ve seen those invalid option T and O errors before when Samtools isn’t installed. RUFUS checks for samtools and your system passes, but for some reason samtools sort is still failing. Check that samtools runs when you type “samtools”, check the version of samtools, and check that the O and T options are listed when you type “samtools sort -h”.
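The checks suggested above can be scripted so they run in seconds inside the container rather than surfacing a day into a job. This is only a sketch: has_sort_option is a hypothetical helper name, and it assumes the usage text printed by samtools sort -h lists each supported option as a separate word such as -O or -T.

```shell
#!/bin/sh
# has_sort_option USAGE_TEXT LETTER
# Succeeds if the usage text lists "-LETTER" as a separate word.
# (Hypothetical helper; the layout of the usage text is an assumption.)
has_sort_option() {
    printf '%s\n' "$1" | grep -Eq -- "(^|[[:space:]])-$2([[:space:]]|,|\$)"
}

# Capture whatever "samtools sort -h" prints. stdin is redirected so the
# command cannot block waiting for input, and a failing exit status is ignored.
usage=$(samtools sort -h 2>&1 </dev/null || true)

for opt in O T; do
    if has_sort_option "$usage" "$opt"; then
        echo "samtools sort lists -$opt"
    else
        echo "samtools sort does NOT list -$opt (samtools missing or too old?)"
    fi
done
```

Running it through the same singularity command used for the job shows the samtools that RUFUS will actually see, which may differ from the one on the host.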
your Samtools isn’t working properly, you either don’t have it installed or you have a very old version
Correct. As I showed in my post above, the apt install samtools command in @kohrar's Dockerfile was installing Samtools version 0.1.19-96b5f2294a inside the container.
I tried to change the Dockerfile so that it would download a newer version, Samtools v1.3.1:
# A Dockerfile to get RUFUS running
#
# GCC 4.9 only available up to 16.04
FROM ubuntu:16.04
ARG DEBIAN_FRONTEND=noninteractive
COPY . /RUFUS
RUN set -ex; \
# Dependencies
BUILD_DEPS="cmake build-essential g++-4.9 zlib1g-dev libbz2-dev liblzma-dev libncurses5-dev libcurl4-gnutls-dev libssl-dev libgcc-5-dev libgomp1"; \
apt-get update; \
apt-get install -y software-properties-common; \
add-apt-repository ppa:ubuntu-toolchain-r/test; \
apt-get install -y python wget git bc $BUILD_DEPS; \
wget https://github.com/samtools/samtools/releases/download/1.3.1/samtools-1.3.1.tar.bz2 && \
tar -xjvf samtools-1.3.1.tar.bz2 && \
cd samtools-1.3.1 && \
make -j 4; \
make prefix=/usr/local/bin install; \
# if you have old version such as 0.x from samtools, you may remove it and create a link to new version
apt remove samtools; \
ln -s /usr/local/bin/bin/samtools /usr/bin/samtools; \
# Build
mkdir -p /RUFUS/bin; \
cd /RUFUS/bin; \
cmake ../ -DCMAKE_C_COMPILER=$(which gcc) -DCMAKE_CXX_COMPILER=$(which g++); \
make; \
# Cleanup
apt-get purge -y --auto-remove $BUILD_DEPS; \
apt-get clean; \
echo done
And, despite the fact that I included libgomp1 in BUILD_DEPS, I'm getting an error when trying to run it:
/RUFUS/bin/ModelDist: error while loading shared libraries: libgomp.so.1: cannot open shared object file: No such file or directory
I'm not sure whether there is a quicker way to test for the presence of libgomp.so.1; I only hit this error after running RUFUS for 4 hours, which makes it a considerable bummer to troubleshoot.
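A faster check is to ask the dynamic linker directly with ldd, which reports unresolved libraries as "not found" without running the program. The sketch below scans a directory of binaries; filter_missing, missing_libs and the RUFUS_BIN variable are hypothetical names, and /RUFUS/bin is assumed from the Dockerfile above. One likely cause worth ruling out: libgomp1 is listed in BUILD_DEPS, and the cleanup step purges everything in BUILD_DEPS, so the library may be removed again before the image is finished.

```shell
#!/bin/sh
# filter_missing NAME: read ldd output on stdin and print "NAME: library"
# for every library the dynamic linker could not resolve.
filter_missing() {
    awk -v bin="$1" '/not found/ { print bin ": " $1 }'
}

# missing_libs DIR: run ldd on every executable file in DIR.
missing_libs() {
    for f in "$1"/*; do
        if [ -f "$f" ] && [ -x "$f" ]; then
            ldd "$f" 2>/dev/null | filter_missing "$f"
        fi
    done
}

# RUFUS_BIN is a hypothetical override; /RUFUS/bin is the build directory
# used in the Dockerfile above.
missing_libs "${RUFUS_BIN:-/RUFUS/bin}"
```

No output means every binary resolved all of its shared libraries. Running this as the last RUN step of the Dockerfile, or once inside the finished container, takes seconds instead of hours.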
https://pubmed.ncbi.nlm.nih.gov/23165927/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5903934/
Any update on this @jandrewrfarrell ? (@kohrar)
We urgently need this working.
Thanks
Hi @moldach,
I have updated the Dockerfile under my pull request (https://github.com/jandrewrfarrell/RUFUS/pull/20) to include the missing dependency required by some RUFUS binaries as well as the newest version of Samtools.
This should help with some of the issues you reported. Could you please see if this gets you any further?
RUFUS % singularity run -H `pwd` /global/software/singularity/images/software/rufus.sif
Singularity> cd /RUFUS/bin
Singularity> ldd ModelDist
linux-vdso.so.1 => (0x00007fff6691c000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7947fb4000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7947cab000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f7947a89000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7947873000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f79474a9000)
/lib64/ld-linux-x86-64.so.2 (0x00007f7948336000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f79472a5000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7947088000)
Singularity> samtools --version
samtools 1.11
Using htslib 1.11
Copyright (C) 2020 Genome Research Ltd.
I have been experimenting with getting RUFUS operational in some form. I ran into build errors trying to build the source, so decided to use @moldach's docker image, moldach686/rufus-v1.0. Running in Singularity, I get the following error running the test script:
Singularity> sh runTest.sh
checking for samtools
/usr/local/bin/samtools
samtools found
_arg_fastqA =
_arg_fastqB =
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Final reference path being used is /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
Final bwa reference path being used is /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
proband extension is bam
you provided the proband cram file /tmp/RUFUS/testRun/Child.bam
parent file name is Mother.bam
parent file extension name is bam
You provided the control bam file /tmp/RUFUS/testRun/Mother.bam
parent file name is Father.bam
parent file extension name is bam
You provided the control bam file /tmp/RUFUS/testRun/Father.bam
value of ProbandGenerator Child.bam.generator
Value of ParentGenerators:
Mother.bam.generator
Father.bam.generator
Value of K is: 25
Value of Threads is: 40
value of ref is: /tmp/RUFUS/testRun/../resources/references/small_test_human_reference_v37_decoys.fa
value of min is:
Did not provide refHash
$_arg_min is empty
_arg_min is
MutantMinCov is
parent is Mother.bam.generator
parent is Father.bam.generator
Running jellyfish for Mother.bam.generator
/tmp/RUFUS/scripts/RunJellyForRUFUS.sh: line 34: 62244 Killed $JELLYFISH count --disk -m $K -L $L -s 8G -t $T -o $GEN.Jhash -C $GEN.fq
I also tried running this on our own data and got different errors depending on which shell I used.
With sh:
Singularity> sh ./runRufus.sh -s /scratch.global/lee04110/data/bams/Affected.bam -c /scratch.global/lee04110/data/bams/Mother.bam -c /scratch.global/lee04110/data/bams/Father.bam -c /scratch.global/lee04110/data/bams/Sister.bam -t 8 -k 25 -ref /scratch.global/lee04110/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa
checking for samtools
/usr/local/bin/samtools
samtools found
./runRufus.sh: 29: ./runRufus.sh: Bad substitution
./runRufus.sh: 50: ./runRufus.sh: Syntax error: "(" unexpected
With bash:
Singularity> bash ./runRufus.sh -s /scratch.global/lee04110/data/bams/Affected.bam -c /scratch.global/lee04110/data/bams/Mother.bam -c /scratch.global/lee04110/data/bams/Father.bam -c /scratch.global/lee04110/data/bams/Sister.bam -t 8 -k 25 -ref /scratch.global/lee04110/ref/GRCh38_full_analysis_set_plus_decoy_hla.fa
checking for samtools
/usr/local/bin/samtools
samtools found
_arg_fastqA =
_arg_fastqB =
Reference file not built for BWA this program requires the existence of the file ef.sa
Killing run with non-zero status
Killed
There is a properly generated .sa file in the same directory as the .fa file; I even tried renaming it "ef.sa" to no avail. Anyone have any ideas as to what is causing these errors? Is RUFUS still being actively maintained? Our lab would really like to be able to use it.
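One possible reading of the odd file name ef.sa, offered as an assumption since the option parser in runRufus.sh has not been checked here: if the script defines a short option -r that takes a value, a getopt-style parser reads -ref as -r with the attached value "ef", and the path that follows is treated as a separate argument. The sketch below reproduces that behaviour with the POSIX getopts builtin; parse_ref is a hypothetical function, not code from RUFUS.

```shell
#!/bin/sh
# parse_ref ARGS...: print the value that a getopts-style parser assigns
# to a short option -r that takes an argument. (Illustrative only; this is
# not the parser used by runRufus.sh.)
parse_ref() {
    OPTIND=1
    ref=""
    while getopts "r:" opt; do
        case "$opt" in
            r) ref=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    printf '%s\n' "$ref"
}

parse_ref -ref /path/to/reference.fa   # prints: ef
parse_ref -r /path/to/reference.fa     # prints: /path/to/reference.fa
```

If that is what happens here, the script would look for ef.sa relative to the working directory no matter which .sa files sit next to the real reference. The script's usage text should show the exact spelling of the reference option.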
The first error you're getting is from jellyfish: “/tmp/RUFUS/scripts/RunJellyForRUFUS.sh: line 34: 62244 Killed $JELLYFISH count --disk -m $K -L $L -s 8G -t $T -o $GEN.Jhash -C $GEN.fq”. I believe this is due to jellyfish running out of memory. How much memory does your Singularity instance have available? By default I have RUFUS set up to use at least 32G, but you could edit the jellyfish call in the scripts to use less RAM if you need to.
The “Bad substitution” error from your second run looks like a compatibility issue between different versions of bash. That line is: RDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )" which sets RDIR to the path where runRufus.sh lives. As a workaround, you could just hardcode RDIR as the path to the RUFUS directory to see if it works.
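It may also be relevant that the "Bad substitution" message appeared only when the script was started with sh; the run started with bash got past that line. ${BASH_SOURCE[0]} is bash array syntax, and on systems where sh is a plain POSIX shell such as dash it is rejected. For comparison, here is a minimal POSIX-only sketch of the same idea, assuming the script is executed rather than sourced; shell_kind and script_dir are hypothetical names, not part of RUFUS.

```shell
#!/bin/sh
# shell_kind: report whether the current interpreter is bash.
shell_kind() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo bash
    else
        echo other
    fi
}

# script_dir PATH: absolute directory of PATH, using only POSIX features.
# Intended to be called as: script_dir "$0"
# Assumption: the script is executed, not sourced, so $0 names the script.
script_dir() {
    ( cd "$(dirname "$1")" >/dev/null 2>&1 && pwd )
}

echo "running under: $(shell_kind)"
echo "script directory: $(script_dir "$0")"
```

Since the script inside the image cannot be edited, the practical takeaway is probably just to start runRufus.sh with bash, as in the second run above.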
Thanks very much for your quick response. I was just running the test script from Singularity on the command line and I have no idea how much memory it had available, but I can run it via a slurm job and specify enough memory. I'll also try the RDIR workaround.
Since I am using the Docker image, there seems to be no way to work around the RDIR problem. I can't edit the script.
That’s unfortunate. Since it’s a Docker image, I assume that path never changes, so you could ask them to change that line to just be whatever the RUFUS directory path is. Or you could run a Perl one-liner at the beginning of your script that finds and replaces that line with your path on each run.
The issue is that the script itself executes the problematic line RDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )" and so fails, regardless of anything I can set in the environment or pass to Singularity.
When trying to build RUFUS from freshly checked out source, I get the following:
bwt_gen.c: In function ‘BWTIncMergeBwt’:
bwt_gen.c:953:15: warning: variable ‘bitsInWordMinusBitPerChar’ set but not used [-Wunused-but-set-variable]
unsigned int bitsInWordMinusBitPerChar;
^~~~~~~~~~~~~~~~~~~~~~~~~
[ 40%] No install step for 'bwa_project'
[ 41%] Completed 'bwa_project'
[ 41%] Built target bwa_project
Scanning dependencies of target fastp_project
[ 42%] Creating directories for 'fastp_project'
[ 43%] Performing download step (git clone) for 'fastp_project'
Cloning into 'fastp_project'...
Already on 'master'
[ 44%] No patch step for 'fastp_project'
[ 45%] No update step for 'fastp_project'
[ 45%] No configure step for 'fastp_project'
[ 46%] Performing build step for 'fastp_project'
/usr/bin/ld: cannot find -lisal
/usr/bin/ld: cannot find -ldeflate
collect2: error: ld returned 1 exit status
make[3]: *** [fastp] Error 1
make[2]: *** [externals/fastp/src/fastp_project-stamp/fastp_project-build] Error 2
make[1]: *** [externals/CMakeFiles/fastp_project.dir/all] Error 2
make: *** [all] Error 2
The error in your build is due to a third-party program, fastp, which I actually don't use anymore and should remove from the build; thanks for the reminder. For a quick fix, in the file RUFUS/externals/CMakeLists.txt, comment out or remove lines 21 and 22:
include(fastp.cmake)
LIST(APPEND RUFUS_DEPENDENCIES ${FASTP_PROJECT})
Then do a fresh build and let's see if that helps.
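For anyone scripting this fix (for example in a Dockerfile), a sketch that comments out the two fastp lines by matching their content instead of relying on line numbers 21 and 22, which may shift between checkouts. disable_fastp is a hypothetical helper, and the demo file below, including its include(bwa.cmake) line, is made up for illustration; the real target would be RUFUS/externals/CMakeLists.txt.

```shell
#!/bin/sh
# disable_fastp FILE: comment out the fastp include and the line that
# appends FASTP_PROJECT to the dependency list. A backup is kept as FILE.bak.
disable_fastp() {
    sed -i.bak \
        -e '/^[[:space:]]*include(fastp\.cmake)/s/^/#/' \
        -e '/^[[:space:]]*LIST(APPEND RUFUS_DEPENDENCIES \${FASTP_PROJECT})/s/^/#/' \
        "$1"
}

# Demonstration on a throwaway file (contents are illustrative only).
demo=$(mktemp)
printf '%s\n' \
    'include(bwa.cmake)' \
    'include(fastp.cmake)' \
    'LIST(APPEND RUFUS_DEPENDENCIES ${FASTP_PROJECT})' > "$demo"
disable_fastp "$demo"
cat "$demo"
```

After editing the real file, start from an empty build directory so that CMake does not reuse the cached fastp target.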
For my first fix, I was suggesting that you pass a one-liner to Singularity when you start your program that edits that script and replaces that line before it runs, such as:
perl -pi -e 's{^RDIR=.*BASH_SOURCE.*$}{RDIR="myRUFUSpath"}' runRufus.sh
However, to be honest, I haven't used Singularity, so I have no idea if you can do that. On that note, we have been given a one-year grant to improve RUFUS and make it more portable, so over the next year we will be working on cleaning up RUFUS and creating our own Docker/Singularity images.
RUFUS build successful! Thanks again for responding so quickly. Glad to hear that you've gotten funding for RUFUS and that a docker image is planned.