
Combine `.bin` files into one

Open robertaboukhalil opened this issue 2 years ago • 0 comments

Launching certain commands requires downloading many small .bin files, which is slow.

  • [ ] Build tools as static when possible to avoid downloading lots of small .bin files when running those tools
  • [ ] Turn Python tools into static binaries

.bin files downloaded per tool:

  • ViralMSA: 133 [using nuitka: 4 files, ~12MB, ~10s to run each time]
  • man grep: 48
  • nano: 46
  • bowtie2: 38 (calling `bowtie2` runs a Perl script that calls various `bowtie2-*` binaries; `bowtie2-align-s` alone downloads 8 .bin files; `make static-libs && make STATIC_BUILD=1` could help, but the Perl dependencies are the bigger problem)
  • kraken2: 22

.bin files < 15:

  • vim.tiny: 14
  • ivar: 10
  • samtools: 9
  • bcftools: 8
  • htsfile/tabix/bgzip: 8
  • bedtools: 7
  • lumpy: 7
  • tn93: 5
  • jq: 4
  • kalign: 4
  • lsd2: 4
  • git: 3
  • minimap2: 3
  • tree: 3
  • seqtk: 3
  • fasttree: 2
  • freebayes: 1 (when built with `meson build -Dstatic=true -Dprefer_system_deps=false --buildtype release`; otherwise 20 .bin files)
  • mummer: 1
  • csvtk: 1
  • kallisto: 1
  • fastp: 1
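
Since the goal is to build tools statically, a rough way to see which tools are dynamically linked (and against how many libraries) is to count the entries reported by `ldd`. The sketch below is only an approximation: the helper name `count_shared_libs` is hypothetical, and the number it prints is not guaranteed to match the .bin counts listed above, which depend on how sandbox.bio serves files.

```shell
# Hypothetical helper: count the shared libraries a tool is linked against.
# Prints 0 for static binaries, since ldd lists no library paths for them.
count_shared_libs() {
    bin_path=$(command -v "$1") || { echo "not found: $1" >&2; return 1; }
    # Library lines contain a path ("/..."); the vDSO entry and the
    # "not a dynamic executable" message do not, so they are not counted.
    ldd "$bin_path" 2>/dev/null | grep -c '/' || true
}
```

For example, `count_shared_libs samtools` prints a single integer, so the function can be looped over the tools listed above to decide which ones to rebuild first.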

Other tools

Jellyfish

Downloads 15 .bin files. Compiling from source works, but `--enable-all-static` causes "undefined reference to `BZ2_bzBuffToBuffDecompress'" errors.

# Compile Jellyfish from source as a static binary
# Building from the git repository instead of the release tarball requires running `autoreconf` and installing other deps, and gave more errors.
# Using `apt-get install -y jellyfish` works but downloads ~15 .bin files.
# Need `--without-sse`, otherwise get "error: impossible constraint in 'asm'".
RUN curl -L -O https://github.com/gmarcais/Jellyfish/releases/download/v2.3.1/jellyfish-2.3.1.tar.gz && \
    tar xvzf "jellyfish-2.3.1.tar.gz" && \
    cd jellyfish-2.3.1 && \
    ./configure --without-sse --enable-all-static && make && make install
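
After a build like the one above, it may be worth confirming that the resulting binary really is static before counting .bin files again. A minimal sketch, assuming the `file` utility is installed in the image; the function names `linkage_from_description` and `linkage_of` are hypothetical:

```shell
# Classify a binary from the description text printed by `file -b`.
linkage_from_description() {
    case "$1" in
        *"statically linked"*)  echo static ;;
        *"static-pie linked"*)  echo static ;;
        *"dynamically linked"*) echo dynamic ;;
        *)                      echo unknown ;;
    esac
}

# Look up a tool on PATH and report how it is linked.
linkage_of() {
    bin_path=$(command -v "$1") || { echo "not found: $1" >&2; return 1; }
    # -b omits the filename from the output; -L follows symlinks.
    linkage_from_description "$(file -b -L "$bin_path")"
}
```

Running `linkage_of jellyfish` after `make install` should then print `static` if the static build took effect, and `dynamic` otherwise.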

robertaboukhalil, Aug 30 '23 22:08