Building from a local sdist file url broken in 0.21.0

Open beenje opened this issue 1 year ago • 2 comments

Support for the local source file url scheme was added in https://github.com/prefix-dev/rattler-build/pull/177 and worked in version 0.5.0.

I hadn't tested that in a while. When trying to build a recipe using a local file as source with rattler-build 0.21.0, it fails.

Issue can be reproduced with:

context:
  version: "13.4.2"

package:
  name: "rich"
  version: ${{ version }}

source:
  - url: file:///tmp/rich/rich-13.4.2.tar.gz
    sha256: d653d6bccede5844304c605d5aac802c7cf9621efd700b46c7ec2b51ea914898

build:
  # Thanks to `noarch: python` this package works on all platforms
  noarch: python
  script:
    - python -m pip install . -vv --no-deps --no-build-isolation

requirements:
  host:
    - pip
    - poetry-core >=1.0.0
    - python 3.10
  run:
    # sync with normalized deps from poetry-generated setup.py
    - markdown-it-py >=2.2.0
    - pygments >=2.13.0,<3.0.0
    - python 3.10
    - typing_extensions >=4.0.0,<5.0.0

tests:
  - python:
      imports:
        - rich
      pip_check: true

about:
  homepage: https://github.com/Textualize/rich
  license: MIT
  license_file: LICENSE
  summary: Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal
  description: |
    Rich is a Python library for rich text and beautiful formatting in the terminal.

    The Rich API makes it easy to add color and style to terminal output. Rich
    can also render pretty tables, progress bars, markdown, syntax highlighted
    source code, tracebacks, and more — out of the box.
  documentation: https://rich.readthedocs.io
  repository: https://github.com/Textualize/rich
$ rattler-build build
...
 ╭─ Running build for recipe: rich-13.4.2-pyh4616a5c_0
 │
 │ ╭─ Fetching source code
 │ │ Validated SHA256 values of the downloaded file!
 │ │ Using local source file.
 │ │ Copying source from url: "/tmp/rich/rich-13.4.2.tar.gz" to "/tmp/rich/output/bld/rattler-build_rich_1725435886/work"
...
 │ ╭─ Running build script
 │ │ + python -m pip install . -vv --no-deps --no-build-isolation
 │ │ Using pip 24.2 from $PREFIX/lib/python3.10/site-packages/pip (python 3.10)
 │ │ Non-user install because user site-packages disabled
 │ │ Ignoring indexes: https://pypi.org/simple
 │ │ Created temporary directory: /tmp/pip-build-tracker-l3rer2po
 │ │ Initialized build tracking at /tmp/pip-build-tracker-l3rer2po
 │ │ Created build tracker: /tmp/pip-build-tracker-l3rer2po
 │ │ Entered build tracker: /tmp/pip-build-tracker-l3rer2po
 │ │ Created temporary directory: /tmp/pip-install-149wgmke
 │ │ ERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
...
$ ls /tmp/rich/output/bld/rattler-build_rich_1725435886/work
build_env.sh  conda_build.sh  rich-13.4.2.tar.gz

The local file was copied to the work directory but wasn't unarchived.

beenje avatar Sep 04 '24 07:09 beenje

Thanks! There are a few workarounds, of course (e.g. making pip unarchive the file). I would also be interested if path: /tmp/rich-.tar.gz works differently?

Lastly, I do think you are right and this file should be un-archived to have the same behavior as fetching from a URL.

wolfv avatar Sep 04 '24 08:09 wolfv

It's working with path: :-)

 ╭─ Running build for recipe: rich-13.4.2-pyh4616a5c_0
 │
 │ ╭─ Fetching source code
 │ │ Fetching source from path: "/tmp/rich/rich-13.4.2.tar.gz"
 │ │ Extracted to "/tmp/rich/output/bld/rattler-build_rich_1725441621/work"
 │ │
 │ ╰─────────────────── (took 0 seconds)

We can see in the logs that it is extracted.

Using path: instead of url: file:// is fine for me.

Would still be nice to fix the file url behaviour as you said.

beenje avatar Sep 04 '24 09:09 beenje

Same here, but unfortunately path: also fails:

  • A url: key followed by a file:// URL fails to build:
    source:
      url: file:///path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
    
    rattler-build just copies the ZIP file, does not unarchive it
     │ ╭─ Fetching source code
     │ │ Validated SHA256 values of the downloaded file!
     │ │ Using local source file.
     │ │ Copying source from url: "/path/to//matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip" to "/tmp/channel/bld
     │ │ /rattler-build_matlab-runtime_1728648393/work"
     │ │
     │ ╰─────────────────── (took 69 seconds)
    
  • A url: key followed by an https:// URL works just fine:
    source:
      url: https://ssd.mathworks.com/supportfiles/downloads/R2019b/Release/9/deployment_files/installer/complete/glnxa64/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
    
    rattler-build unarchives the ZIP file
     │ ╭─ Fetching source code
     │ │ Validated SHA256 values of the downloaded file!
     │ │ Found valid source cache file.
     │ │ Using extracted directory from cache: "/tmp/channel/src_cache/MATLAB_Runtime_R2019b_Update_9_glnxa64_d213e296"
     │ │ Copying source from url: "/tmp/channel/src_cache/MATLAB_Runtime_R2019b_Update_9_glnxa64_d213e296" to "/tmp/channel/bld/rattler-
     │ │ build_matlab-runtime_1728648935/work"
     │ │
     │ ╰─────────────────── (took 32 seconds)
    
  • A path: key initially seemed to work equally fine, but rattler-build keeps unarchiving forever:
      source:
        path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
    
    rattler-build attempts to unarchive the ZIP file, but unzipping lasts forever...
     │ ╭─ Fetching source code
     │ │ Fetching source from path: "/path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip"
     │ │ ⠤ Extracting zip       [00:04:53] [━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╾──────] 2.16 GiB @ 7.56 MiB/s  
    

The issue ~~is probably~~ might be that rattler-build is unable to handle ZIP files larger than 2 GB:

$ ls -lh /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
-rwxrwx---+ 1 username nogroup 2.6G Aug 12  2021 /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
$ 

DimitriPapadopoulos avatar Oct 11 '24 12:10 DimitriPapadopoulos

@DimitriPapadopoulos thank you for the detailed write-up!

~~4:53 doesn't sound like forever to me. Also the indicator is still going at 7.50 MiB/s. I am wondering if it's just slow? Do you have a reference for how long it should take to extract?~~

Ah, I see that in the URL case it takes only 30 seconds so something is wrong. I'll have to take a look.

wolfv avatar Oct 11 '24 13:10 wolfv

While /path/to is indeed on a network (NFS) share, our workstations have 1 Gb/s network interfaces and our storage infrastructure is a CephFS cluster with quite decent throughput:

$ rsync --progress /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip /tmp/
MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
  2,786,688,287 100%  448.76MB/s    0:00:05 (xfr#1, to-chk=0/1)
$ 

DimitriPapadopoulos avatar Oct 11 '24 14:10 DimitriPapadopoulos

I'm not used to building/running Rust programs, but chances are function extract_zip stalls in our context:

extract_zip
/// `.zip` files archived with compression other than deflate would fail.
pub(crate) fn extract_zip(
    archive: impl AsRef<Path>,
    target_directory: impl AsRef<Path>,
    log_handler: &LoggingOutputHandler,
) -> Result<(), SourceError> {
    let archive = archive.as_ref();
    let target_directory = target_directory.as_ref();
    fs::create_dir_all(target_directory)?;

    let len = archive.metadata().map(|m| m.len()).unwrap_or(1);
    let progress_bar = log_handler.add_progress_bar(
        indicatif::ProgressBar::new(len)
            .with_finish(indicatif::ProgressFinish::AndLeave)
            .with_prefix("Extracting zip")
            .with_style(log_handler.default_bytes_style()),
    );

    let mut archive = zip::ZipArchive::new(progress_bar.wrap_read(
        File::open(archive).map_err(|_| SourceError::FileNotFound(archive.to_path_buf()))?,
    ))
    .map_err(|e| SourceError::InvalidZip(e.to_string()))?;

    let tmp_extraction_dir = tempfile::Builder::new().tempdir_in(target_directory)?;
    archive
        .extract(&tmp_extraction_dir)
        .map_err(|e| SourceError::ZipExtractionError(e.to_string()))?;

    move_extracted_dir(tmp_extraction_dir.path(), target_directory)?;
    progress_bar.finish_with_message("Extracted...");

    Ok(())
}

Could it be that MATLAB_Runtime_R2019b_Update_9_glnxa64.zip is "archived with compression other than deflate"?

DimitriPapadopoulos avatar Oct 11 '24 14:10 DimitriPapadopoulos

Would you be able to try with the file on the same filesystem? It could be related to NFS, after all.

wolfv avatar Oct 11 '24 14:10 wolfv

Will try next week.

By the way, the compression method is either defX or stor for all entries in the ZIP file, nothing exotic here:

$ zipinfo -l /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip | grep -v -e ' defX '  -e ' stor '
Archive:  /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
Zip file size: 2786688287 bytes, number of entries: 5487
5487 files, 2989357849 bytes uncompressed, 2785399227 bytes compressed:  6.8%
$ 

DimitriPapadopoulos avatar Oct 11 '24 14:10 DimitriPapadopoulos

My workstation was updated from Ubuntu 22.04 to Ubuntu 24.04 a few days ago. I wonder whether a filesystem issue could plague it. After "heavy use" (typically running rattler-build to build from simple but large sources) Google Chrome starts complaining (without reason) about invalid site certificates or identifies other sites as non-existent. I couldn't find anything suspicious in the system logs. I will try on a machine still running Ubuntu 22.04, this might be totally unrelated to rattler-build — could be a Linux kernel bug.

DimitriPapadopoulos avatar Oct 11 '24 14:10 DimitriPapadopoulos

That sounds strange. rattler-build itself should not modify anything system-wide. Of course, I don't know what the build scripts are doing.

wolfv avatar Oct 11 '24 14:10 wolfv

Oh, I mean it wouldn't be a rattler-build issue, rather a Linux kernel bug triggered by something specific to rattler-build operation, perhaps manipulating lots of hardlinks.

DimitriPapadopoulos avatar Oct 11 '24 14:10 DimitriPapadopoulos

The scripts are very simple, they just unzip and don't even test. For example: https://github.com/neurospin/neuro-forge/pull/15/files

DimitriPapadopoulos avatar Oct 11 '24 15:10 DimitriPapadopoulos

My issue was probably a Linux kernel issue, or more generally a system issue. Today, ZIP extraction works just fine, either from the local file system:

 │ ╭─ Fetching source code
 │ │ Fetching source from path: "/tmp/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip"
 │ │ Extracted zip to "/tmp/channel/bld/rattler-build_matlab-runtime_1728884183/work"
 │ │
 │ ╰─────────────────── (took 32 seconds)

or the NFS share:

 │ ╭─ Fetching source code
 │ │ Fetching source from path: "/path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip"
 │ │ Extracted zip to "/tmp/channel/bld/rattler-build_matlab-runtime_1728885071/work"
 │ │
 │ ╰─────────────────── (took 31 seconds)

DimitriPapadopoulos avatar Oct 14 '24 05:10 DimitriPapadopoulos

Unfortunately, I am again having freezing issues with path: pointing to an NFS share. Yet, unzipping from that same NFS share works without problem:

$ time unzip /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
Archive:  /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
  inflating: sys/os/glnxa64/libgcc_s.so.1  
  inflating: sys/os/glnxa64/README.libstdc++  
    linking: sys/os/glnxa64/libstdc++.so.6  -> libstdc++.so.6.0.22 
  inflating: sys/os/glnxa64/libstdc++.so.6.0.22  
 extracting: sys/java/jre/glnxa64/jre/LICENSE  
 extracting: sys/java/jre/glnxa64/jre/bin/ControlPanel  
 .
 .
 .
 .
 .
  inflating: productdata/35212.txt   
finishing deferred symbolic links:
  sys/os/glnxa64/libstdc++.so.6 -> libstdc++.so.6.0.22
  bin/glnxa64/libcrypto.so.1 -> libcrypto-mw.so.1.1
  bin/glnxa64/libssl.so.1 -> libssl-mw.so.1.1

real	0m28,605s
user	0m24,789s
sys	0m3,682s
$ 

I don't see anything relevant in the system logs.

DimitriPapadopoulos avatar Oct 29 '24 17:10 DimitriPapadopoulos

Hmm, maybe we need to use a BufReader or something like that somewhere ...

wolfv avatar Oct 29 '24 17:10 wolfv

@DimitriPapadopoulos it was indeed missing a BufReader: https://github.com/prefix-dev/rattler-build/pull/1144 - I believe this will help nicely in your case.

wolfv avatar Oct 29 '24 17:10 wolfv

@wolfv Thank you very much for looking into this issue. I don't know much about Rust, but I understand it provides unbuffered I/O by default and that unbuffered I/O can be slow due to repeated system calls. Yet progress_bar.wrap_read really felt like it was frozen. Anyway, I probably won't have time to test a specific commit, but I will make sure to test the next release. Again, thank you very much.

DimitriPapadopoulos avatar Oct 30 '24 06:10 DimitriPapadopoulos

@DimitriPapadopoulos - the progress bar is just for showing the progress. The main problem was the unbuffered read, which results in many more system calls and is generally slow. I am very sure that this can be exacerbated by slow disks / NFS filesystems. We already had this optimization for the Tar-file reader but missed it for Zip.

I already made the release so you can try out 0.28.2 whenever you have time. I am quite sure that it should give you a decent improvement :)
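
To illustrate the point about system calls, here is a minimal, self-contained sketch. It is not code from rattler-build: `CountingReader` and `count_reads` are hypothetical names made up for this example, and an in-memory slice stands in for the file. It counts how many `read` calls reach the underlying source when a consumer asks for small chunks, with and without `std::io::BufReader` in between.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical wrapper that counts how often the inner reader's `read` is called.
/// With a real `File`, each of these calls would typically be one system call.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Consumes `data` in `chunk`-byte requests and returns how many `read`
/// calls reached the underlying source.
fn count_reads(data: &[u8], chunk: usize, buffered: bool) -> io::Result<usize> {
    let mut counter = CountingReader { inner: data, calls: 0 };
    let mut small = vec![0u8; chunk];
    if buffered {
        // BufReader fetches large blocks and serves the small requests from memory.
        let mut reader = BufReader::with_capacity(8 * 1024, &mut counter);
        while reader.read(&mut small)? > 0 {}
    } else {
        // Every small request goes straight to the underlying source.
        while counter.read(&mut small)? > 0 {}
    }
    Ok(counter.calls)
}

fn main() -> io::Result<()> {
    let data = vec![0u8; 64 * 1024];
    let unbuffered = count_reads(&data, 16, false)?;
    let buffered = count_reads(&data, 16, true)?;
    println!("unbuffered: {unbuffered} reads, buffered: {buffered} reads");
    Ok(())
}
```

On a network filesystem each of those underlying reads can also mean a round trip, so the gap may matter more there than on a local disk. The sketch only covers sequential reads; it says nothing about how seeking interacts with the buffer.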

wolfv avatar Oct 30 '24 07:10 wolfv

Just upgraded to 0.28., it's still slow. The throughput shown by the progress bar keeps dropping forever:

 ╭─ Running build for recipe: matlab-runtime-9.7-9-hb0f4dca_0
 │
 │ ╭─ Fetching source code
 │ │ Fetching source from path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
 │ │ ⠦ Extracting zip       [00:00:13] [━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╾] 2.54 GiB @ 195.72 MiB/s
 │ │ Fetching source from path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
 │ │ ⠉ Extracting zip       [00:01:13] [━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╾─] 2.49 GiB @ 34.93 MiB/s
 │ │ Fetching source from path: /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
 │ │ ⠦ Extracting zip       [00:33:10] [━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╾─────────] 1.95 GiB @ 1.00 MiB/s

DimitriPapadopoulos avatar Oct 30 '24 07:10 DimitriPapadopoulos

argh. Just to be sure - 0.28.2, right?

wolfv avatar Oct 30 '24 08:10 wolfv

Yes, it's 0.28.2 (I forgot to copy/paste the output of --version):

$ rattler-build --version
rattler-build 0.28.2
$ 

DimitriPapadopoulos avatar Oct 30 '24 09:10 DimitriPapadopoulos

Image

When I start rattler-build, I see:

  • a surge of CPU use (with one proc at 100 %) without much network traffic,
  • then (receiving) network traffic kicks in and eventually oscillates well under 1000 KiB/s and CPU use drops to almost nothing (see screen capture), while the throughput displayed by rattler-build drops drastically,
  • when I forcibly stop rattler-build with Ctrl+C, network traffic immediately drops to 0.

In short, at the system level, network resources are not used as they should be. When running unzip /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip, receiving network traffic steadily peaks at ~ 80 MiB/s which is consistent with the 1 Gb/s link of the workstation.

Nothing in the system logs.

Note that file /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip is > 2 GB, but then it's not a problem with path: /tmp/... or url: https://....

DimitriPapadopoulos avatar Oct 30 '24 10:10 DimitriPapadopoulos

Where is your output folder located and the corresponding src_cache folder? Is that also on the network drive? I am not really sure what we're doing wrong .. I had high hopes for the BufReader! :)

wolfv avatar Oct 30 '24 11:10 wolfv

The output dir is /tmp/channel, it's the local disk.

DimitriPapadopoulos avatar Oct 30 '24 11:10 DimitriPapadopoulos

Now about the cache. We used to have home dirs on NFS servers, but that's not the case any more. Besides, even with home dirs on NFS servers, we used to point the environment variable XDG_CACHE_HOME to local disk. Where it gets interesting is that I run rattler-build through a colleague's script, which executes env HOME=/tmp/channel rattler-build in an effort to make doubly sure the cache is local. Let me try to slim that down:

Initial command:

env HOME=/tmp/channel rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel --experimental -c conda-forge -c bioconda

Slimmed-down command:

rattler-build build -r /local/disk/recipes/matlab-runtime-9.7 --output-dir /tmp/channel -c conda-forge

Unfortunately it remains as slow as before. I'm not sure how to further investigate. Do you have a Rust code snippet that unzips a file I could try to build and test locally? I wouldn't be surprised if it were a Rust bug.

DimitriPapadopoulos avatar Oct 30 '24 12:10 DimitriPapadopoulos

What does progress_bar.wrap_read really do? Could it be that it somehow adversely affects disk reads?
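
For reference, a rough sketch of what a progress-reporting reader wrapper generally looks like. This is not indicatif's code: `ProgressReader` and `demo` are hypothetical stand-ins, and the assumption (worth checking against the indicatif source) is that the type returned by `wrap_read` behaves similarly, forwarding each call to the inner reader and only updating a counter on the side.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical stand-in for a progress-reporting reader; not indicatif's actual type.
struct ProgressReader<R> {
    inner: R,
    /// What a progress bar would display.
    position: u64,
}

impl<R: Read> Read for ProgressReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // The request is forwarded unchanged: same buffer, same size.
        let n = self.inner.read(buf)?;
        self.position += n as u64;
        Ok(n)
    }
}

impl<R: Seek> Seek for ProgressReader<R> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        let new_pos = self.inner.seek(pos)?;
        // In this sketch the displayed position follows seeks, so it can move backwards.
        self.position = new_pos;
        Ok(new_pos)
    }
}

/// Reads everything, then seeks back; returns the bytes read and the
/// position recorded after each step.
fn demo() -> io::Result<(Vec<u8>, u64, u64)> {
    let data: Vec<u8> = (0u8..100).collect();
    let mut reader = ProgressReader { inner: Cursor::new(data), position: 0 };
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    let after_read = reader.position;
    reader.seek(SeekFrom::Start(10))?;
    Ok((out, after_read, reader.position))
}

fn main() -> io::Result<()> {
    let (out, after_read, after_seek) = demo()?;
    println!("read {} bytes, position {after_read}, after seek {after_seek}", out.len());
    Ok(())
}
```

A wrapper of this shape does not change how much data each read requests, so on its own it would not be expected to slow down disk reads. Whether the real wrapper tracks seeks the same way, and whether that relates to the bar moving backwards in the log above, is not established here.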

DimitriPapadopoulos avatar Oct 30 '24 12:10 DimitriPapadopoulos

I kicked off a build that you could try for debugging: https://github.com/prefix-dev/rattler-build/pull/1146 ...

And when you run unzip locally, you also extract to that same /tmp/... folder?

wolfv avatar Oct 30 '24 13:10 wolfv

You can find the binaries here: https://github.com/prefix-dev/rattler-build/actions/runs/11594061424?pr=1146

wolfv avatar Oct 30 '24 13:10 wolfv

I unzip in /tmp:

$ mkdir /tmp/channel
$ 
$ cd /tmp/channel/
$ 
$ time unzip /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip 
Archive:  /path/to/matlab-runtime/MATLAB_Runtime_R2019b_Update_9_glnxa64.zip
  inflating: sys/os/glnxa64/libgcc_s.so.1  
  inflating: sys/os/glnxa64/README.libstdc++  
  .
  .
  .
  inflating: productdata/35212.txt   
finishing deferred symbolic links:
  sys/os/glnxa64/libstdc++.so.6 -> libstdc++.so.6.0.22
  bin/glnxa64/libcrypto.so.1 -> libcrypto-mw.so.1.1
  bin/glnxa64/libssl.so.1 -> libssl-mw.so.1.1

real	0m44,915s
user	0m30,763s
sys	0m6,814s
$ 

DimitriPapadopoulos avatar Oct 30 '24 14:10 DimitriPapadopoulos

I do see a x86_64-unknown-linux-musl build, but am not sure how to install/run locally (I am new to Rust). Is it as simple as git clone and cargo build?

EDIT: Ah, just found the binaries.

DimitriPapadopoulos avatar Oct 30 '24 14:10 DimitriPapadopoulos