New `package` behaviour can't install some requirements

Open andrii-porokhnavets opened this issue 1 year ago • 4 comments

Describe the bug

I am trying to migrate to pants 2.19.0 from 2.16.0. During the migration, I ran into an issue with the pants package :: command. The same configuration works on 2.16.0 (and 2.17.0), but on 2.19.0 (and 2.18.0) I get the following error:

18:31:51.34 [INFO] Completed: Build python_google_cloud_function artifact for functions/python/source-python-http:python_http_gcf
18:31:51.34 [ERROR] 1 Exception encountered:

Engine traceback:
  in `package` goal

ProcessExecutionFailure: Process 'Build python_google_cloud_function artifact for functions/python/source-python-http:python_http_gcf' failed with exit code 1.
stdout:

stderr:
Failed to resolve requirements from PEX environment @ /home/andrii/.cache/pants/named_caches/pex_root/unzipped_pexes/78b374064464e37b3fd5cad5252b40c1c7e13119.
Needed cp311-cp311-linux_x86_64 compatible dependencies for:
 1: pyarrow>=3.0.0
    Required by:
      db-dtypes 1.2.0
      pandas-gbq 0.19.0
    But this pex had no ProjectName(raw='pyarrow', normalized='pyarrow') distributions.
 2: pyarrow==14.0.1
    But this pex had no ProjectName(raw='pyarrow', normalized='pyarrow') distributions.



Use `--keep-sandboxes=on_failure` to preserve the process chroot for inspection.

My python_google_cloud_function target:

python_google_cloud_function(
    name="python_http_gcf",
    runtime="python311",
    handler="couplerio/importers/python_http/main.py:handler",
    type="http",
)

My requirements.txt file

requests==2.31.0
pandas==2.0.0
pandas-gbq==0.19.0
numpy==1.24.1
pyarrow==14.0.1
google-cloud-bigquery==3.4.2
google-cloud-bigquery-storage==2.18.1
functions-framework==3.3.0

Pants version 2.19.0

OS Ubuntu 22.04.3

andrii-porokhnavets avatar Feb 09 '24 16:02 andrii-porokhnavets

Hi, can you create a small github repo that exposes this issue? Thanks.

benjyw avatar Feb 10 '24 16:02 benjyw

Sure. I created a small repo where this issue reproduces as well.

andrii-porokhnavets avatar Feb 12 '24 08:02 andrii-porokhnavets

Hi, sorry for the delay in looking into this. I cannot reproduce in that repo when running on a Mac M1, or in an ARM docker container. But I'm not sure if packages built on those architectures will run on the target x86_64 architecture. @huonw will know more, but I assume it depends on the availability of wheels for all the 3rdparty deps in the cloud function.

I don't have access to an x86_64 machine (and I can't get Pants to work in an x86_64 docker container running on the M1 via emulation), but presumably that is where you're seeing the issue?
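
For anyone comparing machines, here is a rough sketch for printing the most specific wheel tag of the interpreter you are running, next to the tag from the error message above. It only uses the standard library and is an approximation: pip and Pex compute a much longer list of compatible tags (manylinux variants and so on), so a mismatch here is a hint, not proof. The names local_wheel_tag and matches_target are made up for this example.

```python
import sys
import sysconfig

# Tag copied from the error message in this issue.
TARGET_TAG = "cp311-cp311-linux_x86_64"


def local_wheel_tag():
    """Approximate the most specific wheel tag for the running interpreter.

    Assumption: this mimics the interpreter-abi-platform shape of a wheel tag;
    it is not the full compatible-tag list that pip or Pex would compute.
    """
    impl = "cp" if sys.implementation.name == "cpython" else sys.implementation.name
    version = f"{sys.version_info.major}{sys.version_info.minor}"
    platform = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return f"{impl}{version}-{impl}{version}-{platform}"


def matches_target(tag, target=TARGET_TAG):
    """Return True if the given tag is exactly the target tag."""
    return tag == target


if __name__ == "__main__":
    tag = local_wheel_tag()
    print("local :", tag)
    print("target:", TARGET_TAG)
    print("match :", matches_target(tag))
```

If the local tag differs from the target (for example a macOS arm64 tag), the build depends on prebuilt wheels for the target platform being available for every third-party dependency.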

benjyw avatar Feb 24 '24 04:02 benjyw

Thanks for the reproducer, unfortunately I'm on the same platform as @benjyw so I can't reproduce the issue either.

I suspect this may be resolved by setting complete_platforms (even if it does not fix this specific issue, it will at least be a more reliable way to package things in future):

  1. Run something like the following in the GCF environment: this will output a big block of JSON that captures the "complete" information about the execution environment, and thus what packages are compatible:
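    import subprocess

    # Install pex into a scratch directory, then dump the interpreter's markers and tags as JSON
    subprocess.run(
      """
      pip install --target=/tmp/subdir pex
      PYTHONPATH=/tmp/subdir /tmp/subdir/bin/pex3 interpreter inspect --markers --tags
      """,
      shell=True
    )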
    
  2. Place that JSON into your repository and load it into pants with file(name="some-name", source="some-file.json")
  3. Add a complete_platforms=["path/to:some-name"] field to your python_google_cloud_function targets
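
Before committing the JSON from step 1, it may be worth a quick sanity check that the capture is complete, since a truncated copy from the function logs is easy to miss. The sketch below assumes the output is a JSON object with top-level compatible_tags and marker_environment keys; that is my understanding of what pex3 interpreter inspect --markers --tags emits, so adjust EXPECTED_KEYS if your Pex version differs. The function name missing_keys is made up for this example.

```python
import json
from pathlib import Path

# Assumption: top-level keys expected in the output of
# `pex3 interpreter inspect --markers --tags`; adjust for your Pex version.
EXPECTED_KEYS = ("compatible_tags", "marker_environment")


def missing_keys(json_path):
    """Return the expected top-level keys that are absent from the JSON file.

    Raises json.JSONDecodeError if the file is not valid JSON, which usually
    means the captured output was truncated or mixed with other log lines.
    """
    data = json.loads(Path(json_path).read_text())
    if not isinstance(data, dict):
        return list(EXPECTED_KEYS)
    return [key for key in EXPECTED_KEYS if key not in data]
```

For example, missing_keys("some-file.json") returning an empty list suggests the file is at least structurally what the complete_platforms field expects.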

This will, hopefully, mean that the package is built with an accurate understanding of what the target environment is. Anything less than that is unreliable. https://github.com/pantsbuild/pants/discussions/18756 is also related.

Unfortunately, GCF is undersupported by pants here: for AWS Lambda, we've been able to provide the complete platform information built into pants, so the steps above aren't necessary, but we have not been able to do the same for GCF. https://github.com/pantsbuild/pants/issues/18195 covers this. (If you can provide this complete platform information for GCF's current Python 3.11 environment, we can include that, and even better if you can provide it for more, or find us a docker image for the different runtimes 😄 )

So, in summary:

  1. try complete_platforms
  2. if it works, maybe you can help us improve pants

huonw avatar Feb 28 '24 09:02 huonw