rez icon indicating copy to clipboard operation
rez copied to clipboard

Add Artifact Repository #2

Open BryceGattis opened this issue 1 year ago • 3 comments

This PR picks up from the work that work that @ttrently started in PR #1453.

Requirements

This implementation requires some additional packages that have not been included in rez.vendor.

  • awscli
  • pymongo==4.6.2

These can be installed via python -m pip install PKG --target path\to\rez\Lib\site-packages.

Setup

For testing I setup Mongo and Mongo Express docker containers locally by running the following commands:

  • docker run --detach --name mongo_rez --network rez -p 27017:27017 mongo:7.0.5
  • docker run --detach --name mongo_rez_express --network rez -e ME_CONFIG_MONGODB_SERVER=mongo_rez -p 8081:8081 mongo-express:1.0.2-20

I ran the Mongo Express container so I could easily see the contents of the DB in the browser.

How To Use

In order to use you will need an existing S3 bucket and a MongoDB instance with a named database and a collection called "packages".

In your rezconfig.py you will need to set the following:

packages_path = [
    "mongodb@localhost:27017/db_name",
]

artifacts_path = [
    "s3@s3://bucket-1/packages",
]

default_cachable = True
package_cache_same_device = True
package_cache_async  = False

Please note that the above Mongo URI will not directly be usable by PyMongo in a Python Console.

Steps

  1. Run rez build --install -p mongo@localhost:27017/db_name --clean to build a package and put its definition in the MongoDB. TODO: Running this seems to upload to DB now but install_variant doesn't fully work yet.

Changes

  • Made package caching optionally asynchronous. TODO: Not sure why this choice was made yet.

FUTURE

The following changes are not handled by this PR, but should be kept in mind for the future.

  • Introduce new terminology for variants:
    • Download: Pull a variant payload from the Artifact Repository to disk somehow? TODO: Clarify this.
    • Install: Unpacks the payload if necessary (if it's an archive file format), and caches it using payload caching? TODO: Investigate this.
  • For rez-release do we need to make release_packages_path a list so we can deploy to potentially multiple package repositories at once?

TODO:

  • rez-env can't find packages in the mongo package repository yet.
  • No Credential Provider class as described in https://gist.github.com/JeanChristopheMorinPerso/9a0705d1498378178b53061d90f3b2d4 need some help with modern practices for credentials management and what our workflow should be here.
  • Check if this packs the payload before uploading to the Artifact Repository (convert to .zip)?
  • Figure out how to handle the two new PyPi package requirements:
    • awscli
    • pymongo
  • Hopefully we don't have to add these to the rez/vendor folder, but maybe that's a good first thing to do?
  • Define rez release behavior. Does this push to both to package repository and artifact repository? What does this do if there are multiple of either of those?
  • Need to clarify where exactly ArtifactRepositories belong in the architecture of Rez. Do we need a ArtifactRepository for packages on a shared filesystem or do we only use ArtifactRepositories when they are off premise?
  • Add new tests.

BryceGattis avatar Feb 18 '24 19:02 BryceGattis

Codecov Report

Attention: Patch coverage is 62.96296% with 20 lines in your changes are missing coverage. Please review.

:exclamation: No coverage uploaded for pull request base (main@9c8de33). Click here to learn what that means.

Files Patch % Lines
src/rez/artifact_repository.py 52.94% 16 Missing :warning:
src/rez/package_cache.py 71.42% 1 Missing and 1 partial :warning:
src/rez/rezconfig.py 0.00% 2 Missing :warning:
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1662   +/-   ##
=======================================
  Coverage        ?   58.02%           
=======================================
  Files           ?      129           
  Lines           ?    17133           
  Branches        ?     3505           
=======================================
  Hits            ?     9942           
  Misses          ?     6523           
  Partials        ?      668           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 03 '24 03:03 codecov[bot]

Thought for potential backlog item: Can we use a different data validation library or perhaps update our vendored schema package? The output you get out of it currently is borderline unreadable to a human. For example here's a snippet of a traceback I got while working on this PR:

  File "D:\Code\rez\venv\lib\site-packages\rez\vendor\schema\schema.py", line 235, in validate
    raise SchemaError([None] + x.autos, [e] + x.errors)
rez.vendor.schema.schema.SchemaError: Or(<class 'rez.utils.sourcecode.SourceCode'>, Schema({Optional(<class 'str'>): Or(Or(<class 'str'>, [<class 'str'>]), {Optional(<class 'str'>): <class 'object'>, 'command': Or(<class 'str'>, [<class 'str'>]), Optional('requires'): [Or(<class 'str'>, And(<class 'rez.util
s.formatting.PackageRequest'>, Use(<class 'str'>)))], Optional('run_on'): Or(<class 'str'>, [<class 'str'>]), Optional('on_variants'): Or(<class 'bool'>, {'type': 'requires', 'value': [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))]})})})) did not validate 'None'  
'None' should be instance of {Optional(<class 'str'>): Or(Or(<class 'str'>, [<class 'str'>]), {Optional(<class 'str'>): <class 'object'>, 'command': Or(<class 'str'>, [<class 'str'>]), Optional('requires'): [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))], Optional
('run_on'): Or(<class 'str'>, [<class 'str'>]), Optional('on_variants'): Or(<class 'bool'>, {'type': 'requires', 'value': [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))]})})}

BryceGattis avatar Mar 03 '24 03:03 BryceGattis

@BryceGattis I agree the errors are awful. I have started to look at replacing our schema stuff with something more modern. But it's a difficult task because the schemas leak into our public API is in some places, the rez internals (the core ones) are all very tightly coupled to schemas and it is also very performance sensitive. It is probably positive to swap our schema library with something else without introducing breaking changes, but I'll be a tough task.