Improve transforms venv

Open revit13 opened this issue 1 year ago • 1 comment

Search before asking

  • [X] I searched the issues and found no similar issues.

Component

Transforms/Other

Feature

At the moment, each transform copies the data-processing-lib libraries and installs them into its own venv. This step can be time-consuming, particularly in a CI/CD setting. To speed it up, the suggested approach is to stop copying the libraries for every transform and instead update them only when their content has changed. This can be achieved by using a shared environment (venv) for all transforms and copying the libraries only when necessary.
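One way to decide whether the library content has changed is to hash the library source tree and compare the result with a digest saved at the last install. The sketch below is only an illustration of that idea, not the project's build logic: the function names (`tree_digest`, `needs_reinstall`, `record_install`) and the stamp file are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return a SHA-256 digest over the relative paths and contents of all files under root."""
    h = hashlib.sha256()
    # Sort so the digest does not depend on directory traversal order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def needs_reinstall(lib_src: Path, stamp: Path) -> bool:
    """True if there is no stamp yet or the sources differ from the recorded digest."""
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != tree_digest(lib_src)


def record_install(lib_src: Path, stamp: Path) -> None:
    """Save the current digest after a successful install (hypothetical stamp file)."""
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(tree_digest(lib_src))
```

The stamp file should live outside the hashed source directory, otherwise writing it would change the digest. A build step would call `needs_reinstall` first, reinstall the libraries into the shared venv only when it returns `True`, and then call `record_install`.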

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

revit13, Jul 01 '24 18:07

It seems that the following approach could help:

  1. Create a shared venv in the main repo where the data-prep-lab libraries are installed. This shared venv is re-created only when the sources of these libraries change.
  2. Create a venv in each transform directory, as is currently done (under noop/ray, noop/python....), and install the local transform lib there. Use PYTHONPATH to point to the shared venv directory.

I tested it locally and it seems to work.
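For illustration, the PYTHONPATH part of step 2 could look roughly like the sketch below. It is an assumption-laden example, not the code I tested: the helper names (`site_packages`, `env_with_shared_libs`) are hypothetical, and it assumes the shared venv was created with the same Python version that runs the helper.

```python
import os
import sys
from pathlib import Path


def site_packages(venv_dir: Path) -> Path:
    """Guess the site-packages directory of a venv (assumes the current Python version)."""
    if os.name == "nt":
        # Windows venv layout
        return venv_dir / "Lib" / "site-packages"
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return venv_dir / "lib" / version / "site-packages"


def env_with_shared_libs(shared_venv: Path) -> dict:
    """Copy the current environment and put the shared venv's packages on PYTHONPATH."""
    env = dict(os.environ)
    shared = str(site_packages(shared_venv))
    existing = env.get("PYTHONPATH")
    # Keep any PYTHONPATH the caller already set, after the shared directory.
    env["PYTHONPATH"] = shared + os.pathsep + existing if existing else shared
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the Python interpreter of a transform's own venv, so that the transform sees both its local packages and the shared libraries.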

@daw3rd, if this approach seems OK to you, I can implement it. Thanks.

revit13, Jul 08 '24 08:07