oumi icon indicating copy to clipboard operation
oumi copied to clipboard

Add `OUMI_TRY_EDITABLE_INSTALL` env var to do editable Oumi install from source in job configs

Open wizeng23 opened this issue 9 months ago • 3 comments

Description

Inspired by #1359. Our configs install oumi from PyPi for convenience for external users, but it's more friction for developers. If OUMI_TRY_EDITABLE_INSTALL is set to a truthy value ("true", "yes", "1", etc.), we will attempt to modify pip install oumi commands in the setup section of job configs to install editable from source instead (pip install oumi[gpu] -> pip install -e '.[gpu]'). The logic tries to handle many possible edge cases, but may not be perfect. Hopefully this should be fine since it's mainly an experimental flag for development.

This PR also modifies the dev GCP job to do an editable install from source by default.

TODO: Add documentation on the dev environment setup page after we finalize the PR. TODO: Modify to use regex

Tested that this works on a GCP job.

Further justification

This PR is helpful for anyone making core code changes to Oumi that can't be tested locally, ex. it affects a large model. The pain points of doing this currently are:

  • Need to manually modify target configs to do editable install. This is easy to forget (at least for me) since we've only recently started installing from PyPi.
  • When making a commit, the above all modifications have to be undone first, then added back later if the PR isn't complete.

Pros of this PR:

  • Allow devs to have editable install by default. IMO the only time I'd use the PyPi package is to repro a user-reported issue.
  • Since the logic is gated by an env var, it shouldn't affect anyone who doesn't enable it

Cons of this PR:

  • Hacky in its current form and potentially error-prone (regex may be better)
  • Code bloat, as this may not be useful for the majority of users.

Related issues

Fixes OPE-1043.

Before submitting

  • [ ] This PR only changes documentation. (You can ignore the following checks in that case)
  • [x] Did you read the contributor guideline Pull Request guidelines?
  • [x] Did you link the issue(s) related to this PR in the section above?
  • [x] Did you add / update tests where needed?

wizeng23 avatar Feb 11 '25 00:02 wizeng23

Agree with 2. For 1, while I like your approach in the code you linked, I'm wondering if it doesn't work as well in this case. I imagine two cases:

  1. We assume that everyone using a dev version of Oumi wants to enact this logic. However, what if that's not the case for some reason, and they have no way to disable this logic?
  2. We add a CLI prompt every time the user kicks off a job, which feels like too much friction to me. We could save the answer after they respond for the first time, but to me that feels equivalent to the env var approach. I guess to me, the difference is that for your working_dir code, the prompt would be to catch an edge case, whereas here, this is a core question that devs face when using the job launcher.

wizeng23 avatar Feb 12 '25 00:02 wizeng23

Appreciate everyone's detailed feedback! After my oncall, I'll switch to regex and address everyone's feedback before switching out of draft mode.

wizeng23 avatar Feb 18 '25 08:02 wizeng23

Appreciate everyone's detailed feedback! After my oncall, I'll switch to regex and address everyone's feedback before switching out of draft mode.

+1 Would be real good to get this in in some form. Otherwise testing local changes against existing configs is a pain and error-prone .

nikg4 avatar Feb 20 '25 23:02 nikg4