
Docker QoL

Open rmartin16 opened this issue 3 years ago • 8 comments

Changes

  • Decouple AppImage and Docker somewhat
    • As part of trying to ensure docker.prepare() had already been run to build the Docker image when needed, I ultimately refactored as much of the direct Docker coupling out of the AppImage build as possible.
    • There were two modes of running commands in Docker: 1) using the docker object from the with dockerize manager, and 2) letting downstream self.subprocess calls be indirectly invoked in Docker by virtue of the fact that the with dockerize manager swaps out self.subprocess.
    • This normalizes that to simply letting self.subprocess calls run in the desired environment (see the sketch after this list).
  • Ensure Docker image is available for all commands (Fixes #796)
    • As part of that decoupling, preparing the docker environment is handled in the dedicated docker code path instead of trying to strategically do it within the business logic.
    • It tracks whether prepare() has been called for the current app and only attempts to build the Docker image for that app once per briefcase invocation.
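
To make the normalization concrete, here's a rough sketch of the two shapes of subprocess object involved (the class names below are simplified stand-ins for illustration, not the actual Briefcase implementation):

```python
import subprocess


class Subprocess:
    """Run commands directly on the host machine."""

    def run(self, args, **kwargs):
        return subprocess.run(args, **kwargs)


class DockerSubprocess:
    """Run the same commands, but wrapped in `docker run` against a prepared image."""

    def __init__(self, image_tag):
        self.image_tag = image_tag

    def run(self, args, **kwargs):
        return subprocess.run(
            ["docker", "run", "--rm", self.image_tag, *args], **kwargs
        )


# Previously, commands either used the docker object directly or relied on the
# `with dockerize(...)` context manager temporarily swapping out self.subprocess.
# After this change, the command simply holds whichever of these two objects
# matches the target environment, and downstream code calls
# self.subprocess.run(...) without caring which one it has.
```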

Depends on:

  • #810

PR Checklist:

  • [x] All new features have been tested
  • [x] All new features have been documented
  • [x] I have read the CONTRIBUTING.md file
  • [x] I will abide by the code of conduct

rmartin16 avatar Jul 26 '22 04:07 rmartin16

PRs can be chained

Ahh, ok....I wasn't sure how GitHub supported that type of workflow, but that definitely makes sense. All these changes mostly grew organically out of each other (and there were other changes I ultimately removed), but by the time I was typing "this is totally unrelated but" I should have started creating more PRs :)

we should replace the class variable with something that actively inspects the docker repository

I can get behind that. However, I think an explicit decision point about system_requires may be appropriate. If docker build doesn't run and system_requires has changed, then the new system requirements won't be installed.

Is the create command the way users should apply changes from system_requires?

I've still been trying to work out the exact scope of each briefcase command....but I think this one may just be a little special since (I believe) users are expected to manage system requirements independently of briefcase when docker isn't in play.

rmartin16 avatar Aug 03 '22 22:08 rmartin16

we should replace the class variable with something that actively inspects the docker repository

I can get behind that. However, I think an explicit decision point about system_requires may be appropriate. If docker build doesn't run and system_requires has changed, then the new system requirements won't be installed.

Is the create command the way users should apply changes from system_requires?

I've still been trying to work out the exact scope of each briefcase command....but I think this one may just be a little special since (I believe) users are expected to manage system requirements independently of briefcase when docker isn't in play.

That's an interesting point. The current implementation builds the Docker environment as part of create on the basis of "creating an environment in which we can build", but the use case you've described is definitely problematic. Although system_requires doesn't change often, we need to detect those changes and apply them to the environment; and I wouldn't consider adding a system package to support a new Python package to be a step that should require a full re-create.

That would suggest we should move the Docker build into the build step. That would result in the build always happening in an environment that matches the current system_requires declaration. In the case of system_requires not changing, Docker's caching means the operation is fairly fast. If you've got 2 projects with the same ID, but slightly different system_requires declarations, both builds will be cached, but the tag will switch between the two hashes.

In the case of a direct briefcase run, there's only one call to build the environment; so we don't need to do any tracking or detection of "has this been built". We always build, once, in the build step; some builds are an effective no-op.

freakboy3742 avatar Aug 04 '22 00:08 freakboy3742

I added a step at the beginning of briefcase build to ensure the Docker image is up-to-date. All other code paths that require the Docker image will only attempt to build it if a call to docker images suggests it doesn't already exist.
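
A minimal sketch of that existence check (the helper name and image tag are illustrative, not the actual Briefcase code):

```python
import subprocess


def docker_image_exists(image_tag: str) -> bool:
    # `docker images -q TAG` prints the local image ID if the image exists,
    # and prints nothing if it doesn't.
    result = subprocess.run(
        ["docker", "images", "-q", image_tag],
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(result.stdout.strip())


if not docker_image_exists("briefcase/com.example.myapp:latest"):
    print("Image missing; a docker build would be triggered here.")
```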

I also noticed something else....I'm not sure I understand how briefcase update -d should work for Docker AppImage builds. Currently, its pip install call doesn't run inside Docker like the pip install call during briefcase create does. However, even when I have it run within Docker to add a new dependency, it doesn't work (as in the app can't use the added dependency). I'll probably take a further look at this later but wanted to mention it here.

[edit rmartin16] I stepped through the whole Docker AppImage build again and briefcase update -d does appear to be working for me now....must have been doing something wrong previously. I then wondered if briefcase update -d should also force a rebuild of the Docker image to pick up changes to system_requires, since they're logically "dependencies" here....however, I believe that briefcase build will be necessary to meaningfully use the newly installed Python dependencies, so perhaps that would be overkill or at least unnecessary.

rmartin16 avatar Aug 07 '22 20:08 rmartin16

Thank you for the perspective; I was honestly starting to feel like Moses wandering in the desert with this.

I had to read through it a few times to make sure I'm following but I think your suggestion really gets at what I wanted to do initially: only build the Docker environment the first time it's needed for any command.

I had noticed the state object being managed in the high-level command code....but I guess I never put together how it was intended to be used. This definitely seems like a good fit.

As for using a direct "Subprocess/Docker" object to proxy commands, my initial concern is install_app_dependencies() during create....perhaps that will still require the orig_subprocess dance.

rmartin16 avatar Aug 08 '22 19:08 rmartin16

I had to read through it a few times to make sure I'm following but I think your suggestion really gets at what I wanted to do initially: only build the Docker environment the first time it's needed for any command.

That's the core of it, yes.

I had noticed the state object being managed in the high-level command code....but I guess I never put together how it was intended to be used. This definitely seems like a good fit.

FWIW, the test suite makes much more use of this feature; it's used in the command tests to track what work has been done by the commands. However, the option is there to use the same feature to pass context between commands as needed.

As for using a direct "Subprocess/Docker" object to proxy commands, my initial concern is install_app_dependencies() during create....perhaps that will still require the orig_subprocess dance.

Sure - but the Docker object is a wrapper around a subprocess instance, so once you've got a Docker instance, you know you've got a subprocess object you can use. All I'm suggesting is that L219-221 be factored out (almost literally) into a method.

freakboy3742 avatar Aug 09 '22 00:08 freakboy3742

The code is finalized; however, I want to do some more testing....especially around multi-app projects. Should be able to do that tomorrow.

rmartin16 avatar Aug 12 '22 01:08 rmartin16

I feel like I have a mental block with this PR, so I wanted to see if this moves us closer towards a final solution.

I moved the build environment management up into BaseCommand. With this approach, any Command implementation can call verify_build_environment() at will and retrieve a previously prepared subprocess or have one created on the fly; therefore, calling it in CreateCommand.create_app() isn't strictly necessary.

That said, whether CreateCommand.create_app() should be responsible for returning the build subprocesses in state, or whether the specific implementation should do it, seems debatable. Given that build_subprocesses is a bit of a new concept for a Command to track, it may even be appropriate for base.full_options() to include build_subprocesses in state all the time.
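
Roughly the shape I have in mind, just so the state-passing is concrete (the method and key names here are placeholders I'm using for discussion, not a final API):

```python
class BaseCommand:
    def verify_build_environment(self, app, state=None):
        # Reuse a build subprocess prepared by an earlier command in this
        # briefcase invocation, if one was passed along in `state`; otherwise
        # create one on the fly (building the Docker image if necessary).
        build_subprocesses = (state or {}).get("build_subprocesses", {})
        if app.app_name not in build_subprocesses:
            build_subprocesses[app.app_name] = self._prepare_build_subprocess(app)
        return build_subprocesses

    def _prepare_build_subprocess(self, app):
        # Placeholder: would return a host Subprocess, or a Docker-backed
        # equivalent, depending on the target environment.
        raise NotImplementedError
```

The returned mapping would then be included in the state the command returns (e.g. via full_options()), so later commands could pick it up.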

Anyway, please let me know if this is more in line with what you're imagining. I still need to flesh out the testing....and I'm not married to any of this naming. Thanks.

rmartin16 avatar Aug 15 '22 23:08 rmartin16

This makes sense....albeit with a bit of scope ballooning. I'll try to at least get a skeleton together to help flesh out this idea.

rmartin16 avatar Aug 19 '22 20:08 rmartin16

sdk_manager isn't especially fast, so once you've run sdk_manager --list_packages, it would be desirable to keep that cache for later use.

This specific example no longer applies now that #832 has been merged, but I agree it's confusing to have multiple instances of AndroidSDK in existence. See the notes at https://github.com/beeware/briefcase/pull/832#issuecomment-1220434236.

mhsmith avatar Aug 20 '22 12:08 mhsmith

Update: I'll have time this week to return to this.

rmartin16 avatar Aug 28 '22 22:08 rmartin16

Oh, my apologies; that last commit was only intended as a merge commit from main. (And to that end, I'm experimenting with strategies other than rebase/force-push for catching a branch up to main....open to any resources you recommend, since there seems to be plenty of debate in this area of git.)

Nonetheless, in the future I will be sure to provide a "current status" update with any changes, so that it isn't ambiguous whether a review is currently needed.

I worked on creating the ToolCache today, which allows instantiated tools to persist from one command to the next. This initial implementation, though, only uses the ToolCache to "retrieve" a tool and bind it to the command in the same way it does today. This feels unnecessary (but it meant a lot less code had to change to get it working); so, I'll probably create "verify" methods on the ToolCache (such as verify_git()) that'll make the tool available via command.tools.<tool_name>. That'll help ensure that tools are also verified before they are used....lest we start running a command only to verify a tool in the middle of execution.
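
Something along these lines is what I'm picturing for the verify methods (very much a sketch; the shutil.which() check stands in for the real verification logic):

```python
import shutil


class ToolCache:
    """Holds instantiated tools so they persist from one command to the next."""

    def __init__(self):
        self._tools = {}

    def verify_git(self):
        # Verify the tool the first time it's requested, then hand back the
        # cached result on every later call, so it's available as tools.git
        # for any subsequent command in the same briefcase invocation.
        if "git" not in self._tools:
            git_path = shutil.which("git")  # stand-in for the real verification
            if git_path is None:
                raise RuntimeError("git is not installed")
            self._tools["git"] = git_path
        return self._tools["git"]

    def __getattr__(self, name):
        # Allow verified tools to be accessed as attributes, e.g. tools.git.
        try:
            return self._tools[name]
        except KeyError:
            raise AttributeError(f"{name} has not been verified yet")


tools = ToolCache()
tools.verify_git()   # verifies and caches
print(tools.git)     # reuses the cached value
```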

I did notice a lot of the tools bind to the current command. Given this change would allow tools to be used across multiple commands, this feels a bit like a ticking time bomb....but in practice, the tools seem to only use command-agnostic properties/methods...

Finally, I still need to work in the "app-specific" tooling. I will start on that tomorrow.

Sorry again about the confusion.

rmartin16 avatar Sep 05 '22 02:09 rmartin16

Oh, my apologies; that last commit was only intended as a merge commit from main. (And to that end, I'm experimenting with strategies other than rebase/force-push for catching a branch up to main....open to any resources you recommend, since there seems to be plenty of debate in this area of git.)

When a PR has already been seen by someone else, I think a normal merge from main is the right answer. And it works well with GitHub's review system, because when you click the "view changes since I last reviewed" link, it omits changes which were merged from main, which is exactly what you want.

mhsmith avatar Sep 05 '22 07:09 mhsmith

@freakboy3742, @mhsmith

Updates

  • General skeleton of a ToolCache is implemented.
  • All integrations/tools are provided by the ToolCache.
  • The concept of "app-specific tools" is introduced and made available to all commands and platforms.

Design

  • ToolCache
    • The cache is available via command.tools.
    • The ToolCache facilitates:
      • verifying a tool via a tools.verify_<tool>() call
      • exposing a tool's interface
    • As such, each command must verify a tool before attempting to use it.
    • The only new tool is build_subprocess.
      • This tool is not named after the Command; instead, its name communicates that it provides the subprocess interface to the build environment (which may not be the native/host environment).
      • In this way, it represents where the action is taken...and not the context it's occurring in.
    • The current app is being tracked within this cache....I'm not particularly satisfied with this yet...have to think about it more.
  • App-Specific Tools
    • Alongside the existing verify_tools() method, verify_app_tools() is introduced.
    • Any command with app-specific tools should override this method to verify the tool is available.
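
To make the app-specific piece concrete, this is the kind of override I have in mind (the stub classes and the verify_build_subprocess() name below are illustrative, not the final code):

```python
class ToolCache:
    # Minimal stub standing in for the cache described above.
    def verify_git(self):
        print("git verified")

    def verify_build_subprocess(self, app):
        print(f"build subprocess verified for {app}")


class BaseCommand:
    def __init__(self):
        self.tools = ToolCache()

    def verify_tools(self):
        # App-agnostic tools, verified once per briefcase invocation.
        self.tools.verify_git()

    def verify_app_tools(self, app):
        # Overridden by commands/platforms with app-specific tooling;
        # the base implementation has nothing to verify.
        pass


class LinuxAppImageBuildCommand(BaseCommand):
    def verify_app_tools(self, app):
        # build_subprocess provides the subprocess interface to the build
        # environment for this app, which may be Docker rather than the host.
        self.tools.verify_build_subprocess(app)


cmd = LinuxAppImageBuildCommand()
cmd.verify_tools()
cmd.verify_app_tools("myapp")
```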

Feedback on the big picture is welcome and desired. I wouldn't bother reviewing the code too closely yet as it is not final and I haven't tested most workflows.

rmartin16 avatar Sep 06 '22 01:09 rmartin16

It may also be worth moving cookiecutter, requests, os, sys, stdlib_platform and shutils into ToolsCache() as tools that exist without needing verification. [...] One additional argument in favour of this - the stdlib_platform name exists specifically because of the name clash with command.platform as an attribute tracking the platform that the command is running on.

Since we're revisiting this area: is it necessary to have attributes for external modules like os and sys purely for the purpose of intercepting them in unit tests? Why not just use pytest's monkeypatch on the external module directly?

mhsmith avatar Sep 08 '22 08:09 mhsmith

This makes sense; thanks. I don't think I spent long enough evaluating different approaches before attempting to basically retrofit a cache. This approach is definitely more holistic.

This is what I distilled from your feedback:

  • Regress ToolCache to simple container
  • Refactor Tool verification from Command dependency to ToolCache dependency
  • Introduce separate ToolCache for app-specific tools
  • Integrate Docker tool use into the app ToolCache

RE: moving the stdlib and third-party modules into command.tools, I agree with @mhsmith; this feels somewhat strange to me (perhaps like cover for deficiencies elsewhere)...however, having mocked some of the imported stdlib modules for subprocess.py via monkey patching, it seems like a lot more trouble than just replacing command.os.environ, for instance.
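
For comparison, the two testing styles look roughly like this (the command fixture below is a hypothetical stand-in for a Briefcase command instance):

```python
import os


def test_env_with_monkeypatch(monkeypatch):
    # Patch the real stdlib module directly, as mhsmith suggests.
    monkeypatch.setenv("MY_SETTING", "from-monkeypatch")
    assert os.environ["MY_SETTING"] == "from-monkeypatch"


def test_env_via_command_attribute(command):
    # Replace the module cached on the command object, as the existing tests do.
    # `command` is a hypothetical fixture; command.os is the attribute that
    # currently stands in for the raw os module.
    command.os.environ = {"MY_SETTING": "from-attribute"}
    assert command.os.environ["MY_SETTING"] == "from-attribute"
```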

Just for fun....Hynek has thoughts about mocking other people's stuff at all.

rmartin16 avatar Sep 08 '22 19:09 rmartin16

Sounds like we're on the same page regarding the design of ToolCache.

As for the mocking thing - the current setup (modules cached on the Command object) was a vague attempt to actually do the sort of thing that Hynek was proposing. The theory was that by making os, sys, etc attributes of command, they become an "API that we control", albeit one that has exactly the same API surface as the "raw" module.

In practice... I'm not sure it's been as successful as I hoped. I'm not opposed to reverting to using "raw" system modules and monkeypatch if others feel that makes more sense.

freakboy3742 avatar Sep 08 '22 23:09 freakboy3742

WIP update: this is not ready for review yet, but I've implemented the intended changes. I still need to run through the workflows on Windows and macOS (but everything works on Linux).

Once I've verified things at least superficially work for each platform, I'll provide an update here calling out the more opinionated parts of the implementation for feedback and then get started on updating all the tests.

rmartin16 avatar Sep 16 '22 23:09 rmartin16

The ToolCache implementation is complete and ready for review. I am working on making tests operational now.

rmartin16 avatar Sep 18 '22 18:09 rmartin16

ooooook....after a bit of a tour de force through the tests, they're actually running to completion again. However, they still need additional updates to cover new code. stay tuned.

rmartin16 avatar Sep 22 '22 01:09 rmartin16

ooooook....after a bit of a tour de force through the tests, they're actually running to completion again. However, they still need additional updates to cover new code. stay tuned.

/me hands out a cup of Gatorade to the runner and cheers as he goes past :-)

freakboy3742 avatar Sep 22 '22 02:09 freakboy3742

nearly there. everything is in place but refinement is needed in some places. hoping to wrap this up tomorrow.

rmartin16 avatar Sep 23 '22 01:09 rmartin16

Codecov Report

Merging #801 (a7f8d47) into main (6d8b6e8) will increase coverage by 0.21%. The diff coverage is 99.06%.

Impacted Files | Coverage Δ
src/briefcase/platforms/windows/app.py 95.34% <50.00%> (ø)
src/briefcase/platforms/windows/visualstudio.py 95.34% <50.00%> (ø)
src/briefcase/integrations/git.py 66.66% <83.33%> (+11.11%) ↑
src/briefcase/integrations/xcode.py 97.16% <93.33%> (+0.14%) ↑
src/briefcase/integrations/android_sdk.py 99.32% <98.82%> (+<0.01%) ↑
src/briefcase/cmdline.py 100.00% <100.00%> (ø)
src/briefcase/commands/base.py 98.21% <100.00%> (-0.40%) ↓
src/briefcase/commands/build.py 100.00% <100.00%> (ø)
src/briefcase/commands/create.py 99.70% <100.00%> (+<0.01%) ↑
src/briefcase/commands/dev.py 95.31% <100.00%> (+0.07%) ↑
... and 24 more

codecov[bot] avatar Sep 23 '22 22:09 codecov[bot]

I need a stiff drink and a lie down... :-)

you're telling me man :)

Thanks for sticking it out with me. I think we finally made it....and with even better coverage than we started with.

Let me know if I trimmed too much meat out of the AppImage tests wrt app tool verification.

Also, I had to move where verify_app_tools() is called in each command one final time. I kept switching between running app tool verification before or after prerequisite commands ran; I landed on after, given the requirement for part of the CreateCommand to run before tools can be verified.

rmartin16 avatar Sep 24 '22 00:09 rmartin16