
Improve behaviour of `HOMEBREW_INSTALL_FROM_API`

Open MikeMcQuaid opened this issue 2 years ago • 6 comments

HOMEBREW_INSTALL_FROM_API is still in beta and it would be good to improve it as much as we can before releasing it to everyone.

Various fixes need to be done, e.g. fixing issues such as https://github.com/Homebrew/brew/issues/12357, but it also needs more testing and improvements to any rough edges.

MikeMcQuaid avatar Feb 01 '22 14:02 MikeMcQuaid

Thanks for adding this!

The other thing that I've been wanting to focus on (and haven't had the time to do) is to integrate the logic for HOMEBREW_INSTALL_FROM_API better. Right now, it feels to me like this feature is almost hacked together as an add-on to Homebrew, and it's probably not super clear to someone who's not familiar with the code. Plus, this means that there are a handful of weird edge cases that have to be handled (I've run into a lot of these). I'd like to take a step back and think about how to properly handle installing from API and normal installation and try to make them both feel natural with the rest of the Homebrew codebase. I can elaborate more on some of my concerns if desired.

I don't know if this is an achievable goal, since to really "do this right" might require some pretty big changes, e.g. in Formulary and how formulae are installed. Also, this extension might not be suitable for participants who aren't very comfortable in the Homebrew codebase, since it's hard to figure out how to do things seamlessly if you're not familiar with how it works currently. Figured it would be worth marking that down here anyway, though.

Rylan12 avatar Feb 02 '22 17:02 Rylan12

> I can elaborate more on some of my concerns if desired.

Please do!

> Also, this extension might not be suitable for participants who aren't very comfortable in the Homebrew codebase since it's hard to figure out how to do things seamlessly if you're not familiar with how it works currently.

Yeh, it might be tough but we could either guide this work or just spell out some guidelines for improvement.

MikeMcQuaid avatar Feb 02 '22 19:02 MikeMcQuaid

Sorry for the delay here.

At the moment, we download the necessary bottle files right at the beginning, based on the named args that were passed. This is handled in different places for each installation command. For `brew upgrade` and `brew reinstall`, it's done in the command file. The link is here for `brew upgrade` and here for `brew reinstall`. However, in `brew install`, the logic happens here and here in `CLI::NamedArgs`.

The way it's done in both cases is to run `Homebrew::API::Bottle.fetch_bottles`, which attempts to download all of the bottles and then just tells Formulary to remember that these downloaded files exist. But all of this is based on the data in the formula bottle API. This doesn't include build dependencies or any OS-specific dependency information, meaning that sometimes too many dependencies are downloaded and sometimes not enough (e.g. when building from source). So, in reality, it feels to me like what's happening is we're saying "oh, the user has HOMEBREW_INSTALL_FROM_API set, so let's try to download what we think they'll need ahead of time". In most cases this isn't a problem, since we can write off lots of it as "unsupported with HOMEBREW_INSTALL_FROM_API", which is totally fine.
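To make the mismatch concrete, here is a minimal toy model in Ruby. It is only a sketch: `BOTTLE_API`, `FULL_FORMULA`, `prefetch` and the formula names are all hypothetical and are not Homebrew's real data or API. It assumes the bottle API lists runtime dependencies only, while the full formula also knows about build dependencies and dependencies that are skipped on the current OS.

```ruby
require "set"

# Hypothetical stand-in for the bottle API: runtime dependencies only.
BOTTLE_API = {
  "foo" => { runtime_deps: ["bar", "linux-only-lib"] },
  "bar" => { runtime_deps: [] },
  "linux-only-lib" => { runtime_deps: [] },
}.freeze

# Hypothetical information that only the full formula carries.
FULL_FORMULA = {
  "foo" => { build_deps: ["cmake"], skipped_on_this_os: ["linux-only-lib"] },
}.freeze

# Eagerly collect everything the bottle API says might be needed,
# following runtime dependencies breadth-first.
def prefetch(names)
  fetched = Set.new
  queue = names.dup
  until queue.empty?
    name = queue.shift
    next unless fetched.add?(name) # skip names we have already seen

    queue.concat(BOTTLE_API.fetch(name)[:runtime_deps])
  end
  fetched
end

fetched = prefetch(["foo"])

# Downloaded although this OS never needs it.
too_many = fetched & FULL_FORMULA["foo"][:skipped_on_this_os]

# Needed for a source build but never downloaded.
missing = FULL_FORMULA["foo"][:build_deps].reject { |dep| fetched.include?(dep) }

puts "fetched:  #{fetched.to_a.inspect}"
puts "too many: #{too_many.to_a.inspect}"
puts "missing:  #{missing.inspect}"
```

In this toy setup the prefetch step both over-fetches (`linux-only-lib`) and under-fetches (`cmake`), which is the situation described above.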

So what happens when things go wrong? Well ultimately if we didn't download the right thing, Formulary decides to throw a CoreTapFormulaUnavailableError which is an error that (so far) can only be thrown when HOMEBREW_INSTALL_FROM_API is set. This tells Homebrew "hey this formula is missing but it should be here since it's a core formula" and then leaves it up to the receiver of that error to figure out what to do. Sometimes we just ignore it, and sometimes we decide to just download the bottle at that time anyway.
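As a rough illustration of that pattern, the sketch below uses hypothetical stand-in classes (`ToyCoreFormulaUnavailableError`, `ToyFormulary`) rather than Homebrew's actual `Formulary` or error class. It shows two callers reacting differently to the same error: one ignores it, the other downloads late and tries again.

```ruby
# Hypothetical stand-in for the error raised when a core formula
# was expected to be loaded but is not.
class ToyCoreFormulaUnavailableError < StandardError
  attr_reader :name

  def initialize(name)
    @name = name
    super("core formula #{name} was expected but is not loaded")
  end
end

# Hypothetical stand-in for a formula loader that only knows about
# whatever was downloaded up front.
class ToyFormulary
  def initialize(prefetched)
    @prefetched = prefetched
  end

  def factory(name)
    raise ToyCoreFormulaUnavailableError, name unless @prefetched.include?(name)

    name
  end

  # Stands in for "remember that this bottle has now been downloaded".
  def remember(name)
    @prefetched << name
  end
end

# Caller A: ignore the error and carry on without the formula.
def load_or_skip(formulary, name)
  formulary.factory(name)
rescue ToyCoreFormulaUnavailableError
  nil
end

# Caller B: download at the point of failure, then load again.
def load_or_fetch(formulary, name)
  formulary.factory(name)
rescue ToyCoreFormulaUnavailableError => e
  formulary.remember(e.name)
  formulary.factory(name)
end

formulary = ToyFormulary.new(["foo"])
skipped = load_or_skip(formulary, "cmake")
fetched_late = load_or_fetch(formulary, "cmake")

puts "skipped:      #{skipped.inspect}"
puts "fetched late: #{fetched_late.inspect}"
```

Each receiver of the error has to pick its own policy, so the download logic ends up spread across callers.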

So my point is that we have conflicting ways to handle downloading the bottles. We get everything at the beginning and assume that we'll use those later on and that they'll be enough, but then sometimes we need to download more things so we just do it anyway. That works okay since a lot of these things are edge cases, but they kind of seem like edge cases that we created as a result of doing things the way that we do. It just feels a bit off to me.


I believe that there should be a way to improve this. One option is probably to just move all of the bottle downloading to be as-needed. So instead of predicting at the beginning, we just download formula bottles when we encounter missing formulae. That way, there don't need to be any edge cases about missing formulae. If `Formulary::factory` fails, we know that the formula definitely doesn't exist, since we'd have already tried the API and all other options.

Similarly, if we come across a new dependency that we didn't know about before, it's not a problem at all since we just treat it like any other dependency and figure out how to resolve it (including possibly downloading from the API). Essentially, let's keep the API logic out of the installation process, and move it to the formula resolution process.
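A minimal sketch of the as-needed idea, under the assumption that API lookup becomes just one more fallback inside formula resolution. `LazyFormulary`, `ToyFormulaNotFoundError`, `resolve` and the sample data are hypothetical names for illustration, not a proposal for the actual `Formulary` interface.

```ruby
# Hypothetical error for a formula that no source can provide.
class ToyFormulaNotFoundError < StandardError; end

# Hypothetical loader that falls back to the API only when needed.
class LazyFormulary
  attr_reader :api_requests

  def initialize(local:, api:)
    @local = local      # name => formula data already available locally
    @api = api          # name => formula data the API could serve
    @api_requests = []  # record of on-demand fetches, for illustration
  end

  # Try local data first, then the API; fail only when both miss.
  def factory(name)
    @local[name] ||= fetch_from_api(name)
  end

  private

  def fetch_from_api(name)
    @api_requests << name
    @api.fetch(name) { raise ToyFormulaNotFoundError, "no formula named #{name}" }
  end
end

# Dependencies go through the same factory call, so a dependency that
# was not known up front needs no special handling.
def resolve(formulary, name, seen = [])
  return seen if seen.include?(name)

  seen << name
  formulary.factory(name)[:deps].each { |dep| resolve(formulary, dep, seen) }
  seen
end

formulary = LazyFormulary.new(
  local: { "bar" => { deps: [] } },
  api: { "foo" => { deps: ["bar", "cmake"] }, "cmake" => { deps: [] } },
)
order = resolve(formulary, "foo")

puts "resolved:     #{order.inspect}"
puts "fetched late: #{formulary.api_requests.inspect}"
```

With this shape, a failure from `factory` means every source has been tried, and the install code never needs to know whether a formula came from disk or from the API.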

I don't really remember the reason for downloading everything at once at the beginning, but if we do still want to do that there are probably some improvements that can be made to how we detect and load those formulae.

Another option might be to follow up with the offline usage idea more. That way, we can avoid having weird edge cases where formulae that are expected to exist don't since they'll "exist" from the offline cache and can be downloaded on-the-fly as needed.

Rylan12 avatar Feb 11 '22 06:02 Rylan12

Thanks for writing this up, Rylan. Based on this: do you think we should avoid making it a GSOC project for the time being?

mistydemeo avatar Feb 14 '22 01:02 mistydemeo

No, I don't think I would say that. There is still clearly work that can be done here that would be beneficial to us, so I think that if a GSOC participant is interested in this project, they should be able to have the opportunity.

The reason progress has been slower than expected on this is mainly that I've had a lot less time this year than I expected to devote to it, but a GSOC participant should be able to devote more time here.

I think a good thing about this project is that it can be very flexible. It can start as simply adding brew info capability and testing the feature more extensively, and then if the participant has time and interest, they can start to think about and implement the improvements in my above comment.

Plus, depending on how it goes, the participant may be able to start working on the process of making this the default setting for new installations and migrating existing users to have this set by default (if we want).


In summary, here are some good reasons to open it up as a GSOC project: we want it to be done, the work should be manageable and is flexible, and it's a feature that will be directly useful to many Homebrew users.

There are two downsides to this as a GSOC project that I can think of. First, it requires familiarity with some more advanced parts of the Homebrew codebase (e.g. formula loading/installation/dependencies). Since there will be mentors who are familiar with these things, I think this will be manageable. And the only real way to become familiar with these is to actually poke around. Second, I guess there is a small chance that after looking at all the options for improvement (e.g. those in my above comment), we could choose not to make any changes. But even if this happens, the other parts of the project would still be relevant.

Rylan12 avatar Feb 14 '22 03:02 Rylan12

> No, I don't think I would say that.

Agreed 👍🏻

MikeMcQuaid avatar Feb 14 '22 09:02 MikeMcQuaid