
The cask.json API is missing the download url and checksum for ARM variants

Open marcprux opened this issue 2 years ago • 9 comments

  • [x] your issue is with the https://formulae.brew.sh website and not the (generated) contents of a given formula/cask page.

What you were trying to do (and why)

I would like to use the cask.json API to preview the binary that would be installed for a cask, so I run:

# curl -s https://formulae.brew.sh/api/cask.json | jq '.[] | select(.token == "brave-browser") | .url'
"https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable/134.81/Brave-Browser-x64.dmg"

What happened (include screenshots)

On an M1 MBP, the binary that is downloaded when I actually install the cask is different from the one listed by the API (which always lists the Intel download):

marc@zap ~ % brew upgrade brave-browser

==> Upgrading 1 outdated package:
brave-browser 1.34.80.0,134.80 -> 1.34.81.0,134.81
==> Upgrading brave-browser
==> Downloading https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable-arm64/134.81/Brave-Browser-arm64.dmg

This is because the cask at https://formulae.brew.sh/api/cask-source/brave-browser.rb provides a different URL for Intel and ARM.

What you expected to happen

I would like to be able to preview the binary that will be downloaded. I suspect the easiest (and most backward-compatible) way would be to just add new "url_arm64" and "sha256_arm64" properties to the API that list the ARM build (if any), so that https://formulae.brew.sh/api/cask/brave-browser.json would look like:

{
  "token": "brave-browser",
  "full_token": "brave-browser",
  "tap": "homebrew/cask",
  "name": [
    "Brave"
  ],
  "desc": "Web browser focusing on privacy",
  "homepage": "https://brave.com/",
  "url": "https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable/134.81/Brave-Browser-x64.dmg",
  "url_arm64": "https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable-arm64/134.81/Brave-Browser-arm64.dmg",
  "appcast": null,
  "version": "1.34.81.0,134.81",
  "versions": {},
  "installed": null,
  "outdated": false,
  "sha256": "f048a08cc5d2ea6f237079e35e4c9b3c2485cbbcb5f7e7f3894d939c1fb134a3",
  "sha256_arm64": "1586701718deb654b2fdc6a9b2e4d4dce1de198710f37ef711a8d8c62f126c12",
  …
  "generated_date": "2022-01-24"
}

marcprux avatar Jan 25 '22 00:01 marcprux

This is a feature request for Homebrew/brew really because that's what generates the JSON output.

A new field would need to be generated here, yes, but it also perhaps needs a new Cask DSL for being able to conditionally set and generate these values. I'd also suggest they be provided in a nested format (while preserving backwards compatibility), where the top-level url depends on the machine you run it on, while the nested urls (or whatever they end up being called) always output both.

MikeMcQuaid avatar Jan 25 '22 08:01 MikeMcQuaid

I wonder if the various url combinations could be automatically evaluated by iterating over the possible values for the templates that are commonly used in download URLs, spoofing those values while evaluating the cask, and then storing the evaluated url and sha256 properties for each combination along with the property variant values that yielded the unique download URL.

This would support not only URLs that vary based on the architecture, but also the ones that vary based on language and OS version, like in the following url stanzas:

https://desktop.figma.com/#{arch}/Figma-#{version}.zip
https://ftp.mozilla.org/pub/thunderbird/releases/#{version}/mac/#{language}/Thunderbird%20#{version}.dmg
https://downloads.omnigroup.com/software/MacOSX/10.13/OmniPresence-#{version}.dmg
https://downloads.omnigroup.com/software/MacOSX/10.14/OmniPresence-#{version}.dmg
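The enumeration idea above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `expand_variants` and `AXES` are hypothetical names rather than Homebrew APIs, the axis values are simply the ones visible in the URLs in this thread, and a real cask can compute its url from arbitrary Ruby, which plain template substitution cannot cover.

```ruby
# Hypothetical sketch: expand a cask-style URL template over every
# combination of variant values and keep one entry per unique URL.
# AXES and expand_variants are illustrative names, not Homebrew APIs.

AXES = {
  "arch"     => ["mac", "mac-arm"],
  "language" => ["en-US", "pt-BR"],
}.freeze

def expand_variants(template, fixed, axes = AXES)
  # Only iterate over the axes that the template actually references.
  used = axes.select { |name, _| template.include?('#{' + name + '}') }
  names = used.keys
  combos = names.empty? ? [[]] : used.values.first.product(*used.values.drop(1))

  combos.map do |combo|
    chosen = names.zip(combo).to_h
    lookup = fixed.merge(chosen)
    # Substitute each placeholder; an unknown placeholder raises KeyError.
    url = template.gsub(/#\{(\w+)\}/) { lookup.fetch(Regexp.last_match(1)) }
    chosen.merge("url" => url)
  end.uniq { |variant| variant["url"] }
end

figma = expand_variants(
  'https://desktop.figma.com/#{arch}/Figma-#{version}.zip',
  { "version" => "108.1.0" }
)
thunderbird = expand_variants(
  'https://ftp.mozilla.org/pub/thunderbird/releases/#{version}/mac/#{language}/Thunderbird%20#{version}.dmg',
  { "version" => "91.5.1" }
)

(figma + thunderbird).each { |variant| puts variant.inspect }
```

Each result records the variant values that produced the URL, which is the information a `variants` entry would need alongside the checksum.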

These combinations could be expressed statically by the JSON API with a new variants property. For example, https://formulae.brew.sh/api/cask-source/figma.rb could yield a variants array in https://formulae.brew.sh/api/cask/figma.json like so:

{
  "token":"figma",
  …
  "url": "<existing intel URL>",
  "sha256": "<existing intel SHA256>",
  "variants": [
    {
        "arch": "arm",
        "url": "https://desktop.figma.com/mac-arm/Figma-108.1.0.zip",
        "sha256": "9920d4d9a4041f0c0b354b3bac0fa0d738b5dd441326dd114785e6b58de7142f"
    },
    {
        "arch": "intel",
        "url": "https://desktop.figma.com/mac/Figma-108.1.0.zip",
        "sha256": "a91e1ca5073f3d60f2a9f76ab805ca7e90c6c8c68b079c53f0c0ebd604b05710"
    }
  ]
}

In addition to the arch property, variants from different language and OS version could be specified with additional lang and os properties. For example, https://formulae.brew.sh/api/cask-source/thunderbird.rb would yield a variants property in https://formulae.brew.sh/api/cask/thunderbird.json like so:

{
  "token":"thunderbird",
  …
  "url": "<existing intel URL>",
  "sha256": "<existing intel SHA256>",
  "variants": [
    {
        "lang": "pt-BR",
        "url": "https://ftp.mozilla.org/pub/thunderbird/releases/91.5.1/mac/pt-BR/Thunderbird%2091.5.1.dmg",
        "sha256": "1e6befc77a2386f8c08caaaf486d3b3bb2a25535cd38565ee21d704ea9542414"
    },
    {
        "lang": "pt",
        "url": "https://ftp.mozilla.org/pub/thunderbird/releases/91.5.1/mac/pt-PT/Thunderbird%2091.5.1.dmg",
        "sha256": "ea43d50fcfd8bc1b56a76fc07af606310d94ed2d7217475b467c3b53a141b432"
    },
   …
    {
        "url": "https://ftp.mozilla.org/pub/thunderbird/releases/91.5.1/mac/en-US/Thunderbird%2091.5.1.dmg",
        "sha256": "736c3a41ab71f13c2002b54a25d223c147cc3375b3e61cd7f3247c5c9d791186"
    }
  ]
}

This would enable the cask API to declaratively enumerate all the possible download URLs and checksums for any given cask without needing any additions to the cask DSL itself. I expect this could be useful for a number of different applications, such as availability checking and malware scanning. In my case, I just want to pre-flight the binary that would be installed using HOMEBREW_INSTALL_FROM_API=1 brew install --cask caskname without having to first tap the cask and clone the repository.
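A client consuming such a document might pick its download roughly as follows. This is a sketch against the proposed shape only: `variants` is the property suggested in this comment, not a field the API is known to emit, `select_download` is a hypothetical helper, and the checksums are placeholders.

```ruby
require "json"

# Sample document in the proposed shape. "variants" is a proposal from
# this thread, and the sha256 values are placeholders.
CASK_JSON = <<~EOS
  {
    "token": "figma",
    "url": "https://desktop.figma.com/mac/Figma-108.1.0.zip",
    "sha256": "<intel sha256>",
    "variants": [
      { "arch": "arm",
        "url": "https://desktop.figma.com/mac-arm/Figma-108.1.0.zip",
        "sha256": "<arm sha256>" },
      { "arch": "intel",
        "url": "https://desktop.figma.com/mac/Figma-108.1.0.zip",
        "sha256": "<intel sha256>" }
    ]
  }
EOS

# Return the first variant whose declared properties (everything except
# url and sha256) all match the wanted values; fall back to the
# top-level url/sha256 when no variant matches.
def select_download(cask, wanted)
  match = cask.fetch("variants", []).find do |variant|
    variant.reject { |key, _| %w[url sha256].include?(key) }
           .all? { |key, value| wanted[key] == value }
  end
  source = match || cask
  { "url" => source["url"], "sha256" => source["sha256"] }
end

cask = JSON.parse(CASK_JSON)
puts select_download(cask, "arch" => "arm")["url"]
```

A variant that declares no properties at all, like the last Thunderbird entry above, matches any request, so with this matching rule it only works as a default if it is listed last.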

marcprux avatar Feb 01 '22 18:02 marcprux

> These combinations could be expressed statically by the JSON API with a new variants property. For example, https://formulae.brew.sh/api/cask-source/figma.rb could yield a variants array in https://formulae.brew.sh/api/cask/figma.json like so:

Something like this would make sense to me as an API, yeh 👍🏻

MikeMcQuaid avatar Feb 02 '22 13:02 MikeMcQuaid

What if the variants were part of the request url?

e.g.,

https://formulae.brew.sh/api/x86_64/cask/figma.json
https://formulae.brew.sh/api/aarch64/cask/figma.json

That way the json could be generated from the relevant machine, cached and served up from S3 for example.

ldeck avatar May 16 '22 13:05 ldeck

@ldeck good thinking but ideally the JSON would encapsulate all variants (for both formulae and casks) and could be generated on any architecture or OS.

MikeMcQuaid avatar May 16 '22 20:05 MikeMcQuaid

Since people are discussing this: I took a stab at implementing an all-variants evaluation of the cask, and I determined it would be a lot of work to do this exhaustively. Since the cask's url can be calculated from any Ruby expression, there's no way to truly check all the possibilities. Even just iterating through all the possible language options (which some casks use for locale-specific installers) would mean that generating the complete JSON for a single cask requires dozens of re-evaluations of the cask, each with a spoofed language variable. Another common variant is the target OS version (e.g., for a High-Sierra-specific release), which adds another dimension of variants.

It looked to me like it would require a lot of internal refactoring. For example, all the common environment accessors used by the cask would need to become "spoof-able", so the code that runs through the possible variants would be able to set up the environment. It seems like it could certainly be done, just not by someone with my limited Ruby skills.
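To make "spoof-able" concrete, here is a minimal sketch of an accessor with a block-scoped override. `FakeHardware`, `with_arch` and `evaluate_cask` are hypothetical stand-ins invented for this illustration; they are not Homebrew's real accessors, and the URL is a placeholder.

```ruby
# Hypothetical stand-in for the environment accessors a cask reads.
module FakeHardware
  @override = nil

  class << self
    # Real hardware detection would go here; the sketch hard-codes :intel.
    def arch
      @override || :intel
    end

    # Pretend to be another architecture for the duration of the block,
    # restoring the previous value even if the block raises.
    def with_arch(value)
      previous = @override
      @override = value
      yield
    ensure
      @override = previous
    end
  end
end

# Stand-in for a cask body whose URL depends on the architecture.
def evaluate_cask
  folder = FakeHardware.arch == :arm ? "stable-arm64" : "stable"
  "https://example.invalid/#{folder}/App.dmg"
end

# Re-evaluate the same cask body once per spoofed architecture.
urls = %i[intel arm].to_h do |arch|
  [arch, FakeHardware.with_arch(arch) { evaluate_cask }]
end

urls.each { |arch, url| puts "#{arch}: #{url}" }
```

Every additional dimension (language, OS version) would need the same treatment, and the number of re-evaluations grows with the product of the dimensions.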

A quicker solution would be to just take @ldeck's suggestion, but as far as I know GH actions is only available on Intel Macs, so we'd still need a way to spoof the hardware environment for the purposes of API generation.

marcprux avatar May 16 '22 21:05 marcprux

> A quicker solution would be to just take @ldeck's suggestion, but as far as I know GH actions is only available on Intel Macs, so we'd still need a way to spoof the hardware environment for the purposes of API generation.

Yes, we probably need a new DSL for this.

We moved away from if OS.linux? and if OS.mac? to on_macos and on_linux blocks in formulae for this reason. We should do the same thing in casks and for ARM/different architectures in both formulae and casks.
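The reason blocks help can be shown with a toy DSL. With an `if`, only the branch for the current machine ever runs; with blocks, the evaluator decides which ones to call, so it can call all of them and record every branch. The class below is not Homebrew's implementation, and `on_arm` / `on_intel` are hypothetical names chosen to mirror `on_macos` / `on_linux`.

```ruby
# Toy DSL for illustration only; not Homebrew's implementation.
class ToyCask
  attr_reader :stanzas

  def initialize(&body)
    @stanzas = { arm: {}, intel: {} }
    @current = nil
    instance_eval(&body)
  end

  # Both blocks are always evaluated, whatever machine this runs on.
  def on_arm(&block)
    record(:arm, &block)
  end

  def on_intel(&block)
    record(:intel, &block)
  end

  def url(value)
    store(:url, value)
  end

  def sha256(value)
    store(:sha256, value)
  end

  private

  def record(arch)
    @current = arch
    yield
  ensure
    @current = nil
  end

  # Outside any on_* block, a stanza applies to every architecture.
  def store(key, value)
    targets = @current ? [@current] : @stanzas.keys
    targets.each { |arch| @stanzas[arch][key] = value }
  end
end

cask = ToyCask.new do
  on_arm do
    url "https://example.invalid/arm64/App.dmg"
    sha256 "<arm sha256>"
  end
  on_intel do
    url "https://example.invalid/x64/App.dmg"
    sha256 "<intel sha256>"
  end
end

puts cask.stanzas.inspect
```

Because nothing here queries the host, the same output can be generated on any architecture or OS.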

MikeMcQuaid avatar May 16 '22 21:05 MikeMcQuaid

> We moved away from if OS.linux? and if OS.mac? to on_macos and on_linux blocks in formulae for this reason. We should do the same thing in casks and for ARM/different architectures in both formulae and casks.

Interesting. I'm catching up on the history and discussion at https://github.com/Homebrew/brew/issues/11687#issuecomment-878128884.

I guess the problem is that the declarative API will never be able to exhaustively enumerate all of the possible cases for the url and checksum, since even with a custom DSL for some of the common variants (like arm/intel, language, and os version), it won't be able to derive all the possible values of those properties since they still depend on arbitrary ruby expressions.

The way I'm handling this in appfair.app now is to just manually download the ruby spec for any cask and pre-flight by running brew info locally to determine the authoritative URL and checksum, which the app then downloads to the expected cache location before invoking brew install. It works, but it would sure be nicer if the actual URL could be derived declaratively.

marcprux avatar May 17 '22 20:05 marcprux

> I guess the problem is that the declarative API will never be able to exhaustively enumerate all of the possible cases for the url and checksum, since even with a custom DSL for some of the common variants (like arm/intel, language, and os version), it won't be able to derive all the possible values of those properties since they still depend on arbitrary ruby expressions.

It doesn't need to handle all possible cases from arbitrary Ruby in the DSL; it just needs to be able to handle architecture and OS differences. Perfect is the enemy of good 😁

MikeMcQuaid avatar May 18 '22 18:05 MikeMcQuaid

@Rylan12 was this completed?

MikeMcQuaid avatar Feb 16 '23 14:02 MikeMcQuaid

Yes, the information is now available on formulae.brew.sh. However, it's worth noting that the original curl query from this issue won't work since the specific version being requested might be buried in the variations hash.

Instead, if you need information like this, you should use brew info --json=v2:

$ brew info --json=v2 brave-browser | jq '.casks[0].url'
"https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable-arm64/148.164/Brave-Browser-arm64.dmg"

$ curl -s https://formulae.brew.sh/api/cask.json | jq '.[] | select(.token == "brave-browser") | .url'
"https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable/148.164/Brave-Browser-x64.dmg"

Rylan12 avatar Feb 16 '23 17:02 Rylan12
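For readers who would still rather work from the JSON API than from brew info, resolving a variation might look roughly like the sketch below. Only the existence of a `variations` hash comes from the comment above. The inner key (`arm64_ventura`), the layout of each entry and the `resolve` helper are assumptions made for illustration, so check a real response before relying on any of them.

```ruby
require "json"

# Trimmed, illustrative document. The inner key and entry layout under
# "variations" are assumptions, not confirmed by this thread.
CASK_DOC = <<~EOS
  {
    "token": "brave-browser",
    "url": "https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable/148.164/Brave-Browser-x64.dmg",
    "variations": {
      "arm64_ventura": {
        "url": "https://updates-cdn.bravesoftware.com/sparkle/Brave-Browser/stable-arm64/148.164/Brave-Browser-arm64.dmg"
      }
    }
  }
EOS

# Overlay the chosen variation (if present) on the top-level fields.
def resolve(cask, variation_key)
  overrides = cask.fetch("variations", {}).fetch(variation_key, {})
  cask.reject { |key, _| key == "variations" }.merge(overrides)
end

doc = JSON.parse(CASK_DOC)
puts resolve(doc, "arm64_ventura")["url"]
puts resolve(doc, "some_other_key")["url"]
```

An unknown key falls back to the top-level values, which matches what the plain curl query in this thread returns.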