Update the Google Fonts indexing mechanism

Open ronaldtse opened this issue 2 years ago • 8 comments

I think we have two problems to deal with:

  1. Fetching and indexing Google Fonts into Formulas
  2. Allowing Fontist users to directly download fonts from Google Fonts in a non-interactive way

For 1, the best choice is to parse the google/fonts GitHub repository directly. Changes there may be more frequent than in their new Developer API, but that also means we can respond to them faster. We would need to parse font metadata ourselves, but that is reasonable because we need that functionality anyway for fonts sourced outside of Google Fonts.

For 2, users have two needs with Google Fonts:

  • Downloading the font file locally. We can link people to the Google Fonts GitHub.
  • Providing an HTTP link for web rendering. We don't serve this yet, and we will need the WOFF-type link to make it happen. The WOFF file and link are not provided in the GitHub repository, so we will need to use the Developer API.

Conclusion:

  1. For most Formula information, including local font download, we can populate from their GitHub repository.
  2. For public usage of a font, we will need to query the API for every font to obtain the public-facing font file links and WOFF links.

Regarding 1:

Interestingly, their metadata file (METADATA.pb in text protobuf) looks eerily similar to our Formula file:

  • https://github.com/google/fonts/blob/main/ofl/abhayalibre/METADATA.pb

Except that they have:

  • per font weight
  • designer
  • category: SERIF, SANS_SERIF, DISPLAY, MONOSPACE, ...
  • primary_script
  • subsets
  • optional classifications
  • optional stroke
  • optional source: lists the repository/commit that the font file and/or archive comes from. If there is a source, they have an upstream.yaml file that lists:
    • for repository: which branch, location of the license file, location of the TTF file inside that source repository.
    • for archive: URL and location of the license and TTF files
    • NOTE: They have a PR that merges upstream.yaml into METADATA.pb
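
As a rough illustration of how much parsing we would need to own, here is a minimal sketch that reads a METADATA.pb-style text block into a dictionary. The sample content is made up for illustration (it is not copied from any real METADATA.pb), the function names are hypothetical, and the parser only handles scalar fields plus one level of `name { ... }` blocks. A real implementation would more likely use a proper protobuf text-format parser.

```python
import re

# Made-up sample in the style of a METADATA.pb file (not real data).
SAMPLE = '''\
name: "Example Sans"
designer: "Jane Doe"
category: "SANS_SERIF"
fonts {
  name: "Example Sans"
  style: "normal"
  weight: 400
  filename: "ExampleSans-Regular.ttf"
}
fonts {
  name: "Example Sans"
  style: "normal"
  weight: 700
  filename: "ExampleSans-Bold.ttf"
}
subsets: "latin"
subsets: "latin-ext"
'''

FIELD = re.compile(r'^(\w+):\s*(.+)$')


def parse_value(raw):
    """Strip quotes from strings and convert bare integers."""
    raw = raw.strip()
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]
    try:
        return int(raw)
    except ValueError:
        return raw


def parse_metadata(text):
    """Parse scalar fields and one level of nested blocks.

    Every top-level key maps to a list, because fields such as
    `fonts` and `subsets` can repeat.
    """
    top = {}
    current = top
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        if line.endswith('{'):
            # Start of a nested block, e.g. "fonts {".
            current = {}
            top.setdefault(line[:-1].strip(), []).append(current)
        elif line == '}':
            current = top
        else:
            match = FIELD.match(line)
            if not match:
                continue
            key, value = match.group(1), parse_value(match.group(2))
            if current is top:
                top.setdefault(key, []).append(value)
            else:
                current[key] = value
    return top


meta = parse_metadata(SAMPLE)
for font in meta["fonts"]:
    print(font["filename"], font["weight"])
```

Keeping every top-level key as a list avoids special-casing repeated fields, at the cost of writing `meta["name"][0]` for single-valued ones.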

They also have DESCRIPTION.{lang}.html that provides a description of the font, in HTML. Perhaps we should also supply this on the Formulas site.

Maybe having this information would also be a good choice for Fontist...

Originally posted by @ronaldtse in https://github.com/fontist/fontist/issues/367#issuecomment-2035897013

ronaldtse Apr 04 '24 01:04

@alexeymorozov I have added the secret FONTIST_CI_GOOGLE_FONTS_API_KEY in the Fontist GitHub organization to allow for using the Google Fonts API.
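
For reference, a minimal sketch of how a CI script might build the API request URL from that secret. It assumes the secret is exposed to the job as an environment variable of the same name, which depends on the workflow configuration; `webfonts_url` is a hypothetical helper, and the fallback key `KEY123` is only a placeholder. No request is actually sent here.

```python
import os
from urllib.parse import urlencode

# Endpoint of the Google Fonts Developer API.
API_ENDPOINT = "https://www.googleapis.com/webfonts/v1/webfonts"


def webfonts_url(api_key, family=None):
    """Build the request URL, optionally restricted to one family."""
    params = {}
    if family:
        params["family"] = family
    params["key"] = api_key
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Assumption: the workflow maps the organization secret to this variable.
api_key = os.environ.get("FONTIST_CI_GOOGLE_FONTS_API_KEY", "KEY123")
print(webfonts_url(api_key, family="Trispace"))
```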

ronaldtse Apr 04 '24 01:04

This is now holding up release, so it is urgent.

opoudjis Apr 09 '24 07:04

Just to keep everyone updated on the progress: in the PR, installation now works with many formulas even without reindexing them. The remaining issue is how to reindex properly so that installation works with all formulas.

For example, Trispace has only one font file in the repo, but the site and the API return 8 fonts, so indexing the repo alone would "miss" the 7 extra fonts:

$ curl "https://www.googleapis.com/webfonts/v1/webfonts?family=Trispace&key=KEY123"
{
  "kind": "webfonts#webfontList",
  "items": [
    {
      "family": "Trispace",
      "variants": [
        "100",
        "200",
        "300",
        "regular",
        "500",
        "600",
        "700",
        "800"
      ],
      "subsets": [
        "latin",
        "latin-ext",
        "vietnamese"
      ],
      "version": "v24",
      "lastModified": "2023-08-25",
      "files": {
        "100": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbH9qoQl0zHugpt0.ttf",
        "200": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbP9roQl0zHugpt0.ttf",
        "300": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbCFroQl0zHugpt0.ttf",
        "regular": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbH9roQl0zHugpt0.ttf",
        "500": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbE1roQl0zHugpt0.ttf",
        "600": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbKFsoQl0zHugpt0.ttf",
        "700": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbJhsoQl0zHugpt0.ttf",
        "800": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbP9soQl0zHugpt0.ttf"
      },
      "category": "sans-serif",
      "kind": "webfonts#webfont",
      "menu": "https://fonts.gstatic.com/s/trispace/v24/Yq65-LKSQC3o56LxxgRrtA6yBqsrXL5GI5KI-IUZVGsxWFIlbH9rkQh-yA.ttf"
    }
  ]
}

And the current formula (which was created from the archive of an old API) contains as many as 35 fonts. Those no longer seem to be downloadable.
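
To make the mismatch concrete, here is a small sketch that flattens an API response into one record per downloadable file. The response below mirrors only the shape of the output above (`items`, `family`, `variants`, `files`); the URLs are placeholders on `example.invalid`, not the real gstatic links, and `api_font_files` is a hypothetical helper.

```python
# Variants as returned by the API for Trispace (see the response above).
VARIANTS = ["100", "200", "300", "regular", "500", "600", "700", "800"]

# Trimmed stand-in for the API response; URLs are placeholders.
response = {
    "items": [
        {
            "family": "Trispace",
            "variants": VARIANTS,
            "files": {
                v: f"https://example.invalid/trispace-{v}.ttf" for v in VARIANTS
            },
        }
    ]
}


def api_font_files(response):
    """Return (family, variant, url) for every file the API lists."""
    records = []
    for item in response.get("items", []):
        for variant, url in item.get("files", {}).items():
            records.append((item["family"], variant, url))
    return records


records = api_font_files(response)
print(len(records))  # 8 files for Trispace, versus one file in the repo
```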

alemorozov Apr 19 '24 12:04

Thank you @alexeymorozov. It looks like the "Trispace" font has 5 variants in the archive/our formula:

  • "Trispace": Regular / Bold / Italics / Light / Thin / ExtraBold / ExtraLight / Medium / SemiBold
  • "TrispaceCondensed"
  • "TrispaceSemiCondensed"
  • "TrispaceExpanded"
  • "TrispaceSemiExpanded"

I wonder if Google Fonts simply split it into 5 fonts?

ronaldtse Apr 20 '24 00:04

@alexeymorozov regarding Trispace: all of the TTF files exist in the source repo: https://github.com/Etcetera-Type-Co/Trispace/tree/master/fonts/static/ttf

Screenshot 2024-04-20 at 15 00 54

In the Google Fonts repo, there seems to be a data accuracy issue:

  • https://github.com/google/fonts/blob/8c9a9848eaf9e3f7ccc7502fef84936ec1ca5650/ofl/trispace/METADATA.pb#L11

That line gives the post_script_name as Trispace-Thin, which is clearly untrue.

However, I downloaded the Google Fonts Trispace font: https://github.com/google/fonts/blob/main/ofl/trispace/Trispace%5Bwdth%2Cwght%5D.ttf

It does contain all the weights for the "regular" font, but it does not contain the Condensed/SemiCondensed/Expanded/SemiExpanded fonts:

Screenshot 2024-04-20 at 15 04 09

I think we should either create a PR to Google Fonts that adds these 4 fonts, or just ignore this problem for now.

ronaldtse Apr 20 '24 07:04

I've reported this problem here:

  • https://github.com/google/fonts/issues/7585

ronaldtse Apr 20 '24 07:04

@ronaldtse Thank you for the help 🙏 I will use this approach.

alemorozov Apr 23 '24 12:04

While I'm finishing with the import from the google/fonts repo, I propose we also do the import from the API.

The API contains 6129 font files, and at least 2945 of them are not in the repo. The repo contains 3377 font files.

Another thing: for the purpose of downloading from the API, checking the font files in the API may give additional accuracy, because files in the repo and in the API may differ even when they have the same style/weight. E.g. a file may be a variable font in the repo but a regular one in the API.

API data: https://www.googleapis.com/webfonts/v1/webfonts?key=API_KEY

I'm attaching files containing lists of fonts in the repo and the API: repofonts.json apifonts.json
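
The comparison behind those numbers can be sketched roughly as below. The record shape (`family` and `style` keys) and the sample entries are assumptions made for illustration; the attached JSON files may be structured differently. Note that this only compares keys, so it would not catch the case above where both sides list the same style/weight but the files themselves differ.

```python
def additional_in_api(repo_fonts, api_fonts):
    """Return API entries whose (family, style) pair is absent from the repo."""
    repo_keys = {(font["family"], font["style"]) for font in repo_fonts}
    return [
        font for font in api_fonts
        if (font["family"], font["style"]) not in repo_keys
    ]


# Made-up sample data; the real lists would be loaded from the JSON files.
repo_fonts = [
    {"family": "Example Sans", "style": "regular"},
]
api_fonts = [
    {"family": "Example Sans", "style": "regular"},
    {"family": "Example Sans", "style": "700"},
    {"family": "Example Serif", "style": "regular"},
]

extra = additional_in_api(repo_fonts, api_fonts)
print(len(extra), "font files exist only in the API list")
```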

alemorozov May 10 '24 14:05

Released v1.21.1 with the feature. The new formulas are on the v4 branch, so I propose we make it the default one.

After updating to the new fontist, users need to run fontist update for the feature to work.

alemorozov May 31 '24 20:05

@alexeymorozov please feel free to close this issue when complete, thanks!

ronaldtse May 31 '24 23:05