fontsource
Report font license in NPM package, not (only) MIT
The NPM package, e.g. for poppins (https://github.com/fontsource/fontsource/blob/master/packages/poppins/package.json), states the license is MIT and the author is "Lotus".
This might be correct for the NPM package itself, but it is not the license of the font. That would be OFL-1.1 and the author is Google.
I suggest modifying the information in package.json so that it accurately reports the license of the font itself. This would allow license extractors like https://github.com/pivotal/LicenseFinder to detect it properly. In my case, we include information extracted this way in the "About" box of an application to comply with the licenses.
So instead of having something like this:
"author": "Lotus <[email protected]>",
"license": "MIT",
"homepage": "https://github.com/fontsource/fontsource/tree/master/packages/poppins#readme",
It would either be the font information itself
"author": "Google Inc.",
"license": "OFL-1.1",
"homepage": "https://fonts.google.com/specimen/Poppins",
Or if you want to include both it could be something like
"author": "Google Inc. (https://fonts.google.com/specimen/Poppins)",
"contributors": ["Lotus <[email protected]>"],
"license": "(MIT AND OFL-1.1)",
"homepage": "https://github.com/fontsource/fontsource/tree/master/packages/poppins#readme",
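To illustrate why the `license` field matters to tooling, here is a minimal sketch in Python of how an extractor might read it. This is not how LicenseFinder is implemented; `license_ids` is a hypothetical helper, the two manifests are trimmed-down stand-ins for the snippets above, and the regex only handles simple SPDX expressions.

```python
import json
import re

# Operators that may appear in an SPDX license expression.
SPDX_OPERATORS = {"AND", "OR", "WITH"}


def license_ids(package_json_text: str) -> list[str]:
    """Return the license identifiers named in a package.json `license` field."""
    manifest = json.loads(package_json_text)
    expression = manifest.get("license", "")
    # Pull out identifier-like tokens (skipping spaces and parentheses),
    # then drop the operators.
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    return [t for t in tokens if t not in SPDX_OPERATORS]


# Hypothetical, shortened manifests mirroring the snippets above.
current = '{"license": "MIT"}'
proposed = '{"license": "(MIT AND OFL-1.1)"}'

print(license_ids(current))   # ['MIT']
print(license_ids(proposed))  # ['MIT', 'OFL-1.1']
```

With the current manifest, a tool like this never sees OFL-1.1 at all, which is the gap this issue is about.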
All actual licensing for the font specifically is shown via the README of each package.
In fact, I don't even know how to pull the specific license information for each font. I simply link to this page for any Google Font packages.
I understand where you're coming from and I'd be happy to make changes, provided we can pull the licensing info reliably, since Google Fonts licenses vary between OFL, Apache and Ubuntu.
I guess the hardest part will be figuring out what the license is for each font, right? Because it doesn't look like the font-specific directories contain that information.
Maybe a script that runs through https://fonts.google.com/attribution and matches it up with the directories here would be able to handle 99% of the fonts and the rest could be edited manually.
How often do you add new fonts? And how many of the fonts are actually the Google ones?
Scraping is finicky and subject to change if Google updates the design of that page, but also doable.
Since we automatically update and mirror all Google Fonts on a weekly basis, we'd need to scrape the page weekly as well in order to keep up to date with any additions or changes.
Any Google Fonts without an identifiable license can be flagged for manual review.
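The matching and flagging steps described above can be sketched as follows, in Python. This is only an illustration: it assumes the attribution page has already been scraped into a name-to-license mapping, that package directories are lowercase, hyphenated versions of the family name, and all names below (`slugify`, `match_licenses`, the sample fonts other than Poppins) are hypothetical.

```python
def slugify(family: str) -> str:
    """Turn a family name like 'Example Sans' into a directory-style id."""
    return "-".join(family.lower().split())


def match_licenses(attribution: dict[str, str], package_dirs: list[str]):
    """Pair each package directory with a license; collect the rest for review."""
    by_slug = {slugify(name): lic for name, lic in attribution.items()}
    matched = {d: by_slug[d] for d in package_dirs if d in by_slug}
    needs_review = [d for d in package_dirs if d not in by_slug]
    return matched, needs_review


# Made-up sample data standing in for the scraped attribution page.
attribution = {"Poppins": "OFL-1.1", "Example Sans": "Apache-2.0"}
package_dirs = ["poppins", "example-sans", "unlisted-font"]

matched, needs_review = match_licenses(attribution, package_dirs)
print(matched)       # {'poppins': 'OFL-1.1', 'example-sans': 'Apache-2.0'}
print(needs_review)  # ['unlisted-font']
```

Anything that ends up in `needs_review` would be the set to edit by hand.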
It won't be too much of an issue for non-Google fonts since we already store their license links in their metadata.json. Although it'll still need some manual work since we only store the link to the license and not the actual type (OFL, etc.) in order to match your earlier specification.
You can refer to FONTLIST.md to see all the non-Google fonts we have. It's a small percentage of the total count.
An alternate solution is using GitHub's API to gather the folder structure of the google/fonts repository. All fonts are stored in folders named after their respective licenses (apache, ofl, ufl) and we can work it out from there. I'd also maybe want to mirror their repository folder structure, which would require some minor Lerna changes.
It may be more convenient for us to do that since I do want to collect DESCRIPTION.en_us.html for #145.
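As a rough sketch of the folder-based idea, the license can be read straight off the first path segment. The folder names (apache, ofl, ufl) come from this thread; the identifiers they map to are assumptions that should be checked against the SPDX license list, the path list is a hand-written stand-in for a real tree listing, and `licenses_from_paths` is a hypothetical helper.

```python
# Top-level folder in google/fonts -> license identifier.
# The identifiers on the right are assumptions to verify against SPDX.
FOLDER_TO_LICENSE = {
    "apache": "Apache-2.0",
    "ofl": "OFL-1.1",
    "ufl": "Ubuntu-font-1.0",
}


def licenses_from_paths(paths: list[str]) -> dict[str, str]:
    """Map font directory names to license ids based on their parent folder."""
    result = {}
    for path in paths:
        parts = path.strip("/").split("/")
        # Only paths shaped like '<license-folder>/<font>/...' are relevant.
        if len(parts) >= 2 and parts[0] in FOLDER_TO_LICENSE:
            result[parts[1]] = FOLDER_TO_LICENSE[parts[0]]
    return result


# Stand-in for a tree listing of the repository.
paths = ["ofl/poppins/OFL.txt", "ofl/poppins/DESCRIPTION.en_us.html", "README.md"]
print(licenses_from_paths(paths))  # {'poppins': 'OFL-1.1'}
```

Directory names in that repository may not match the package ids here exactly, so a mapping step between the two would likely still be needed.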
That sounds like a great idea! Using the Google Fonts git repository directly will be much easier to parse and maintain. The license info is right there, and they even have the customized OFL.txt file that contains the font/author name, which can be placed as a file called LICENSE in the directory (see https://github.com/pivotal/LicenseFinder#a-plea-to-package-authors-and-maintainers).
Maybe consider not using the GitHub API but simply require it to be checked out in a certain path, or maybe git submodule it, depending on where you run that weekly scrape job.
Git submodules look super interesting (I hadn't heard of them before). They might be a little too complicated and messy though 👀
Changing the folder structure of the whole project to mirror the Google Fonts repo might be a little much though since we still have non-Google fonts hanging about too.
Git submodules are not the tool you want to reach for here. Take it from somebody who has used them daily for a decade or so (from whenever the feature was first added). They are good for some things but a pain to manage, and trying to place a giant monorepo that has its own management issues inside a submodule will be nothing but pain for no gain.
Honestly making API calls to get the most relevant files off GitHub will probably be the most robust approach (caveat: use authenticated requests or you will get bounced by the API limiter). On the other hand just having a variable that points to a path that is expected to be a checkout would probably be the easiest way to get started.
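For the "variable that points to a checkout" option, a minimal Python sketch might look like this. The environment variable name `GOOGLE_FONTS_CHECKOUT` and both helper functions are hypothetical; the only thing taken from the thread is that fonts live under the apache, ofl and ufl folders.

```python
import os
from pathlib import Path
from typing import Optional, Tuple

LICENSE_FOLDERS = ("apache", "ofl", "ufl")


def checkout_from_env(var: str = "GOOGLE_FONTS_CHECKOUT") -> Path:
    """Read the checkout location from an environment variable (hypothetical name)."""
    value = os.environ.get(var)
    if not value:
        raise RuntimeError(f"Set {var} to the path of a google/fonts checkout")
    path = Path(value)
    if not path.is_dir():
        raise RuntimeError(f"{var}={value} is not a directory")
    return path


def find_font_dir(font: str, checkout: Path) -> Optional[Tuple[str, Path]]:
    """Return (license_folder, font_directory) for a font, or None if absent."""
    for folder in LICENSE_FOLDERS:
        candidate = checkout / folder / font
        if candidate.is_dir():
            return folder, candidate
    return None
```

A weekly job could call `find_font_dir("poppins", checkout_from_env())` and fail loudly when the variable is unset, which keeps the script independent of API rate limits.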
Marked as breaking as we might need to modify metadata.json in a breaking way to accommodate this. Slated for the 5.x release.
I'm going to integrate this upstream into google-font-metadata and use the scrapers there to pull license info from https://fonts.google.com/attribution. I think that's the best solution to get individual author names and whatnot. It would then be easy to get license information added to all packages from there.
Upstream implementation complete in https://github.com/fontsource/google-font-metadata/pull/107 🎉
Released in v5 🎉