chore(scripts/termux_step_install_license): total refactor
Part of:
- #20858
This PR is a total refactor of scripts/build/termux_step_install_license.sh.
The goal was to eliminate shellcheck warnings and reduce jank.
I'd like some clarification on this check in the current version. https://github.com/termux/termux-packages/blob/eeca771cecf05365533399c1bfeb264321548d6a/scripts/build/termux_step_install_license.sh#L81-L85
I can't see what it is meant to accomplish.
It seems pretty safe to say that it is intended to loop over all licenses installed for the package.
But the if condition doesn't make any sense.
- 1.) The
*is quoted, and thus not a wildcard. - 2.) The actually check is if
$LICENSEis equal to the path where the package licenses are stored. Which would always be true, if it weren't for the mistakenly quoted*, making it never true unless there is a license calledLICENSE*which will never be the case.
Since I don't see any way for that check to ever fail, I removed it. If we do wanna account for all of the license files, I can rewrite the check. Though that accounting is already done by the rest of the script above.
All initial smoke tests seem favorable.
I've rebuilt most of the root repo locally with the refactored license installation step changes.
The check is to verify so that there are files in $TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME. If no licenses were installed for some reason, for example if TERMUX_PKG_LICENSE is empty, then LICENSE* will not expand and remain LICENSE*, and the check catches that.
We can probably check this in a more intuitive way, maybe with find.
find would definitely be more intuitive.
Looks fine otherwise from a quick look.
However, note that debian does not use LICENSE as file names, it uses (a single) copyright, so would be better to shift to that.
- https://www.debian.org/releases/stable/i386/ch01s08.en.html
Additionally, lot of packages use the machine-readable debian/copyright file on debian, including ruby, zsh, etc. It puts all the license texts under a single file. You can find those packages on the system with grep -r -l -F "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/" /usr/share/doc | sort. I personally don't like the part about putting all the license texts under the same file as that's too verbose and harder to understand, so I break the spec and use License: [<license_name>](<path_to_license_text_file>) instead of the License: <license_name>\n<license_text>, hopefully a future spec version adds support for that.
- https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
- https://lists.debian.org/debian-project/2010/08/msg00269.html
- https://wiki.debian.org/Proposals/CopyrightFormat
- http://ftp.debian.org/debian/pool/main/
- https://askubuntu.com/a/620069
Well if you're in the mood to break APIs.
I did briefly think about replacing the $COUNTER suffix with the $LICENSE identifier.
So it would be LICENSE.MIT and LICENSE.HPND,
but I dropped that idea because of some of the less filename friendly license idenifiers.
And maintaining API compatibility with the previous version of the license install step.
Looks fine otherwise from a quick look.
However, note that debian does not use
LICENSEas file names, it uses (a single)copyright, so would be better to shift to that.
- https://www.debian.org/releases/stable/i386/ch01s08.en.html
Additionally, lot of packages use the machine-readable
debian/copyrightfile on debian, includingruby,zsh, etc. It puts all the license texts under a single file. You can find those packages on the system withgrep -r -l -F "https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/" /usr/share/doc | sort. I personally don't like the part about putting all the license texts under the same file as that's too verbose and harder to understand, so I break the spec and useLicense: [<license_name>](<path_to_license_text_file>)instead of theLicense: <license_name>\n<license_text>, hopefully a future spec version adds support for that.
- https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
- https://lists.debian.org/debian-project/2010/08/msg00269.html
- https://wiki.debian.org/Proposals/CopyrightFormat
- http://ftp.debian.org/debian/pool/main/
- https://askubuntu.com/a/620069
I've had a read through the linked resources. And while I think we might wanna consider that, I think it's out of scope for this PR. I'm also fairly sure it would necessitate rebuilds of all packages, no?
For checking if license file was added.
license_files="$(find "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 -type f -name "LICENSE*")"
if [ -z "$license_files" ]; then
# error out
fi
For mulitple filenames, you can use something like \( -name "LICENSE*" -o -name "copyright*" \).
For checking if license file was added.
license_files="$(find "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 -type f -name "LICENSE*")" if [ -z "$license_files" ]; then # error out fiFor mulitple filenames, you can use something like
\( -name "LICENSE*" -o -name "copyright*" \).
Should that be a find -L?
I guess we don't really care if its a symlink or not, so no?
I don't suggest using the machine-readable debian/copyright format right now, I just wanted to notify that the standard exists, and that it solves quite a few issues, especially in relation to which source files are under what license and the format being machine readable.
However, I do suggest that we use copyright and copyright.<counter> format right now as that is the standard which wouldn't be hard to follow and would provide consistency with tools processing package license files. Packages can get rebuilt whenever they are.
However, I do suggest that we use
copyrightandcopyright.<counter>format right now as that is the standard which wouldn't be hard to follow and would provide consistency with tools processing package license files. Packages can get rebuilt whenever they are.
Yep, that's definitely far easier than implementing the whole standard right away. Done locally now.
Packages can get rebuilt whenever they are.
Eventual consistency, eh? :P
Should that be a find -L?
Yeah, best use \( -type f -o -type l \) instead.
Thanks.
Eventual consistency, eh? :P
Well, if you wanna rebuild all 2000+ packages, who am I to say no, have fun with failed builds and missing source urls :p
Well, if you wanna rebuild all 2000+ packages, who am I to say no, have fun with failed builds and missing source urls :p
CI's been flakey lately anyway. I sure as hell am not poking it with the "full repo rebuild" stick.
Alright, that's:
copyright.${counter}as license name- adding
copyrightto the search list - turning the license filepath into a variable
- accursed hashmap magic
If you still want the find check for making sure all the licenses are present and accounted for I'd appreciate it if you could provide me a cleaned up version of how you want that to work.
All other threads should be resolved now.
So the find check is the last blocker here.
# Will match regular files and symlinks to regular files
$ find -L . -maxdepth 1 -type f
# Will match regular files, symlinks to regular files and broken symlinks
$ find -L . -maxdepth 1 \( -type f -o -type l \)
# Will match regular files, symlinks to regular files, symlinks to directories and broken symlinks
$ find . -maxdepth 1 \( -type f -o -type l \)
Based on above, best use following. It will not match broken symlinks to license files, assuming that's what we want.
license_files="$(find -L "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 -type f -name "copyright*")"
if [ -z "$license_files" ]; then
# error out
fi
Based on above, best use following. It will not match broken symlinks to license files, assuming that's what we want.
license_files="$(find -L "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 -type f -name "copyright*")" if [ -z "$license_files" ]; then # error out fi
Yep I think that tracks with what we are actually trying to check.
I think license file symlinks may not be valid during build time, let me check.
Yup, they wouldn't exist, use following with \( -type f -o -type l \).
license_files="$(find -L "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 \( -type f -o -type l \) -name "copyright*")"
if [ -z "$license_files" ]; then
# error out
fi
By symlinks not being valid, I meant that termux-licenses wouldn't exist at ../../../../share/LICENSES/${LICENSE}.txt at build time and only in $TERMUX_SCRIPTDIR/packages/termux-licenses/LICENSES/${LICENSE}.txt. They would be valid on the device though.
Going with
local license_files
license_files="$(find -L "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 \( -type f -o -type l \) -name "copyright*")"
[[ -n "$license_files" ]] || {
termux_error_exit "No LICENSE file was installed for $TERMUX_PKG_NAME"
}
which is semantically identical.
which is semantically identical.
How dare you!
How dare you!
I think the || more clearly identifies this as an exception path.
Yeah, yeah, it's fine.
This would only catch all license files missing. Am I correct in that assumption?
Just wondering if we wanna do something about individual missing licenses.
We can probably co-opt the $COUNTER for checking if we have as many licenses as expected.
Although, the only way I can see that happening is if a specified license file is no longer present in the source repo.
In which case the cp of the file would fail.
Thus catching the issue earlier.
Though I think that would silently error.
The cp would error with a message along the lines of:
cp: cannot stat 'License.no.such.file': No such file or directory
Which works just fine.
All good from me then.
Yes, check would pass if at least one license file exists.
If you wanna ensure no missing licenses, in case, a specific loop never ran to fail for cp/ln, etc, then use
local license_list
if [[ -n "${TERMUX_PKG_LICENSE_FILE}" ]]; then
license_list="$TERMUX_PKG_LICENSE_FILE"
else
license_list="$TERMUX_PKG_LICENSE"
fi
...
license_files="$(find -L "$TERMUX_PREFIX/share/doc/$TERMUX_PKG_NAME" -maxdepth 1 \( -type f -o -type l \) -name "copyright*")"
if [ -z "$license_files" ]; then
# error out
fi
if [ "$(echo "$license_files" | wc -l)" != "$(echo "$license_list" | sed -E -e 's/[ \t]//g' -e 's/,/\n/g' | wc -l)" ]; then
# error out
fi
I think I'll just be satisfied with cp's error message if it can't find the license file in the source repo.
And ln is only invoked for "known generic licenses",
so it should be safe to assume those will not be missing.