TileDB
TileDB copied to clipboard
Single release CMake file that contains all required links/hashes
This PR aims to autogenerate cmake file containing all required hash/url combinations so that other repositories can simply pull this file and forget about changing all url/hash occurences in their code. The output assets look like this now:
https://github.com/dudoslav/TileDB/releases/tag/t01
Please let me know if the produced CMake file is ok, or should be changed to a different format.
TYPE: IMPROVEMENT DESC: Look above
This pull request has been linked to Shortcut Story #17775: Generate cmake file with artifact URLs and checksums for each release.
Love this. Having the information in one place (by release) is very good. But may I raise my hand and ask for a (complementary) non-cmake format? Maybe JSON? Maybe key:value pairs? R does use cmake and while I could shimmy it (with help) a simpler approach would be good too.
I have no problem also adding a json format, however, do you really need release build outside cmake?
I can read JSON from CMAKE without any issues since I can use string(JSON ...)
functions
Yes, good point @eddelbuettel
I have no problem also adding a json format, however, do you really need release build outside cmake?
Yes, these builds are used by TileDB-R outside of CMake.
To be perfectly plain, I would prefer non-JSON and reading JSON happens to not be batteries included. I currently use the 'Debian Control File' format R itself uses in its DESCRIPTION file, for that we have a ready-made parser included (read.dcf()
that is):
https://github.com/TileDB-Inc/TileDB-R/blob/e17563a554921d952ac473d5e49b65888a6a035a/tools/tiledbVersion.txt#L1-L2
Of course, other formats work but at the cost of added dependencies. So JSON, YAML, TOML, ... are all equally (in)convenient with the need for other library.
do you really need release build outside cmake?
Yeah. Let's plan on a quick huddle this week and I show you a quick demo and more resources. In essence, R is driven by e.g. R CMD INSTALL nameOfArchve_x.y.z.tar.gz
and invokes just make
and other standard tools (via pre-made added Makefile
snippets). One can farm out to cmake
and other callable tools but it counts as extra dependency. We currently only depend on cmake
if a build from (full, ie TileDB) source is requested else R and shell etc lead to the download of an artifact.
Ok so my next plan is to:
- Create CSV file of all releases + source
- Change sha1 to sha256 if its ok with everyone (because it might create more problems downstream)
- Modify cmake to use this csv file
- Divide cmake function into download_prebuilt and download_source
This might be a format that could be compatible with both R and CMake:
(base) dudoslav@beetle:~$ cat rellist.csv
os,arch,url,sha,noavx2
LINUX,X86_64,https://SOMEURL.com,da39a3ee5e6b4b0d3255bfef95601890afd80709,
LINUX,X86_64,https://SOMEURL.com,da39a3ee5e6b4b0d3255bfef95601890afd80709,NOAVX2
WINDOWS,X86_64,https://SOMEURL.com,da39a3ee5e6b4b0d3255bfef95601890afd80709,
DESC: Look above
The item in the DESC
field will be included in the release notes. You should set it to something meaningful like Add redistributable CMake file that downloads prebuilt binaries from GitHub Releases
.
Here's how I am imagining this. Each release will have a JSON (or another format like CSV but I would prefer JSON IMHO) file named say artifacts.json
that will contain information about the available release artifacts. Here's an example for 2.19.0:
{
// We could add this to help non-vcpkg builds from source.
"git-ref": "fa30a88abc5504ed8387e25937744afcd3802232",
"artifacts": {
"x64-windows": {
"filename": "tiledb-windows-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-linux": {
"filename": "tiledb-linux-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-linux-noavx2": {
"filename": "tiledb-linux-x86_64-noavx2-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"x64-macos": {
"filename": "tiledb-macos-x86_64-2.19.0-fa30a88a.zip",
"sha512": "…"
},
"arm64-macos": {
"filename": "tiledb-macos-arm64-2.19.0-fa30a88a.zip",
"sha512": "…"
}
}
}
By reading from the JSON file, the CMake script can be version-agnostic and be checked in once for every downstream repository. Here's how the script could be used:
include(DownloadPrebuiltTileDB)
# Downloads prebuilt binaries of TileDB from GitHub
# Releases and calls find_package to import them.
find_prebuilt_tiledb(
# The version of TileDB to use. Required.
VERSION 2.19.0
# The hash of the artifacts.json file. Optional, will warn if not
# specified, maybe suggesting to at least enable CMAKE_TLS_VERIFY.
MANIFEST_SHA512 …
# The name of the release artifact to download.
# Optional, will try to autodetect if not specified.
ARTIFACT_NAME x64-linux-noavx2
# Flags that will also be passed to the find_package call.
REQUIRED QUIET
)
target_link_libraries(my_project PRIVATE TileDB::tiledb)
Regarding building from source I am wondering if it would be a better idea to have this CMake script exclusively deal with downloading prebuilt binaries, and direct users that want more customizability to vcpkg (after we create a tiledb
port, which can be done at any time). Scoping this out will significantly simplify the script.
@teo-tsirpanis I really like your suggestion, but there is a requirement from the R team for a format that could be easily included without dependencies. So JSON is somewhat out of question. Other than that, the CMake script will contain two functions:
fetch_release_list
that downloads and parses the release list (rellist.csv
)
download_source_tiledb
that downloads and build source tiledb
download_prebuilt_tiledb
that downloads prebuilt tiledb
The question is that if we don't need download_source_tiledb
then I could get rid of fetch_release_list
function. However, are we really sure we don't need that source tiledb distribution?
The R build would need to load another package just to parse JSON, there is no built-in reader. Csv and alike work fine.
Another idea, because we are not short those few bytes, would be to store both sha1 and sha256 checksums.
Also, I am still not sure where to put this download logic, since it will not change between releases. Only the rellist changes. Maybe if I would hardcode the latest version into it it could be also distributed as release asset otherwise I am not sure what to do with it
Ok, so what do you guys think about the latest changes and mostly about the DownloadPrebuiltTileDB.cmake
file? I removed the source distribution logic as requested and add parameters to that CMake script.
Edit 1.: As always the produced CSV can be found here: https://github.com/dudoslav/TileDB/releases/download/t01/rellist.csv
Edit 2: Fixed link
Edit 2: Change link directly to CSV
As always the produced CSV can be found here:
That's a cmake file, not a csv file.
Edit two weeks later: Not sure what I was up to in that comment. Hm.
Revisiting my comment from two weeks ago: Hm. I must have been confused or in a rush or both. rellist.csv
is very much a csv as discussed:
edd@rob:~/Downloads$ cat rellist.csv
os,arch,url,sha,noavx2
LINUX,X86_64,https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-linux-x86_64-noavx2-t01-3ecfe85.tar.gz,298c65001b39629974613611572bf9033755431c,NOAVX2
LINUX,X86_64,https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-linux-x86_64-t01-3ecfe85.tar.gz,90b1a8c25998422a5b8ec1cea12c86d97ff14a07,
MACOS,ARM64,https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-macos-arm64-t01-3ecfe85.tar.gz,6813c0b469680630c09545359531b8140ede3b61,
MACOS,X86_64,https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-macos-x86_64-t01-3ecfe85.tar.gz,213fa27cbef480c528e7128928a61f5391d49fc3,
WINDOWS,X86_64,https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-windows-x86_64-t01-3ecfe85.zip,e6ba34bafa07738ccc0a3da14e106bf97e51a82d,
source,source,https://api.github.com/repos/dudoslav/TileDB/zipball/t01,577c693aa189569dab6ddc51f74a98d7bb459987,source
edd@rob:~/Downloads$
os,arch,url,sha,noavx2
I think a CSV schema of id,url,sha
where id
can have values such windows-x64
, linux-x64-noavx2
or source
is more scalable.
And won't we use SHA256 or greater?
I like column headers 'os', 'arch', 'url' and 'sha(1)'. We could fold the noavx2 into the arch column making it x86_64 and x86_64_noavx as a specialisation but it is .. arguably easier to have it split in an extra column even of the csv file is 'ragged':
> D <- fread("rellist.csv") # from data.table which pretty-prints as below
> D
os arch url sha noavx2
<char> <char> <char> <char> <char>
1: LINUX X86_64 https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-linux-x86_64-noavx2-t01-3ecfe85.tar.gz 298c65001b39629974613611572bf9033755431c NOAVX2
2: LINUX X86_64 https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-linux-x86_64-t01-3ecfe85.tar.gz 90b1a8c25998422a5b8ec1cea12c86d97ff14a07
3: MACOS ARM64 https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-macos-arm64-t01-3ecfe85.tar.gz 6813c0b469680630c09545359531b8140ede3b61
4: MACOS X86_64 https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-macos-x86_64-t01-3ecfe85.tar.gz 213fa27cbef480c528e7128928a61f5391d49fc3
5: WINDOWS X86_64 https://api.github.com/repos/dudoslav/TileDB/releases/136678694/assets/tiledb-windows-x86_64-t01-3ecfe85.zip e6ba34bafa07738ccc0a3da14e106bf97e51a82d
6: source source https://api.github.com/repos/dudoslav/TileDB/zipball/t01 577c693aa189569dab6ddc51f74a98d7bb459987 source
>
Other columns (sha356, md5, ...) can easily be added.
PS Not sure about the last line though.
UPDATE: I had to rebase this branch on top of another branch that add CPACK releases, since those ones actually generate sha256 hashes. This means that this PR should be merged after that change: https://github.com/TileDB-Inc/TileDB/pull/4694
Here are proposed release assets: https://github.com/dudoslav/TileDB/releases/tag/t03
I changed download logic to use FetchContent and add ReleaseListGeneration into Release workflow. This is the resulting artifacts:
https://github.com/dudoslav/TileDB/releases/tag/t09
And this is the pipeline that produced it:
https://github.com/dudoslav/TileDB/actions/runs/8110756605
I also downloaded DownloadPrebuiltTileDB.cmake
and tested in local dummy project.