nixos-search
Update cron-flakes.yml with new specified repos
This PR adds a Rust command, flake-repos, and a workflow, update-repos.yml, so that cron-flakes.yml and flake-info search for up-to-date flakes. This issue has been worked on collaboratively by @bryanhonof and myself as part of the Summer of Nix, and was offered to us by @ysndr.
flake-repos takes three arguments. The first is an input TOML file (the repos.toml file in its directory) that contains a list of parent repos (e.g. NixOS, ngi-nix, tweag, etc.) and their repo type (github, gitlab). The second is an output directory (for now the flakes directory) where a TOML file is created for each repo in the input file, containing a list of child repos that have flakes in them. The last argument is a workflow file (i.e. cron-flakes.yml) to be updated with the new repos stored in the output path. At the moment flake-repos uses the Rust crate reqwest to build a GitHub API query for fetching the repos, though it wouldn't be hard to add queries for other repo types as well.
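To make the argument order concrete, here is a minimal sketch of how the three positional arguments could be read using only the standard library. The `Args` struct, its field names, and `parse_args` are hypothetical names chosen for illustration; the actual flake-repos code may parse its arguments differently.

```rust
use std::path::PathBuf;

/// The three positional arguments described above.
/// Struct and field names are illustrative, not taken from the PR.
#[derive(Debug, PartialEq)]
pub struct Args {
    /// Input TOML file listing parent repos and their repo type.
    pub input: PathBuf,
    /// Directory that receives one TOML file per parent repo.
    pub output_dir: PathBuf,
    /// Workflow file to update with the discovered repos.
    pub workflow: PathBuf,
}

/// Parses exactly three positional arguments, in the order
/// input file, output directory, workflow file.
pub fn parse_args<I>(args: I) -> Result<Args, String>
where
    I: IntoIterator<Item = String>,
{
    let mut it = args.into_iter();
    let input = it.next().ok_or("missing input TOML file")?;
    let output_dir = it.next().ok_or("missing output directory")?;
    let workflow = it.next().ok_or("missing workflow file")?;
    if it.next().is_some() {
        return Err("unexpected extra argument".to_string());
    }
    Ok(Args {
        input: PathBuf::from(input),
        output_dir: PathBuf::from(output_dir),
        workflow: PathBuf::from(workflow),
    })
}
```

In a binary this would be called as `parse_args(std::env::args().skip(1))`, skipping the program name.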
update-repos.yml checks out nixos-search, runs flake-repos, and then pushes the changes to the flakes directory and the cron-flakes.yml workflow to the main branch of nixos-search.
So far I have been able to have update-repos.yml successfully run flake-repos and update the fork, and have cron-flakes.yml output some of the flakes successfully (although there seem to be issues with flake-info at the moment that cause errors in the jobset, canceling cron-flakes.yml). There are some issues that still need to be clarified or mentioned.
Issues
- Currently the on field for update-repos.yml is set to workflow_dispatch (mostly for testing purposes). How should it actually be handled?
- For update-repos.yml I needed to create my own PAT secret that had workflow permissions, since flake-repos directly modifies the cron-flakes.yml workflow. Is it reasonable to assume that this can be added, or that the existing token can be updated to include this permission? Or should there be a different approach to updating the files and directories altogether?
- Because of how the serde_yaml crate deserializes YAML values to Strings, whenever flake-repos updates cron-flakes.yml's group field the formatting of the file gets messed up a bit. It's still functional, but the spacing and newlines are affected.
- Should there be an additional argument accepting a query or a list of queries in order to allow searching for different repo types? Or should each query for each repo type be hard-coded like the github one is currently?
- When multiple repo types are implemented, if two of the same repos but of different repo types are found, should one be considered a duplicate and be removed? If so, which repo type would receive priority to stay?
There are certainly other issues, so if you have questions or reservations about anything (name choices, feature implementations, etc.) please feel free to bring them up so that they can be addressed and fixed.
@ysndr I see that you approved the changes, would you like me to change it from a draft to a proper PR to merge? Or are you still looking over the changes in the PR?
Hey @sophrosyne97 Coming in a bit late.. yes, please go ahead and undraft this PR.
No problem, will do! Just keep in mind some of the issues I listed, specifically the GitHub Action still being on workflow_dispatch and the PAT secret needed for workflow permissions.
Currently the on field for update-repos.yml is set to workflow_dispatch (mostly for testing purposes), how should it actually be handled?
It's OK for it to be manual for now, but additionally having it run as a weekly cron job might be even better.
For update-repos.yml I needed to create my own PAT secret that had workflow permissions, since flake-repos directly modifies the cron-flakes.yml workflow. Or should there be a different approach to updating the files and directories altogether?
Perhaps creating automated PRs would be preferable, though that might require its own additional permission.
Because of how the serde_yaml crate deserializes YAML values to Strings, whenever flake-repos updates cron-flakes.yml's group field the formatting of the file gets messed up a bit. It's still functional, but the spacing and newlines are affected.
We could write to a separate action just for automated flakes. This way we could keep our preferred manual spacing in the manual file.
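One way to picture the separate-file idea is to render the generated group list directly as text instead of round-tripping an existing workflow through serde_yaml, so the hand-maintained file is never rewritten. The sketch below is an assumption about how that could look: `render_group_list` is a hypothetical helper, the indentation and quoting are illustrative, and it assumes group names contain no quotes or other characters that need YAML escaping.

```rust
/// Renders group names as a YAML block sequence with a fixed indent.
/// Hypothetical helper: assumes names need no YAML escaping.
pub fn render_group_list(groups: &[&str], indent: usize) -> String {
    let pad = " ".repeat(indent);
    groups
        .iter()
        // One `- "name"` entry per line, each ending in a newline.
        .map(|g| format!("{}- \"{}\"\n", pad, g))
        .collect()
}
```

The returned string could then be spliced into a template for the generated workflow, leaving the spacing of the manual file untouched.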
When multiple repo types are implemented, if two of the same repos but of different repo types are found, should one be considered a duplicate and be removed? If so, which repo type would receive priority to stay?
Do you mean forks or repos mirrored on different platforms?
@ysndr I think what @sophrosyne97 is referring to is the following. Take two repositories that hold code for the package imaginary and differ in no way except that one is hosted on GitHub and the other on GitLab.
# flakes/imaginary.toml
[[sources]]
type = "gitlab"
owner = "nix-community"
repo = "imaginary"
[[sources]]
type = "github"
owner = "nix-community"
repo = "imaginary"
Which one should take "priority" over the other?
My opinion on this issue is that they should be considered two different entries. However, this would cause the frontend to display the item twice. Perhaps adding an option to filter on where a flake comes from would make sense.
That's what I think as well. We cannot tell (efficiently) whether it is the exact same thing. However, we should clearly mark that the origins are different, or merge them on the frontend to avoid duplication.
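As a rough illustration of keeping both entries while still letting a frontend merge or label them, the sketch below groups sources by owner and repo and records every origin without dropping any entry. The `Source` struct mirrors the keys in the TOML example above (`kind` stands in for the `type` key, which is a reserved word in Rust); `group_by_origin` is a hypothetical name, not something in flake-repos or flake-info.

```rust
use std::collections::BTreeMap;

/// One `[[sources]]` entry; `kind` corresponds to the TOML `type` key.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Source {
    pub kind: String,
    pub owner: String,
    pub repo: String,
}

/// Groups sources by (owner, repo) and lists the distinct origins of each.
/// Nothing is removed: a repo hosted on two platforms keeps both origins.
pub fn group_by_origin(sources: &[Source]) -> BTreeMap<(String, String), Vec<String>> {
    let mut groups: BTreeMap<(String, String), Vec<String>> = BTreeMap::new();
    for s in sources {
        let kinds = groups
            .entry((s.owner.clone(), s.repo.clone()))
            .or_default();
        // Skip exact repeats of the same origin for the same repo.
        if !kinds.contains(&s.kind) {
            kinds.push(s.kind.clone());
        }
    }
    groups
}
```

With the two imaginary entries above, this yields a single group for nix-community/imaginary with both gitlab and github listed as origins, which a frontend could show as one item with two origin labels or as a filter.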
What is the status of this PR? /cc @ysndr