bandersnatch
Can --dry-run be added to the mirror function?
It would be useful to see a manifest of what packages are going to be downloaded before you download them. Could this be added as a feature?
Sure, I’d accept that PR.
Can you elaborate on what data and statistics you’d want printed out? I feel we could:
- list all files
- give a count of how many packages and files would be downloaded
- calculate the size
^^ To do a lot of this, though, we’d have to download the JSON metadata of the changed (or all) packages.
Example: if we detected a full sync, would you want the dry run to just say that a full sync would occur, or to do a metadata run to try and tell you the file count + size of the downloads / space required on disk (this wouldn’t be fast)?
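To make the cost of that metadata run concrete, here is a rough sketch of how a dry run could build the manifest, file count and size estimate from per-project JSON metadata. It assumes metadata shaped like the PyPI JSON API response (a `releases` mapping of version to file entries carrying `filename` and `size`); `summarize_metadata` and the sample data are hypothetical and not part of bandersnatch.

```python
def summarize_metadata(metadata_by_project):
    """Build a per-project file manifest plus overall totals.

    metadata_by_project maps a project name to its already-parsed JSON
    metadata (assumed shape: {"releases": {version: [file entries]}}).
    """
    manifest = {}
    total_files = 0
    total_bytes = 0
    for project, metadata in sorted(metadata_by_project.items()):
        files = []
        for release_files in metadata.get("releases", {}).values():
            for entry in release_files:
                # Treat a missing or null size as 0 so the estimate still works.
                files.append((entry["filename"], entry.get("size") or 0))
        files.sort()
        manifest[project] = files
        total_files += len(files)
        total_bytes += sum(size for _, size in files)
    return manifest, total_files, total_bytes


# Made-up sample metadata, for illustration only.
sample = {
    "demo-a": {
        "releases": {
            "1.0": [{"filename": "demo_a-1.0.tar.gz", "size": 1000}],
            "1.1": [
                {"filename": "demo_a-1.1.tar.gz", "size": 1200},
                {"filename": "demo_a-1.1-py3-none-any.whl", "size": 800},
            ],
        }
    },
    "demo-b": {"releases": {"0.1": [{"filename": "demo_b-0.1.tar.gz", "size": 500}]}},
}

manifest, total_files, total_bytes = summarize_metadata(sample)
for project, files in manifest.items():
    print(project)
    for filename, size in files:
        print(f"  {filename} ({size} bytes)")
print(f"{len(manifest)} projects, {total_files} files, {total_bytes} bytes")
```

The summary itself is cheap; the slow part is fetching one metadata document per project, which is why a full-sync dry run would not be fast.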
A list of all the files organised into projects, a file count and size estimate would be an excellent help.
A way to confirm what is being filtered, allowlisted or blocklisted, and by which rule in bandersnatch.conf, before committing to the download of ~7 TB of PyPI would really help with the creation of offline repos.
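As an illustration of the kind of report I mean, here is a simplified sketch that reads `[allowlist]` and `[blocklist]` sections with a `packages` list and says which rule decides each project. It only matches plain project names, and it assumes the blocklist wins over the allowlist; `explain_filters` is a hypothetical helper and this is not how bandersnatch's filter plugins are actually implemented.

```python
import configparser

# Sample configuration text, for illustration only.
SAMPLE_CONF = """
[plugins]
enabled =
    allowlist_project
    blocklist_project

[allowlist]
packages =
    requests
    black

[blocklist]
packages =
    black
"""


def _names(config, section):
    """Return the lowercased names listed under `packages` in a section."""
    if not config.has_option(section, "packages"):
        return set()
    lines = config.get(section, "packages").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def explain_filters(conf_text, projects):
    """Map each project to (decision, rule) under the simplified rules above."""
    config = configparser.ConfigParser()
    config.read_string(conf_text)
    allow = _names(config, "allowlist")
    block = _names(config, "blocklist")
    decisions = {}
    for project in projects:
        name = project.lower()
        if name in block:
            # Assumption: a blocklist entry takes precedence.
            decisions[project] = ("skip", "listed in [blocklist] packages")
        elif allow and name not in allow:
            decisions[project] = ("skip", "not listed in [allowlist] packages")
        elif allow:
            decisions[project] = ("mirror", "listed in [allowlist] packages")
        else:
            decisions[project] = ("mirror", "no filter rule matched")
    return decisions


decisions = explain_filters(SAMPLE_CONF, ["requests", "black", "numpy"])
for project, (decision, rule) in decisions.items():
    print(f"{project}: {decision} ({rule})")
```

Even a report this basic, printed before any download starts, would catch most configuration mistakes.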
Is it possible to tie the existing "diff-file" module into the dry-run command, so that mirror --dry-run gives you just the diff file at the end and not the packages, with a nice summary of the total size and of what needs to be updated if you're working from a previous diff file?