john
                                
                                 john copied to clipboard
                                
                                    john copied to clipboard
                            
                            
                            
                        Fix requirements.txt
Fixes #5474
I'm not into Python. I'd appreciate to hear from someone who is, @exploide maybe? Thank you!
Ok, I see multiple issues here.
- protobuffwith two f's is indeed incorrect. The correct package is named- protobufand the one with two f's doesn't even exist. However it could be added afterwards, which would result in a typosquatting attack if it contains malicious content. So this is directly a security vulnerability.
- Pinning the version down to an older version forever is probably not what we want. We should aim to make the script compatible with the latest release of protobuf.
- A top-level requirements.txtin the john repo is probably not appropriate. This particular dependency only affects a single script. And since the scripts are more or less stand-alone and independent, it could as well be the case that another script just requires a different version of protobuf (hypothetical example).
- There is a bunch of other dependencies needed from within other scripts. And these are all not mentioned in this version of requirements.txt. It is very specific for the script that requires protobuf.
So I see two solutions: Fix the incompatibility in the script to make it work with latest protobuf. Or if this requires too much effort for the moment, include the version constraint in the pip install instruction suggested by the import exception handling instead.
A top-level requirements.txt does not feel right for a project like john. But I'm not the one to decide.
Good catch, I was moving too fast and made that typo. Fixed just now.
I would strongly suggest adding this requirements.txt file with whatever versions are required to make it work right now. Then, you can upgrade dependencies in the future if that's a desire.
I also just added all the other dependencies to the requirements.txt file which are mentioned via a pip install printed message. I didn't put versions on them, but they should all have pinned versions when tested.
Thank you for the review @exploide! I'm not sure what is best to do here. @RecRanger argues for this approach further in #5474.
@RecRanger Thank you for the fix and additions. When fixing an error in a commit within a pending PR, we normally amend the commit and force-push - not add another commit. We also prefix commit titles with subsystem name whenever relevant (would be "requirements.txt: " starting with the second commit here). In this case, though, I suggest merging all 3 commits into one. We can do that at PR merge time by using GitHub's "squash and merge" feature, although this will make GitHub appear as the committer in git history.
@RecRanger Where does the dependency on pycrypto come from? Is it indirect? Do we need to list indirect dependencies? (This may be fine. I am just asking.)
I personally always use Squash-and-Merge when working in semi-anon/informal teams (e.g., open source projects) to make reverting changes easier. Will do better next time though!
I got the library list with a grep 'pip install' search. The pycrypto library came from:
- telegram2john.py
- DPAPImk2john.py
Indirect dependencies should not be listed. Libraries are assumed to have valid pyproject.toml/setuptools configurations which indicate which dependencies to install.
From #5474
I don't see anything wrong with continuing to use the older version of Protobuf (which is tested, validated, etc.). Upgrading the dependency seems like a pointless endeavor for no gain.
In principle, I would always recommend to make a software compatible with the latest version of a dependency. For example, it might happen that a vulnerability becomes known for the older version or that there is an unpleasant bug. Furthermore, if someone decides to package this for a Linux distribution, they will fail because they cannot ship an older version of a Python package than the latest one already packaged for the distribution.
The error message explicitly says that the current Python code uses features that are deprecated in protobuf>3.20. We'd likely have to regenerate the Python protobuf files, iirc (which is a huge pain).
But in this case, I agree. I don't have any experience with protobuf file generation and if it is a huge pain, then we should not enforce it for the moment. (But welcome it when someone likes to do the upgrade.)
I also just added all the other dependencies to the requirements.txt file which are mentioned via a pip install printed message. I didn't put versions on them, but they should all have pinned versions when tested.
There are different points of view on this topic. Personally, I don't like pinning version numbers as long as there is no process of keeping them up to date. Otherwise you will soon have a requirements.txt which just contains a bunch of horribly outdated package versions. Which I don't like for reasons as explained above. I would only introduce version constraints where they are technically necessary. However, I know that there are different approaches which emphasize the desire of having a tested set of working dependency versions.
I already tried to explain that a combined global requirements.txt doesn't feel intuitive for me, because this is not a Python project at all. john is a C software and then there are just a bunch of standalone Python scripts as auxiliary tools. And not only Python, there is also Perl and maybe some other stuff.
I don't see a big advantage of a requirements.txt here. But I believe you see an advantage. So it is fine for me to include a requirements.txt but again, I'm not the one to decide.
Since the file is only useful for the scripts in run/ I think requirements.txt should also be moved to run/. As I mentioned, john itself is not a Python project. But this is not really important.
Where does the dependency on pycrypto come from?
Indeed, pycrypto is a dependency. However, it is insecure and deprecated and should no longer be used. There is pycryptodome as a drop-in replacement. Is has the same API. So please list this as a requirement instead. I will conduct a separate PR soon, which also adapts the error message within the scripts.
Sorry for this lengthy answer. Just wanted to make my points clear. Whether to include a requirements.txt or not is up to @solardiz
Thank you guys for the discussion. I had actually thought of us adding a requirements.txt and related documentation before, when I saw that the btcrecover has that, although sure enough they're primarily a Python project and we're not. Incidentally, IIRC our two pre-generated protobuf files came from btcrecover, so maybe we should look there (at all forks of it) for any updates.
Since the file is only useful for the scripts in
run/I thinkrequirements.txtshould also be moved torun/. As I mentioned, john itself is not a Python project. But this is not really important.
This may actually be a good idea. It could possibly stay available in a distro package that includes our scripts, but does not automatically get the Python modules pre-installed as the distro package dependencies. What do you think, @RecRanger?
Projects should work out of the box. This PR is how you make it work out-of-the-box.
Ideas for AFTER this bare-minimum PR is merged
- If you want opt-in requirements for features, then setup a pyproject.toml.
- Requirements can be pinned with version ranges.
- Bots can be used to bump requirements, given adequate test cases.
There is no perfect solution for balancing security, functionality, and developer experience. This is the most basic solution. Let's upgrade incrementally from a bare-minimum.
I've just merged @exploide's #5478 (obviously triggered by your work here @RecRanger, thanks!) so now we need to revise the changes here accordingly.
Revised the "pycrypto" to "pycryptodome" dependency. Also squashed the commits.
Thanks, but now the commit message records change history within this PR, which is temporary and irrelevant for our git history. So if it's just one commit, it should say just "Add requirements.txt and update .gitignore for Python" (or this could be 2 commits).
Soooo like, is someone gonna merge this PR? As much as I'd love critique on my git technique and gitignore setup, I'm actually just here because the 100 contributors that came before me didn't setup the gitignore for 1 out of 2 languages used in this repo.
Soooo like, is someone gonna merge this PR?
OK, I did a squash-and-merge on this to set a commit message that matches the final changes. Thank you!