
Brings initial Rust support to john

Open claudioandre-br opened this issue 3 years ago • 4 comments

If merged, one should:

  • create a new format as a C file that references the Rust code as 'outer' (external) code [1];
    • then develop the library in Rust and export the necessary names/symbols;
    • next, put the (external) function names in struct fmt_main fmt_rust as usual;
  • Autotools tooling will build the "Rust format(s)" if Rust is installed.

Done or doing:

  • I'm re-implementing the dummy format in Rust (I'm out of ideas);
  • Right now, I'm already calling the valid() that was developed in Rust, and configure/make detect whether Rust is present;
  • New developers could take 'my' dummy format as an example.

Why:

  1. This could produce good noise within the JtR community;
  2. Developing a new format in Rust might not be all that interesting, but creating more complex stuff could be cool, e.g., (re)creating a mode sounds interesting.

[1] at the top of the C file:

// Routines imported from Rust.
void valid_in_rust();
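
For illustration only, the Rust side of such an import could look roughly like the sketch below. Everything in it is an assumption rather than the code from this issue: the name valid_in_rust_sketch, the signature (a C string in, an int out, which differs from the bare prototype above), and the acceptance rule (a "$dummy$" tag followed by a non-empty, even-length hex string). The matching C prototype would be int valid_in_rust_sketch(const char *ciphertext);.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Tag assumed for this sketch only; not taken from the issue.
const TAG: &[u8] = b"$dummy$";

/// Safe core: accept TAG followed by a non-empty, even-length hex string.
fn is_valid(ciphertext: &[u8]) -> bool {
    match ciphertext.strip_prefix(TAG) {
        Some(hex) => {
            !hex.is_empty()
                && hex.len() % 2 == 0
                && hex.iter().all(|b| b.is_ascii_hexdigit())
        }
        None => false,
    }
}

/// Hypothetical exported symbol. The attribute keeps the name unmangled so
/// that C code can link against it (older toolchains spell it `#[no_mangle]`).
///
/// # Safety
/// `ciphertext` must be NULL or point to a NUL-terminated string.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn valid_in_rust_sketch(ciphertext: *const c_char) -> c_int {
    if ciphertext.is_null() {
        return 0;
    }
    // Borrow the C string without copying; the bytes exclude the final NUL.
    let bytes = unsafe { CStr::from_ptr(ciphertext) }.to_bytes();
    is_valid(bytes) as c_int
}
```

Keeping the check in a safe helper and confining the unsafe code to the thin extern "C" wrapper is one possible design; it lets the logic be unit-tested from Rust without going through raw pointers.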

claudioandre-br avatar Oct 18 '22 21:10 claudioandre-br

I have mixed feelings about this. On one hand, I think Rust is a good language and a good fit for future projects like ours (not without drawbacks, though). On the other hand, I think it'll actually hurt our existing project if we start adding a little bit of Rust (and a dependency on it for building the corresponding functionality). We would have the drawbacks mostly without the advantages.

So for this existing project, maybe if there's ever existing non-trivial third-party code in Rust that we want to reuse. For example, someone external to our project develops and maintains a complex candidate passwords generator in Rust, and it's so good we want to have it, then I wouldn't object to us adding it as an optional feature (with optional build-time dependency on Rust).

Just for the good noise and bad flames in the community, probably no, but I'm happy to hear you @claudioandre-br are learning Rust and practicing with integrating its usage in our tree. Those skills are going to be useful, whether for this project or elsewhere. I wish I had the time to learn Rust now, too.

solardiz avatar Oct 19 '22 16:10 solardiz

I think it'll actually hurt our existing project if we start adding a little bit of Rust

Potentially.

  • But how do you add t1ha to john? Copying code? Would you also like to test Wyhash or FxHash (a simple hash from Mozilla)?
    • All of these are available and ready to use on crates.io; about 5 lines of Rust are needed to use each of them.
  • I remember john has some copied code that needs an update. If it is available on crates.io, this would be another potential improvement.

And so on. Drawbacks and improvements are expected, as always.

As one example, if you really want to play with t1ha and john, using Rust will make your life easier. But copying code and handling some compiler warnings is not so difficult. So, it is a choice.

A quick question: does anyone remember anything in the john tree that needs updating? Is it available on crates.io?

claudioandre-br avatar Oct 19 '22 21:10 claudioandre-br

For tiny things like this, I think it's actually preferable to stay with code that is sometimes out-of-date but reviewed and tested by us than to have easy auto-updates. Such micro-dependencies remind me of what I read about the JavaScript/npm world, where it got crazy, and where a security compromise of a trivial package could quickly propagate to lots of large and important projects. So yes:

But how do you add t1ha to john? Copying code? Would you also like to test Wyhash or FxHash (a simple hash from Mozilla)?

For these, we should be copying their code. Since they'd be so performance critical for us, chances are we'd also customize them, e.g. for our expected input lengths or for not knowing the length in advance. (One advantage of our current trivial hash functions, such as those in unique.c, is that they work on NUL-terminated strings as input, whereas implementations of those third-party hashes generally accept a previously known length, so we'd waste time on a strlen before calling them... or would need to switch to usage of "Pascal strings" first.)
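
To make the NUL-terminated versus known-length point concrete, here is a minimal sketch in Rust. It uses 32-bit FNV-1a purely as a stand-in; it is not the hash from unique.c, nor t1ha, Wyhash or FxHash, and all function names are hypothetical. The first function hashes while scanning for the terminator in a single pass; the third has to find the length first (the strlen-like scan inside CStr::from_ptr) before it can call a slice-based hash.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// 32-bit FNV-1a constants (stand-in hash for this sketch).
const FNV_OFFSET: u32 = 0x811c_9dc5;
const FNV_PRIME: u32 = 0x0100_0193;

/// One pass: hash bytes while looking for the terminating NUL.
///
/// # Safety
/// `s` must point to a valid NUL-terminated string.
pub unsafe fn hash_nul_terminated(s: *const c_char) -> u32 {
    let mut h = FNV_OFFSET;
    let mut p = s as *const u8;
    unsafe {
        while *p != 0 {
            h = (h ^ u32::from(*p)).wrapping_mul(FNV_PRIME);
            p = p.add(1);
        }
    }
    h
}

/// Length-based interface, the shape most third-party hashes expose.
pub fn hash_with_len(data: &[u8]) -> u32 {
    data.iter()
        .fold(FNV_OFFSET, |h, &b| (h ^ u32::from(b)).wrapping_mul(FNV_PRIME))
}

/// Two passes: a length scan first, then the hash itself.
///
/// # Safety
/// `s` must point to a valid NUL-terminated string.
pub unsafe fn hash_c_string_two_pass(s: *const c_char) -> u32 {
    let bytes = unsafe { CStr::from_ptr(s) }.to_bytes();
    hash_with_len(bytes)
}
```

Both paths return the same value for the same string; the difference is only that the second one reads the input twice.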

solardiz avatar Oct 19 '22 21:10 solardiz

As a side note:

  • Semantic Versioning (SemVer) allows you to avoid these automatic updates. Let's just say these big projects weren't doing the right thing;

claudioandre-br avatar Oct 19 '22 22:10 claudioandre-br

We don't need this change nor even discussion currently. We have plenty of real issues and goals. Closing. We can always revisit/reopen if and when needed.

solardiz avatar May 25 '24 23:05 solardiz