jadx icon indicating copy to clipboard operation
jadx copied to clipboard

Future of `.jobf` files

Open NebelNidas opened this issue 2 years ago • 4 comments

What's the purpose of .jobf files? I see three points:

  1. Exporting renamed class/field/method names. While they can do that, they are quite useless compared to the formats I added support for in #1505, since they can't handle comments, method args or method vars, and they include auto-generated names (why would you want to export them?).
  2. Using them as some sort of "cache" for the auto-generated names. I don't think that's a good argument since these are generated just as fast as they are read from disk, and caching should be done in JADX's cache folder anyway.
  3. Importing deobfuscation mappings. That's the only use case that justifies .jobf's existence IMO, since you can't do that easily otherwise. But since I'm currently working on mapping-io import support (as a follow up to #1505), which, as noted above, is much better as the formats can handle more features, this point also becomes obsolete.

So, why am I opening this issue? Well, since I'm already working on mapping-io import support, I would like to replace this mediocre file format and the stuff surrounding it. I'd like to have something similar to how Enigma handles it:

  • Import mappings via the GUI or via the--deobf-map-file launch parameter (renamed from --deobf-cfg-file)
  • Rename --deobf-cfg-file-mode to --deobf-map-file-mode and make the following adjustments:
    • Remove overwrite and ignore as they wouldn't do anything, since we won't automatically load files from the working dir anymore, the user now has to explicitly specify the file he wants to load
    • Rename read-or-save to read-and-autosave-every-change
    • Add read-and-autosave-before-closing, which will autosave the mappings only when the user exits the program / closes the project
  • Rework the internal rename tracking state management. We should have a static MemoryMappingTree somewhere from which we load mappings and to which we write mappings. This should ideally eventually replace the current system of JadxCodeRenames and JadxCodeComments, but that will be work for another PR (my idea is to use Tiny v2 as a base, but add JADX's SsaVars into it somewhere so method vars entirely generated by the decompiler can still be handled). For the current PR though, these two systems will exist side by side.

What do you think?

NebelNidas avatar Jun 12 '22 13:06 NebelNidas

include auto-generated names (why would you want to export them?)

That is the main reason why jobf was added: to keep aliases unchanged between different jadx version (changes in deobfuscator and jadx processing/decompilation). Also, it was handy to have a simple format for manual edit or scripting (some people implement their own deobfuscator this way). So jobf format will be kept in jadx for compatibility reasons anyway.

and caching should be done in JADX's cache folder anyway.

Hm. I actually can save jobf file in cache now, thanks! I disable saving jobf file by default, because people complain that such files pollute their folders, but now cache folders everywhere 😄 Maybe I should move all cache folders to some common system directory :thinking:... ok, that's another issue

Rework the internal rename tracking state management. We should have a static MemoryMappingTree ...

Jadx already have such place to store all data: RootNode and aliases stored in *Info classes (check this method). I can add an easier to use plugin interface for that.

This should ideally eventually replace the current system of JadxCodeRenames and JadxCodeComments

I think this is a bad idea, because as your PR shown other formats can be hard to convert/apply for jadx internal structures, also these format don't support dex, and use java bytecode internal info. So I prefer to use jadx own implementation and file formats for keeping user and project data.

Rename --deobf-cfg-file-mode to --deobf-map-file-mode and make the following adjustments:

Well, rename cfg-file to map-file is fine, but this should be done in easy for migration way: keep both names and print deprecation warning if old version is used. And this is too much hassle, so I prefer not to do this. Also, you want to remove overwrite but next add read-and-autosave-* which is same for me. And these option can be used than you reopen project (it will be nice to save path to map file into project) so jadx will load these files on open (so ignore can be useful to check code without applying mappings)

And ... looks like I reject all your ideas, so I will share my vision of these changes:

  • First we need to add new formats and only after some time we can discuss removing old formats
  • I want to keep only decompilation code in jadx-core so all not related new features should be added to external and optional modules (jadx-gui fine for now until plugin api not quite ready). In short: in jadx-gui do anything you want, but keep away from jadx-core :rofl:
  • For import mappings: new methods for adding custom jadx passes should be added (similar to #1482), so you can implement similar to RenameVisitor pass (pre-decomplation stage) and add it in jadx-gui, there you will have access to RootNode and all classes and methods. To access variables inside methods, you will need to add a pass similar to CodeRenameVisitor (decompilation stage pass).
  • These new passes can be added only if --deobf-map-file set or saved path in jadx project (i.e field deobfuscationMapFile set in JadxArgs)
  • Such approach will allow to load user renames and use deobfuscator at the same time. Also, it will allow using these passes in jadx-cli after extraction into plugin module.

skylot avatar Jun 12 '22 15:06 skylot

That is the main reason why jobf was added: to keep aliases unchanged between different jadx version

Interesting, I couldn't think of a reason why anybody would want this before, but you're right, it might be annoying if your workflow got disrupted by updating to a new JADX version with a different naming behavior, yup.

So jobf format will be kept in jadx for compatibility reasons anyway.

But only for the next few releases I hope? Because since we're already using mapping-io, it would be best to use format supported by it to reduce the maintainability burden. I'd suggest leaving .jobf import support in until the next major release (major releases allow breaking changes), but automatically export to a newer format.

Hm. I actually can save jobf file in cache now, thanks!

Good idea, yeah, it won't bother anyone in there. 😉 This would allow us to remove the --deobf-cfg-file and --deobf-cfg-file-mode arguments in the future, as it would just be a regular cache file without special value (users could place the file manually in the cache dir if they absolutely need a custom old version, but I don't think anyone would actually do that). For the moment, this is what I'll do:

  • Keep --deobf-cfg-file for importing .jobf files (mark as deprecated with no replacement, since it will be moved to the cache dir)
  • Keep --deobf-cfg-file-mode (mark as deprecated with no replacement)
  • Add --renames-mapping-file for importing mapping-io supported formats (I avoided the word deobfuscation to prevent confusion with the auto-deobfuscation JADX can do)
  • Add --renames-mapping-file-mode (see my initial comment)

~~I think this is a bad idea, because as your PR shown other formats can be hard to convert/apply for jadx internal structures~~

~~I wouldn't say so, you would be able to use mappings from mapping-io just as easily as JadxCodeRename objects, you'd just have to structure the code a little differently compared to the current design (which I'm doing anyways now for my PR). And yes, the current mapping formats are all targeted towards Java bytecode, but as I already said, that doesn't prevent us from taking e.g. Tiny v2 and modifying it slightly to accommodate JADX's needs (would be like three or four little changes in the spec, we could name the format JadxTiny).~~

~~So I prefer to use jadx own implementation and file formats for keeping user and project data.~~

~~JADX's project data would be mostly unaltered, the only thing I'd like to do is moving the rename data out of the .jadx file and into a .jadxtiny mapping file, see above.~~

Maybe not such a good idea after all, I'll have to think about it.

And these option can be used than you reopen project (it will be nice to save path to map file into project)

Good idea!

In short: in jadx-gui do anything you want, but keep away from jadx-core 🤣

I'll try to stay away from modifying it as much as I can! 👍 Will auto-deobfuscation also be moved into a plugin eventually?

For import mappings: new methods for adding custom jadx passes should be added [...]

Thanks for the tip! :) I'll open a PR once I've figured most of it out.

NebelNidas avatar Jun 12 '22 18:06 NebelNidas

But only for the next few releases I hope? ... to reduce the maintainability burden

Why you want to remove everything? :joy: There is no any burden to maintain stable and tested features, but it is a real nightmare to maintain new features (i.e. bugs everywhere) :rofl:

Will auto-deobfuscation also be moved into a plugin eventually?

Yes, I really want to do this, also this is a first candidate for scripting support (#1175), because current deobfuscation conditions hard to understand, and they don't support real world cases. Also, many people don't like new generated names because they don't make code easier to read. So main goal is to provide an easy way to change these functions. This can be a several bundled implementation, 3rd party plugins or user scripts.

skylot avatar Jun 12 '22 19:06 skylot

Why you want to remove everything? 😂 There is no any burden to maintain stable and tested features, but it is a real nightmare to maintain new features (i.e. bugs everywhere) 🤣

Well, having to deal with only one library for reading/writing mapping files is certainly easier than having two ;) And since cache files aren't user-facing anyway, I don't really see a problem. mapping-io is used in many projects, so it certainly can be considered stable, and on top of that the new format would be intercompatible with other programs, which is always good to have 🙂

Also, many people don't like new generated names because they don't make code easier to read

Yeah, I was really confused when I first toggled this feature on in JADX, it didn't help in any capacity 😅 Being able to register custom naming plugins sounds great! Enigma does provide such an interface, maybe you can look at how it's handled there. Here's an example plugin: https://github.com/QuiltMC/quilt-enigma-plugin

NebelNidas avatar Jun 12 '22 21:06 NebelNidas