Automatic binding generation

Open madsmtm opened this issue 1 year ago • 8 comments

This is very similar to using bindgen, but I went with creating things from scratch because:

  1. bindgen is kinda intimidating, knowing so little about libclang, so I started here first; I'd still like to upstream some of this, but I'll start here for now.
  2. This only has to run on my machine every once in a while, whereas bindgen has to handle all C code under the sun. A few manual fixes and such are acceptable.
  3. I'd like to run this multiple times for different targets (iOS, macOS, tvOS, ... - GNUStep will probably suffer a bit here), and then merge the result using #[cfg] so that we can ship one, ready-to-use library.
  4. I'd like to enrich the output using separate files (.apinotes or similar), and such a thing will probably never belong in bindgen
  5. Other things that are harder to integrate into something existing, for example availability attributes.
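
To make point 3 concrete, here is a minimal sketch of how per-target results could be merged into `#[cfg]` attributes. The function names `merge` and `cfg_attr_for`, and the choice of `target_os` as the predicate, are assumptions made for illustration; nothing in this thread says the actual generator works this way.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Merge per-target item lists into a map from item name to the set of
/// targets on which the item was found. (Hypothetical helper.)
pub fn merge<'a>(
    per_target: &[(&'a str, Vec<&'a str>)],
) -> BTreeMap<&'a str, BTreeSet<&'a str>> {
    let mut merged: BTreeMap<&'a str, BTreeSet<&'a str>> = BTreeMap::new();
    for (target, items) in per_target {
        for item in items {
            merged.entry(*item).or_default().insert(*target);
        }
    }
    merged
}

/// Return the `#[cfg(...)]` attribute to emit for an item, or `None` if the
/// item exists on every target and needs no gating. (Hypothetical helper.)
pub fn cfg_attr_for<'a>(
    found_on: &BTreeSet<&'a str>,
    all_targets: &BTreeSet<&'a str>,
) -> Option<String> {
    if found_on == all_targets {
        return None;
    }
    let predicates: Vec<String> = found_on
        .iter()
        .map(|os| format!("target_os = \"{}\"", os))
        .collect();
    Some(match predicates.as_slice() {
        // A single target needs no `any(...)` wrapper.
        [single] => format!("#[cfg({})]", single),
        many => format!("#[cfg(any({}))]", many.join(", ")),
    })
}
```

With results for `macos` and `ios`, an item seen on both gets no attribute, while an item seen only on `macos` gets `#[cfg(target_os = "macos")]`.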

That said, I have taken inspiration from the existing Objective-C implementation in bindgen, credits to @simlay.

In the end, I'm envisioning something like this (so objc2 would be the Apple equivalent to the windows crate):

objc2/src/
  foundation/
    generated/
      ...
    unsafe_fns.toml // Or `Foundation.apinotes`, or similar
    string.rs
    mutable_string.rs
    ...
  appkit/...
  core_data/...
  core_display/...
  // And so on, for the Apple frameworks that expose an Objective-C interface

madsmtm avatar Sep 08 '22 12:09 madsmtm

I estimated the size of the code if we included all of Apple's frameworks (assuming Rust and C have similar line ratios); it would be around 13MB zipped, 50MB unzipped. For reference, the windows crate is ~14MB zipped, ~200MB unzipped.

madsmtm avatar Sep 09 '22 17:09 madsmtm

Related projects:

  • https://github.com/youknowone/apple-sys
  • https://github.com/indygreg/apple-platform-rs (apple-sdk)
  • https://github.com/apple/swift/tree/swift-5.7-RELEASE/lib/ClangImporter

madsmtm avatar Sep 09 '22 18:09 madsmtm

I probably won't put this under objc2, instead I'll create a new crate with the frameworks. Possible names, in order of my current preference:

  1. apple, is an empty crate owned by @zu1kd, have sent them an email to discuss name transfer
  2. icrate, a play on Apple's "i"-prefixing (iPhone/iPad/...)
  3. ilib, similar to above
  4. frameworks, ambiguous since also commonly used for "web frameworks"
  5. appel, wordplay

madsmtm avatar Sep 19 '22 11:09 madsmtm

Example of what I'd like at least some of AppKit to look like: https://github.com/rust-windowing/winit/tree/fafdedfb7d3a7370ca4b01108f7713b685633164/src/platform_impl/macos/appkit

madsmtm avatar Sep 23 '22 22:09 madsmtm

init methods always return the same type as the receiver: https://clang.llvm.org/docs/AutomaticReferenceCounting.html#related-result-types

How can we model this nicely?
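
One possible direction, as a rough sketch only: make `init` generic over the receiver's class, so that the returned handle has the same type parameter as the allocated object that was passed in. Every name below (`Class`, `IsKindOf`, `Allocated`, `Owned`, `alloc`) is a hypothetical stand-in invented for this example, and the types are empty markers rather than real Objective-C objects.

```rust
use std::marker::PhantomData;

/// Hypothetical trait implemented by every bound class.
pub trait Class {
    const NAME: &'static str;
}

/// Hypothetical marker: `Self` is `Super` or one of its subclasses.
pub trait IsKindOf<Super: Class>: Class {}

/// Hypothetical handle to an allocated but uninitialized instance of `T`.
pub struct Allocated<T: Class>(PhantomData<T>);

/// Hypothetical owning handle to an initialized instance of `T`.
pub struct Owned<T: Class>(PhantomData<T>);

impl<T: Class> Owned<T> {
    pub fn class_name(&self) -> &'static str {
        T::NAME
    }
}

pub fn alloc<T: Class>() -> Allocated<T> {
    Allocated(PhantomData)
}

pub struct NSObject;
pub struct NSString;

impl Class for NSObject {
    const NAME: &'static str = "NSObject";
}
impl Class for NSString {
    const NAME: &'static str = "NSString";
}
impl IsKindOf<NSObject> for NSObject {}
impl IsKindOf<NSObject> for NSString {}

impl NSObject {
    /// Declared once on the root class, but the result type follows the
    /// receiver: passing `Allocated<NSString>` yields `Owned<NSString>`.
    pub fn init<T: IsKindOf<NSObject>>(_this: Allocated<T>) -> Owned<T> {
        Owned(PhantomData)
    }
}
```

Here `NSObject::init(alloc::<NSString>())` has type `Owned<NSString>`, which mirrors the related-result-type rule at the type level without any casts.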

madsmtm avatar Oct 04 '22 21:10 madsmtm

Idea: use Swifty names? https://github.com/apple/swift-evolution/blob/main/proposals/0005-objective-c-name-translation.md
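
As a small illustration of what a name translation step could look like, here is a naive sketch that turns an Objective-C selector into a snake_case Rust identifier. This is not the algorithm from the linked Swift proposal (which also prunes redundant type names, among other things); it only handles casing and colons, and the function name `selector_to_snake_case` is made up for this example.

```rust
/// Convert a selector such as `initWithBytes:length:encoding:` into a
/// snake_case identifier. Colons separate the parts; each part is split at
/// lower-to-upper boundaries, and runs of capitals are kept together so
/// that `URLWithString:` becomes `url_with_string`.
pub fn selector_to_snake_case(selector: &str) -> String {
    let mut out = String::new();
    for part in selector.split(':').filter(|p| !p.is_empty()) {
        if !out.is_empty() {
            out.push('_');
        }
        let chars: Vec<char> = part.chars().collect();
        for (i, &c) in chars.iter().enumerate() {
            if c.is_ascii_uppercase() && i > 0 {
                let prev = chars[i - 1];
                let next_is_lower = chars
                    .get(i + 1)
                    .map_or(false, |n| n.is_ascii_lowercase());
                // Start a new word after a lowercase letter or digit, or at
                // the end of an acronym (capital followed by lowercase).
                if prev.is_ascii_lowercase()
                    || prev.is_ascii_digit()
                    || (prev.is_ascii_uppercase() && next_is_lower)
                {
                    out.push('_');
                }
            }
            out.push(c.to_ascii_lowercase());
        }
    }
    out
}
```

A real translation would also need to deal with collisions (two selectors mapping to the same identifier) and Rust keywords, which this sketch ignores.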

madsmtm avatar Oct 05 '22 18:10 madsmtm

Idea: Use (parts of) Swift's test suite (for example this one for errors: https://github.com/apple/swift/blob/7123d2614b5f222d03b3762cb110d27a9dd98e24/test/Inputs/clang-importer-sdk/usr/include/errors.h)

madsmtm avatar Oct 05 '22 18:10 madsmtm

Clang has "module maps" for figuring out what to import from Objective-C headers - we should use those!

madsmtm avatar Oct 10 '22 18:10 madsmtm

Hey,

I'm really excited about this enhancement and would find it super useful for some work I'm doing on binding the AuthenticationServices framework. With that in mind, I have a couple of high-level questions.

  1. Do you have a rough ETA for when this might land? If it's soon-ish I'll probably hold off on writing my own bindings, but otherwise I might just proceed on my own.
  2. If you think this is still a little ways out, is there any way I could help to bring this around sooner?

ericmarkmartin avatar Oct 30 '22 19:10 ericmarkmartin

Currently I'm trying to push forwards on an initial version that actually compiles Foundation and AppKit.

Primary missing items are:

  • Proper imports
  • Struct definitions
  • A few remaining typedefs
  • https://github.com/madsmtm/objc2/pull/244
  • The remaining stuff to get things to compile

Timeline on this is unknown, but I have quite a lot of free time this week, so I should be able to get pretty far. I can probably give you a better estimate at the end of the week, if I'm not done by then.

Once I have this finished, there'll be lots of smaller things to work on, which I could probably use some help with, including:

  • Actual type-safety (creating newtypes instead of type-aliases for enums, typedefs and so on, allowing us to later on mark various methods as safe)
  • Extern functions
  • Inline functions (requires quite a lot, since we effectively have to translate C code to Rust. But we can probably get quite far with a few tricks)
  • Block support
  • Protocols: #250
  • "out" parameters: #277
  • Autogenerated documentation (using code comments, and ideally data from developer.apple.com)
  • Possibly a more "swifty" translation (tweaking method names, making enums associated constants). Probably better to do after we have autogenerated documentation, since there is great value in the current design where you can very easily find what you're looking for
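
To illustrate the "actual type-safety" item, and the idea of making enum variants associated constants, here is a minimal sketch using `NSComparisonResult`. It assumes `isize` as the stand-in for `NSInteger`; the `to_ordering` helper is invented for the example and is not something the generator is known to emit.

```rust
use std::cmp::Ordering;

/// With a plain alias, any integer is accepted wherever the enum is expected.
pub type NSComparisonResultAlias = isize;

/// With a newtype, callers have to go through the named constants (or
/// construct the value explicitly), while the layout stays that of the
/// underlying integer thanks to `repr(transparent)`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NSComparisonResult(pub isize);

#[allow(non_upper_case_globals)]
impl NSComparisonResult {
    pub const NSOrderedAscending: Self = Self(-1);
    pub const NSOrderedSame: Self = Self(0);
    pub const NSOrderedDescending: Self = Self(1);

    /// Hypothetical convenience: map to `std::cmp::Ordering`, returning
    /// `None` for values outside the three known cases.
    pub fn to_ordering(self) -> Option<Ordering> {
        match self.0 {
            -1 => Some(Ordering::Less),
            0 => Some(Ordering::Equal),
            1 => Some(Ordering::Greater),
            _ => None,
        }
    }
}
```

Because C enums can carry values that are not listed in the header, a struct with constants is used here rather than a Rust `enum`, for which an unlisted value would be undefined behaviour.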

I'll try to see if I can figure out a good way for you / somebody else to contribute with some of that, when I get there.

madsmtm avatar Oct 31 '22 19:10 madsmtm

I could use some guidance though: I'll admit I'm not the best at git, and I'm unsure whether storing the generated result in git is the best solution. It does make audits (security and just general) easier, at the cost of blowing up the number of files we store. An alternative could be to create a new repo just for it.

But what do you think?

madsmtm avatar Nov 01 '22 14:11 madsmtm

> I tend to like using a generator crate in a build script

Yeah, I'm aware of that approach, my desire with header-translator/icrate was to do things a bit differently than bindgen, to improve:

  • Compilation speed, since the work is only done once on my local machine, and not on every user's machine (here I'm hoping that download speed will not be a limiting factor)
  • Documentation, since items that are only available on one platform would be hidden behind appropriate cfgs
  • The need to have a specific SDK version installed (would help with cross-compiling on Linux)

> I believe it can change from macOS/SDK version to version.

I think my desire was to manually keep icrate up to date with new SDK versions. I'm currently compiling for the macOS 12.3 SDK, on a machine running macOS 10.14.6.

madsmtm avatar Nov 01 '22 20:11 madsmtm

Status: cargo build -picrate --features Foundation now works! 🎉

A clean build takes ~25 seconds, a build with a single file changed takes ~11 seconds (on my pretty old machine).

Using cargo +nightly rustc -picrate --features Foundation -- -Z self-profile && summarize summarize icrate-{pid}.mm_profdata, we can get rustc to tell us where it's spending its time. Clean build:

+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| Item                                                | Self time | % of total time | Time     | Item count | Incremental result hashing time |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_module_codegen_emit_obj                        | 19.54s    | 35.855          | 19.54s   | 256        | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_passes                                         | 6.88s     | 12.623          | 6.88s    | 1          | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| codegen_module                                      | 3.91s     | 7.181           | 4.64s    | 256        | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| expand_crate                                        | 3.86s     | 7.072           | 3.86s    | 1          | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| codegen_crate                                       | 2.58s     | 4.728           | 7.24s    | 1          | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| typeck                                              | 1.99s     | 3.658           | 2.27s    | 15323      | 59.69ms                         |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| mir_borrowck                                        | 1.71s     | 3.144           | 3.99s    | 15323      | 7.70ms                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| mir_drops_elaborated_and_const_checked              | 743.11ms  | 1.363           | 888.30ms | 15323      | 4.41ms                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| incr_comp_intern_dep_graph_node                     | 654.54ms  | 1.201           | 1.14s    | 1556795    | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_module_optimize                                | 640.50ms  | 1.175           | 640.50ms | 256        | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| mir_built                                           | 627.18ms  | 1.151           | 1.06s    | 15323      | 162.19ms                        |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
| LLVM_module_codegen                                 | 569.89ms  | 1.045           | 20.11s   | 256        | 0.00ns                          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+---------------------------------+
// SNIP

Single file changed:

+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| Item                                             | Self time | % of total time | Time     | Item count | Incremental load time | Incremental result hashing time |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| expand_crate                                     | 4.48s     | 38.074          | 4.48s    | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| generate_crate_metadata                          | 639.36ms  | 5.439           | 1.87s    | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| incr_comp_load_dep_graph                         | 536.81ms  | 4.567           | 536.81ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| hir_crate                                        | 483.00ms  | 4.109           | 582.56ms | 1          | 0.00ns                | 4.18µs                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| incr_comp_encode_dep_graph                       | 376.30ms  | 3.201           | 376.30ms | 1569613    | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| resolve_access_levels                            | 313.73ms  | 2.669           | 313.73ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| link_rlib                                        | 276.83ms  | 2.355           | 276.83ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| incr_comp_query_cache_promotion                  | 245.21ms  | 2.086           | 352.09ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| encode_query_results_for                         | 243.17ms  | 2.069           | 243.17ms | 61         | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| optimized_mir                                    | 242.48ms  | 2.063           | 244.80ms | 80         | 242.43ms              | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| monomorphization_collector_graph_walk            | 224.45ms  | 1.910           | 523.33ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| late_resolve_crate                               | 216.57ms  | 1.843           | 216.57ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| incr_comp_serialize_result_cache                 | 176.08ms  | 1.498           | 419.39ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| privacy_access_levels                            | 157.35ms  | 1.339           | 248.73ms | 1          | 0.00ns                | 3.16ms                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| codegen_copy_artifacts_from_incr_cache           | 122.74ms  | 1.044           | 122.74ms | 256        | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
| crate_lints                                      | 118.53ms  | 1.008           | 296.36ms | 1          | 0.00ns                | 0.00ns                          |
+--------------------------------------------------+-----------+-----------------+----------+------------+-----------------------+---------------------------------+
// SNIP

I would like to improve this, but don't really know where I should look. Perhaps I'll try to see if putting everything in one file is better? Otherwise, maybe extern_methods! can be made more efficient?

madsmtm avatar Nov 02 '22 04:11 madsmtm

Putting everything in one file does not seem to affect the result in any meaningful way, so doing the imports in a different way also probably won't have any impact.

madsmtm avatar Nov 02 '22 05:11 madsmtm

For comparison, I tried compiling the following:

[dependencies.windows]
version = "0.43.0"
features = [
    "Foundation_Collections",
    "Foundation_Diagnostics",
    "Foundation_Metadata",
    "Foundation_Numerics",
    "Data_Xml_Dom",
    "Win32_Foundation",
    "Win32_Security",
    "Win32_System_Threading",
    "Win32_UI_WindowsAndMessaging",
]

There, the compiler spent its time as follows:

+-----------------------------------------------------+-----------+-----------------+----------+------------+
| Item                                                | Self time | % of total time | Time     | Item count |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_module_codegen_emit_obj                        | 9.65s     | 36.293          | 9.65s    | 16         |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_passes                                         | 2.74s     | 10.320          | 2.74s    | 1          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| codegen_module                                      | 2.10s     | 7.903           | 2.42s    | 16         |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| typeck                                              | 1.84s     | 6.934           | 2.05s    | 19480      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| mir_borrowck                                        | 1.44s     | 5.413           | 2.87s    | 19480      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| mir_drops_elaborated_and_const_checked              | 743.73ms  | 2.798           | 862.04ms | 19480      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| expand_crate                                        | 525.88ms  | 1.978           | 551.30ms | 1          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| hir_crate                                           | 430.01ms  | 1.618           | 473.48ms | 1          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| eval_to_allocation_raw                              | 347.30ms  | 1.307           | 886.66ms | 13881      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| mir_built                                           | 336.45ms  | 1.266           | 608.52ms | 19480      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| check_well_formed                                   | 326.72ms  | 1.229           | 565.58ms | 28414      |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| LLVM_module_optimize                                | 319.08ms  | 1.200           | 319.08ms | 16         |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
| generate_crate_metadata                             | 315.63ms  | 1.187           | 1.76s    | 1          |
+-----------------------------------------------------+-----------+-----------------+----------+------------+
// SNIP

Notably, we spend more time in the expand_crate step (since the windows crate has expanded most macros by default), but percentage-wise everything else looks roughly the same.

Thinking about it, it does make sense that codegen'ing icrate takes a long time: every method we have there requires creating a new static for the selector, along with a generic call to MessageReceiver::send_message that needs to be fully instantiated!
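
A mock of the shape being described, to show where the work comes from. This is only a sketch: `CachedSel` and this free-standing `send_message` are stand-ins invented for the example (the mock just counts sends and returns a default value), not the real objc2 types or the real expansion of `extern_methods!`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a cached selector. One of these is emitted per method.
pub struct CachedSel {
    pub name: &'static str,
    pub sends: AtomicUsize,
}

impl CachedSel {
    pub const fn new(name: &'static str) -> Self {
        Self { name, sends: AtomicUsize::new(0) }
    }
}

/// Stand-in for a generic message send. rustc has to instantiate (and LLVM
/// has to compile) one copy for each distinct combination of `A` and `R`.
pub fn send_message<A, R: Default>(sel: &CachedSel, _args: A) -> R {
    // A real implementation would call into the Objective-C runtime here.
    sel.sends.fetch_add(1, Ordering::Relaxed);
    R::default()
}

// One static per selector...
pub static SEL_LENGTH: CachedSel = CachedSel::new("length");
pub static SEL_IS_EQUAL: CachedSel = CachedSel::new("isEqual:");

// ...and one wrapper per method, each forcing its own instantiation.
pub fn length() -> usize {
    send_message::<(), usize>(&SEL_LENGTH, ())
}

pub fn is_equal(other: usize) -> bool {
    send_message::<(usize,), bool>(&SEL_IS_EQUAL, (other,))
}
```

Multiplied by the roughly 15,000 functions that show up in the `typeck` item count above, even small per-method costs like these add up.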

madsmtm avatar Nov 02 '22 14:11 madsmtm

Note to self: Swift has a pretty comprehensive test suite for their Objective-C import functionality, maybe we could reuse a few headers from there? https://github.com/apple/swift/blob/7123d2614b5f222d03b3762cb110d27a9dd98e24/test/Inputs/clang-importer-sdk/usr/include/errors.h

madsmtm avatar Nov 03 '22 22:11 madsmtm

Wrt. compilation times: I'm considering adding features such as Foundation_NSArray to enable functionality from Foundation/NSArray.h (NSArray, NSMutableArray, and probably all the things that are transitively enabled by this, such as NSString, NSObject, NSOrderedCollectionDifferenceCalculationOptions, ...). Of course the Foundation feature would still just enable everything.
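
The "transitively enabled" part could be computed along these lines. This is a sketch under assumptions: apart from `Foundation_NSArray`, the feature names used with it (`Foundation_NSString`, `Foundation_NSObject`) are hypothetical, and the dependency map would have to come from the header imports.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Given a map from each feature to the features it directly requires,
/// return everything that ends up enabled when `requested` is turned on.
pub fn enabled_features<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    requested: &'a str,
) -> BTreeSet<&'a str> {
    let mut enabled = BTreeSet::new();
    let mut stack = vec![requested];
    while let Some(feature) = stack.pop() {
        // `insert` returns false for features already visited, which also
        // keeps cyclic imports from looping forever.
        if enabled.insert(feature) {
            if let Some(children) = deps.get(feature) {
                stack.extend(children.iter().copied());
            }
        }
    }
    enabled
}
```

In a real Cargo manifest the same closure falls out of listing the direct dependencies under each feature; the function above is mostly useful for estimating how much code a single header-level feature would actually pull in.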

madsmtm avatar Nov 04 '22 00:11 madsmtm

AppKit now compiles!

Clean build 1m 30s, clean check build 52s.

madsmtm avatar Nov 04 '22 02:11 madsmtm

Do you have instructions somewhere on what you're doing to build AppKit? I'd like to try building AuthenticationServices

ericmarkmartin avatar Nov 04 '22 04:11 ericmarkmartin

> Do you have instructions somewhere on what you're doing to build AppKit? I'd like to try building AuthenticationServices

There's a bit of documentation in header-translator/README.md, but for adding new frameworks you need to do a few extra steps. I've pushed commit https://github.com/madsmtm/objc2/pull/264/commits/65643ff513adae75d84bf33db33c3ded71753c3e which adds support for AuthenticationServices, so you can try that.

As you can see there, I needed to do a few manual fixes to work around e.g. ASViewController being a typedef for NSViewController or UIViewController. Since the different frameworks are enabled with features, I'd have a bit of a hard time enabling AppKit or UIKit conditionally, so I just stubbed it out for now.

Note also that protocols are not properly supported yet, so it is probably of limited use for now.

madsmtm avatar Nov 04 '22 19:11 madsmtm

I've thought about the git situation some more, and I think I'll create a separate repository containing only the generated artifacts, and link it to this one as a git submodule. The reasoning is to avoid blowing up the git history of this project, as well as to protect the rest of the project against DMCA takedowns in case Apple decide that they don't like autogenerated derivations of their header files.

Speaking of, does anyone here know which license it would be proper to release icrate under? The license field supports OR and AND, so we could do something there? Or maybe we should just provide our own license-file field, and then specify that the user must comply with Apple's terms there?

madsmtm avatar Nov 09 '22 22:11 madsmtm