rust-objc icon indicating copy to clipboard operation
rust-objc copied to clipboard

Sending messages is inefficient because of selector lookup

Open jrmuizel opened this issue 8 years ago β€’ 68 comments

It would be better if selectors were resolved at compile time instead of using sel_registerName

jrmuizel avatar Nov 26 '16 02:11 jrmuizel

I've been trying to mimic how clang does codegen for selectors. If you, say, call the hash method on an object it generates:

	.section	__TEXT,__objc_methname,cstring_literals
L_OBJC_METH_VAR_NAME_:                  ## @OBJC_METH_VAR_NAME_
	.asciz	"hash"

	.section	__DATA,__objc_selrefs,literal_pointers,no_dead_strip
	.align	3                       ## @OBJC_SELECTOR_REFERENCES_
L_OBJC_SELECTOR_REFERENCES_:
	.quad	L_OBJC_METH_VAR_NAME_

And then the selector passed to objc_msgSend is just loaded from L_OBJC_SELECTOR_REFERENCES_[0].

If I naively try to mimic this, with some code like:

const char *const MY_SELECTOR = "hash";

SEL selector = MY_SELECTOR;
NSUInteger hash = objc_msgSend(object, selector);

It does not go well, terminating with an unrecognized selector after printing:

NSForwarding: warning: selector (0x1008e0f8c) for message 'hash' does not match selector known to Objective C runtime (0x7fff9b927f35)-- abort`

With the naive mimicry, the generated code looks like:

	.section	__TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
	.asciz	"hash"

	.section	__DATA,__const
	.globl	_MY_SELECTOR            ## @MY_SELECTOR
	.align	3
_MY_SELECTOR:
	.quad	L_.str

The primary difference seems to be in the sections, particularly where MY_SELECTOR isn't in the __objc_selrefs section. Searching the objc runtime source code confirms that is important; in objc-file.mm there's the _getObjc2SelectorRefs function which looks through the __objc_selrefs section. I'm assuming that at initialization the runtime looks at all the selectors in this section and registers them.

So, some questions from here that I need to resolve:

  • is there a way in rust to change what sections a constant is in?
  • what happens if the same selector is used multiple places, is there a concern of it ending up in the __objc_selrefs section multiple times? Or will the linker de-duplicate them and I don't have to worry about it?

SSheldon avatar Dec 27 '16 19:12 SSheldon

I'll respond more in a bit but I got this sort of working:

    struct Foo(*const [u8; 9]);
    unsafe impl Send for Foo {}
    unsafe impl Sync for Foo {}

    {
        #[link_section="__TEXT,__objc_methname,cstring_literals"]
        static BAM : [u8; 9] = *b"Rustling\0";
        #[link_section="__DATA,__objc_selrefs,literal_pointers,no_dead_strip"]
        static LAMP: Foo = Foo(&BAM);
    }

Unfortunately, there doesn't seem to be a great way to generate the literal 9 constant.

I'm going to try to prototype a solution with a coworker in the new year.

jrmuizel avatar Dec 27 '16 19:12 jrmuizel

Ooh interesting, good find with #[link_section]. I think it might not matter whether the strings are inside the __objc_methname section, I can't find any reference to it being read in the runtime. Putting the pointers to the strings in __objc_selrefs might be enough.

This helped me get closer to the compiler's output in C with:

__attribute__((section("__DATA,__objc_selrefs")))
char *const MY_SELECTOR = "hash";

Which produces:

	.section	__TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
	.asciz	"hash"

	.section	__DATA,__objc_selrefs,literal_pointers,no_dead_strip
	.globl	_MY_SELECTOR            ## @MY_SELECTOR
	.align	3
_MY_SELECTOR:
	.quad	L_.str

Unfortunately then it fails to link with "Undefined symbols for architecture x86_64: _MY_SELECTOR". But maybe it'll go better if I try in Rust.

SSheldon avatar Dec 28 '16 18:12 SSheldon

I recall getting a segfault on load when the method name was not in the "__objc_methname" section.

jrmuizel avatar Dec 28 '16 23:12 jrmuizel

I threw together a simple macros 1.1-abuse crate which should generate code similar to the code which @jrmuizel posted above. It's super hacky but if you wrap it in a macro_rules! wrapper it might look OK?

I also haven't ensured that the output binary looks correct yet - so that's a thing too.

https://github.com/mystor/objc_methname

mystor avatar Jan 16 '17 22:01 mystor

Heads up - we appear to suffer from this issue heavily in gfx-backend-metal.

kvark avatar Jun 12 '18 20:06 kvark

I tried this piece of code in metal-rs with no success:

impl DeviceRef {
        pub fn name2(&self) -> &str { // new method
            struct Foo(*const [u8; 5]);
            unsafe impl Send for Foo {}
            unsafe impl Sync for Foo {}
     
            let s: &NSString = {
                #[no_mangle]
                #[link_section="__TEXT,__objc_methname,cstring_literals"]
                static OBJC_METH_VAR_NAME_ : [u8; 5] = *b"name\0";
                #[no_mangle]
                #[link_section="__DATA,__objc_selrefs,literal_pointers,no_dead_strip"]
                static OBJC_SELECTOR_REFERENCES_: Foo = Foo(&OBJC_METH_VAR_NAME_);
                unsafe {
                    let selector: objc::runtime::Sel = mem::transmute(OBJC_SELECTOR_REFERENCES_.0);
                    objc::__send_message(&*self, selector, ()).unwrap()
                }
            };
     
            s.as_str()
        }

        pub fn name(&self) -> &str { // old method
            unsafe {
                let name: &NSString = msg_send![self, name];
                name.as_str()
            }
        }
}

Result is:

NSForwarding: warning: selector (0x103e80630) for message name does not match selector known to Objective C runtime (0x7fff4bad3d1d)

kvark avatar Jun 15 '18 14:06 kvark

@kvark is it certain methods being called in a tight loop that are causing a problem for gfx-backend-metal? If you can identify the methods, you could avoid this by registering the selector once and reusing it. Something like this untested code:

let sel = sel!(name);
loop {
    let name: &NSString = obj.send_message(sel, ());
}

Maybe the selectors could be loaded in a lazy static?

SSheldon avatar Jun 15 '18 19:06 SSheldon

@SSheldon we are definitely considering that. It will be unfortunate to throw this out once the real static linking of selectors work out... Speaking of which, could anyone (including @jrmuizel ) provide a small self-contained example on how that linking magic works? As I said, my little try failed because of an unknown selector.

kvark avatar Jun 15 '18 21:06 kvark

I was never able to successfully static link a selector :/

@mystor were you seeing success with the macro crate you posted? I never got around to trying it myself.

SSheldon avatar Jun 15 '18 21:06 SSheldon

We attempted to use the code @mystor provided, gone through a few iterations with them, but ended up with a linker error. The plan was to go back and try to hard-code it to see if the simplest case works. It didn't, even though for a different reason - selector being not found (run-time error).

The problem with explicit caching of sel!() is that it's hard to do without rewriting everything. We'll end up with something like a obj_class! macro that defines the methods as well as their selectors in lazy/thread-local storage. That API would be nicer than hard-coded stuff we have, but it's a large rewrite (of metal-rs crate).

kvark avatar Jun 15 '18 23:06 kvark

In the example you gave above the selectors are not being registered during init for some reason. I'll look into it further.

jrmuizel avatar Jun 15 '18 23:06 jrmuizel

I think it might be because of a missing __objc_imageinfo section

jrmuizel avatar Jun 15 '18 23:06 jrmuizel

Indeed adding:

                #[no_mangle]
                #[link_section="__DATA,__objc_imageinfo,regular,no_dead_strip"]
                static info_version: u32 = 0;
                #[no_mangle]
                #[link_section="__DATA,__objc_imageinfo,regular,no_dead_strip"]
                static info_flags: u32 = 64;

fixes the selector missing error.

jrmuizel avatar Jun 15 '18 23:06 jrmuizel

I think something is still broken in the example code that I have though.

jrmuizel avatar Jun 15 '18 23:06 jrmuizel

I confirm - the simple example works with this addition :tada: @mystor 's code is still failing to link though. One suggestion they had is that the trouble comes from the fact we have same-named selectors for different objects.

kvark avatar Jun 15 '18 23:06 kvark

For the record here's a working example:

#[macro_use]
extern crate objc;

use objc::Encode;
use objc::runtime::{Class, Object};

/// Wrapper around an `Object` pointer that will release it when dropped.
struct StrongPtr(*mut Object);

impl std::ops::Deref for StrongPtr {
    type Target = Object;

    fn deref(&self) -> &Object {
        unsafe { &*self.0 }
    }
}

impl Drop for StrongPtr {
    fn drop(&mut self) {
        let _: () = unsafe { msg_send![self.0, release] };
    }
}

fn main() {
    // Get a class
    let cls = Class::get("NSObject").unwrap();
    println!("NSObject size: {}", cls.instance_size());

    // Allocate an instance
    let obj = unsafe {
        let obj: *mut Object = msg_send![cls, alloc];
        let obj: *mut Object = msg_send![obj, init];
        StrongPtr(obj)
    };

    // Invoke a method on the object
    let hash: usize = unsafe {
        msg_send![obj, hash]
    };
    println!("NSObject hash: {:x}", hash);
    let hash: usize = {
        use std::mem;
        struct Foo(*const [u8; 5]);
        unsafe impl Send for Foo {}
        unsafe impl Sync for Foo {}
        #[no_mangle]
        #[link_section="__TEXT,__objc_methname,cstring_literals"]
        static OBJC_METH_VAR_NAME_ : [u8; 5] = *b"hash\0";
        #[no_mangle]
        #[link_section="__DATA,__objc_imageinfo,regular,no_dead_strip"]
        static info_version: u32 = 0;
        #[no_mangle]
        #[link_section="__DATA,__objc_imageinfo,regular,no_dead_strip"]
        static info_flags: u32 = 64;

        #[no_mangle]
        #[link_section="__DATA,__objc_selrefs,literal_pointers,no_dead_strip"]
        static OBJC_SELECTOR_REFERENCES_: Foo = Foo(&OBJC_METH_VAR_NAME_);
        unsafe {
            let selector: objc::runtime::Sel = mem::transmute(OBJC_SELECTOR_REFERENCES_.0);
            objc::__send_message(&*obj, selector, ()).unwrap()
        }
    };
    println!("NSObject hash: {:x}", hash);
}

What's the linking problem?

jrmuizel avatar Jun 16 '18 00:06 jrmuizel

Also, it's worth noting that the 32 bit ABI requires the use of different sections.

jrmuizel avatar Jun 16 '18 00:06 jrmuizel

          Undefined symbols for architecture x86_64:
            "metal_rs::obj_drop::do_it::REF::h9be260b02dc97181", referenced from:
                metal_rs::obj_drop::h03cfc73bd31ac039 in caps-f6dacc4f5a6f0398.1d04pb6mtwbq7wdv.rcgu.o
          ld: symbol(s) not found for architecture x86_64
          clang: error: linker command failed with exit code 1 (use -v to see invocation)

Can repro by:

git clone https://github.com/kvark/metal-rs -b testcase
cd metal-rs && cargo build

kvark avatar Jun 16 '18 06:06 kvark

git clone https://github.com/kvark/metal-rs -b testcase links without any problems for me with rust stable, beta and nightly. That being said should there not be some binary target specified for the linking to actually happen?

jrmuizel avatar Jun 16 '18 15:06 jrmuizel

cargo build --example library reproduces the link error.

jrmuizel avatar Jun 16 '18 15:06 jrmuizel

Isn't the objc branch missing the imageinfo sections from above? Not sure if #[no_mangle] needs to be specified too.

With those changes https://github.com/SSheldon/rust-objc/commit/d83687897e48233f36bf3ed5c22b300eda16d8c5 I get a different error message:

error: symbol `REF` is already defined
  --> src/commandqueue.rs:31:13
   |
31 |             msg_send![self, setLabel:nslabel]
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)

grovesNL avatar Jun 16 '18 15:06 grovesNL

The link error is caused by just the following change:

diff --git a/Cargo.toml b/Cargo.toml
index c420754..7af5116 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -17,22 +17,25 @@ default-target = "x86_64-apple-darwin"
 cocoa = "0.15"
 bitflags = "1"
 libc = "0.2"
 log = "0.4"
 objc-foundation = "0.1"
 objc_id = "0.1"
 block = "0.1.5"
 foreign-types = "0.3"
 
 [dependencies.objc]
-version = "0.2.1"
-features = ["objc_exception"]
+#version = "0.2.1"
+git = "https://github.com/mystor/rust-objc"
+branch = "static_sel"
+features = ["objc_exception", "static_sel"]
+#features = ["objc_exception"]
 
 [dev-dependencies]
 winit = "0.13"
 sema = "0.1.4"
 
 [[example]]
 name = "window"
 path = "examples/window/main.rs"
 
 [[example]]

jrmuizel avatar Jun 16 '18 16:06 jrmuizel

@jrmuizel The branch static_sel of https://github.com/mystor/rust-objc is where the static selectors have been added experimentally. These changes aren't in the regular objc crate, so it's expected that using the regular crate will succeed.

grovesNL avatar Jun 16 '18 16:06 grovesNL

What I meant is that the link errors are not caused by the other changes happening in https://github.com/kvark/metal-rs/commit/2021415a46420fdb25335b2d0a5cd514117fc5b5

jrmuizel avatar Jun 16 '18 16:06 jrmuizel

error: symbol `REF` is already defined is cause by using no_mangle. We shouldn't need to use no_mangle.

I've put a version that seems to sort of work at https://github.com/jrmuizel/rust-objc/tree/static_sel.

It can successfully run cargo run --features static_sel --example example

jrmuizel avatar Jun 16 '18 16:06 jrmuizel

I believe the linker errors in metal-rs are coming from msg_send! being used in a generic function.

jrmuizel avatar Jun 16 '18 16:06 jrmuizel

I remember that generic functions and statics don't mix that well, but I don't exactly recall why.

jrmuizel avatar Jun 16 '18 16:06 jrmuizel

I confirm - was able to build metal-rs successfully after cleaning up and rewriting generic use of msg_send in it.

On Jun 16, 2018, at 09:59, Jeff Muizelaar [email protected] wrote:

I remember that generic functions and statics don't mix that well, but I don't exactly recall why.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

kvark avatar Jun 16 '18 19:06 kvark

Although... there is still winit and other ecosystem pieces that cause us to fail to link :( So the problem is still on the table

On Jun 16, 2018, at 09:59, Jeff Muizelaar [email protected] wrote:

I remember that generic functions and statics don't mix that well, but I don't exactly recall why.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

kvark avatar Jun 16 '18 19:06 kvark

I haven't been able to create a reduced test case that reproduces the problem yet. Some simple tests with generics seem to succeed.

On Sat, Jun 16, 2018, 3:54 PM Dzmitry Malyshau [email protected] wrote:

Although... there is still winit and other ecosystem pieces that cause us to fail to link :( So the problem is still on the table

On Jun 16, 2018, at 09:59, Jeff Muizelaar [email protected] wrote:

I remember that generic functions and statics don't mix that well, but I don't exactly recall why.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

β€” You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/SSheldon/rust-objc/issues/49#issuecomment-397835496, or mute the thread https://github.com/notifications/unsubscribe-auth/AAUTbT7_XoIvXJehyO8omitnE38iGqHaks5t9WJ9gaJpZM4K8yZd .

jrmuizel avatar Jun 16 '18 22:06 jrmuizel

I screwed up when trying to make the test case originally. Here's a simple reproduction that fails to link when built with the static_sel feature.

#[macro_use]
extern crate objc;

use objc::runtime::Class;

unsafe fn obj_alloc<T>(_p: T) {
    let cls = Class::get("NSObject").unwrap();
    msg_send![cls, alloc];
}

fn main() {
    let i: u32 = 0;
    unsafe { obj_alloc(i); }
}

jrmuizel avatar Jun 17 '18 13:06 jrmuizel

And a more reduced test case:

unsafe fn obj_alloc<T>(_p: T) -> *const [u8; 6] {
        struct Foo(*const [u8; 6]);
        unsafe impl Send for Foo {}
        unsafe impl Sync for Foo {}
        #[link_section="__TEXT,__objc_methname,cstring_literals"]
        static OBJC_METH_VAR_NAME_ : [u8; 6] = *b"alloc\0";

        #[link_section="__DATA,__objc_selrefs,literal_pointers,no_dead_strip"]
        static OBJC_SELECTOR_REFERENCES_: Foo = Foo(&OBJC_METH_VAR_NAME_);
        return OBJC_SELECTOR_REFERENCES_.0;
}

fn main() {
    let i: u32 = 0;
    unsafe { obj_alloc(i); }
}

We'll have to look at the generated assembly to see what's causing the link error.

jrmuizel avatar Jun 17 '18 14:06 jrmuizel

It looks like the problem only happens with incremental compilation. Setting CARGO_INCREMENTAL=0 seems to make it go away.

jrmuizel avatar Jun 17 '18 14:06 jrmuizel

@jrmuizel unfortunately, disabling incremental compilation doesn't help in the full case (as opposed to a mini test case). When building the portability library as "CARGO_INCREMENTAL=0 make version-release" (with a crates-io patch written in Cargo.toml) I'm still getting the link errors.

kvark avatar Jun 19 '18 13:06 kvark

Can you post the link errors?

jrmuizel avatar Jun 19 '18 13:06 jrmuizel

https://pastebin.mozilla.org/9087959

kvark avatar Jun 19 '18 13:06 kvark

I've investigated this problem.

The link error is caused by literal_pointers,no_dead_strip. For sections with those flags, Apple's ld takes a special "literal pointers" path that doesn't do all the relocations necessary for the linking that incremental compilation does to work here.

Even with that solved, though, dyld ignores the section, at least on El Capitan. On Rust binaries, it does not set the flag that indicates Objective-C is present, because the __DATA,__objc_imageinfo section is missing. This needs to be present, with 8 bytes of zeroes (version 0, no flags).

And even with that solved, rustc has no ability to combine the __objc_methname sections of downstream crates with upstream rlibs. This problem breaks all apps using cocoa-rs, because the Objective-C sections in cocoa-rs will end up missing from the applications using that crate, and therefore nearly all method calls issued by cocoa-rs will fail.

Long story short, I think that this logic really belongs in rustc. I feel the best we can do in the meantime is lazy_static!. That really isn't that badβ€”in fact, it has some advantages, because it's really just moving work that dyld does eagerly at app startup into a series of lazy on-demand operations.

pcwalton avatar Jun 19 '18 20:06 pcwalton

Thanks for your investigation @pcwalton ! It's sad to fall back to the "rewrite all metal-rs" path after feeling the win being so close now... Hopefully, rustc will improve one day to ease up the pain for future generations. Speaking of which, could you please file that bug/request upstream to rustc?

kvark avatar Jun 19 '18 20:06 kvark

@kvark Why isn't lazy_static good enough for your needs?

pcwalton avatar Jun 19 '18 20:06 pcwalton

@pcwalton it's OK as an end goal. My concern is that we'll need to rewrite all metal-rs bindings, where the currently proposed (but not functional) solution developed by @mystor and @jrmuizel implies no changes to metal-rs.

kvark avatar Jun 19 '18 20:06 kvark

Oh, I'm not sure we need downstream crates to use lazy_static! explicitly. I think rust-objc could use it internally. I'm experimenting with a PR now…

pcwalton avatar Jun 19 '18 20:06 pcwalton

FYI, sel_registerName takes a lock internally. The internal dyld codepath that registers selectors on startup batches the lock (see https://opensource.apple.com/source/objc4/objc4-723/runtime/objc-runtime-new.mm.auto.html, search for // Fix up @selector references). You might see notably better performance from batching the locking (though I understand from the above discussion that there are difficulties with that).

swolchok avatar Jun 20 '18 16:06 swolchok

even with that solved, rustc has no ability to combine the __objc_methname sections of downstream crates with upstream rlibs.

Would it work if you just threw the selector names in the ordinary C string section? I really don't think there's anything particularly special about __objc_methname. Would love to see that segfault. You could also double-check whether there's anything special about __objc_methname by building an ordinary Objective-C program and passing -rename_section __TEXT __objc_methname __TEXT __some_other_section to ld64.

swolchok avatar Jun 20 '18 16:06 swolchok

I really don't think there's anything particularly special about __objc_methname.

It seems that dyld's source does reference it in dyld3/shared-cache/OptimizerObjC.cpp (dyld source download), but it seems it's only to decide the order to load libraries in.

vincentisambart avatar Jun 20 '18 23:06 vincentisambart

Oh yeah, dyld3, and then who knows what Mojave/iOS 12 is going to bring? __objc_methname would be best, for sure.

However, I suspect that anything would be better than taking the lock in sel_registerName once per selector, at least for applications with a lot of selectors. With a small number of selectors, who cares? :)

swolchok avatar Jun 20 '18 23:06 swolchok

(I'm fairly sure that that dyld3 code path is not live for things other than Apple libraries, at least on iOS 11 and below.)

swolchok avatar Jun 20 '18 23:06 swolchok

FWIW, I've created a small project called objrs that has a collection of macros that transforms Rust code so that it compiles down to effectively the same assembly as pure Objective-C. It never calls sel_registerName or other dynamic runtime functions. It's nightly only (for now), but it might serve as some inspiration for how to make this crate more efficient.

mjbshaw avatar Aug 27 '18 04:08 mjbshaw

@mjbshaw would you be open to integrating your ideas into this crate directly? There are already quite a few crates built on top of rust-objc, so it would be unfortunate for existing/new bindings to have choose between rust-objc and objrs. Many of the existing crates would like the performance benefits though.

grovesNL avatar Aug 27 '18 05:08 grovesNL

@grovesNL objrs is also a hobby project for me to reenvision Rust+Objective-C from the ground up. I plan to keep working on it as a place to experiment batcrap crazy ideas. I'm open to integrating the ideas into this crate, but I'm not sure they're mature enough to do so yet. The objrs crate's license options are compatible with rust-objc, so people are free to incorporate all/some of the code into rust-objc. I'd be willing to provide some assistance, too.

For example, people here are mentioning that using link_section allows you to skip calling sel_registerName. That's true, but is insufficient for proper linker integration. Proper linker integration requires using special export_names too (particularly using a local symbol, which allows selector deduplicating) (and volatile pointer loads). Unfortunately, these special prefixes (which are local/private symbol names) can't cross object file boundaries, which means Rust's incremental compilation (which works by splitting an individual .rs file into many separate object files) can break the build (linking will fail if one object file attempts to refer to a selector in another object file). I've worked around that in objrs, but it's a semi-hacky solution. Ultimately I'd like to patch the Rust compiler so its incremental compilation doesn't try to divorce a static symbol from a function that refers to it.

This is just one of many thorny aspects I've had to work around. My concern is primarily the stability of the workarounds. I think the best way to integrate the parts of my objrs crate into rust-objc would be to first put together some Rust patches/RFCs to lay the necessary groundwork. I've put off doing that though because I've found core Rust development to be extremely frustrating...

mjbshaw avatar Aug 28 '18 03:08 mjbshaw

@mjbshaw That sounds great! Thanks for the explanation. I'll try to follow objrs and hopefully we can try to integrate some of the changes here as the stability improves.

grovesNL avatar Aug 29 '18 05:08 grovesNL

@mjbshaw what progress have you made since your comment? I'm looking to have static selectors for my project (issue: https://github.com/nvzqz/fruity/issues/2).

nvzqz avatar Dec 14 '20 01:12 nvzqz

@nvzqz The objrs crate is still immature and I haven't had much time to work on it lately. Creating the static selector is pretty stable and easy, though. I suppose the selector! macro could be broken out to its own (stable) crate, and that way projects like yours (and rust-objc, if they're interested) can use it if they want. If there's interest for that, I can get it done this week.

mjbshaw avatar Dec 15 '20 15:12 mjbshaw

@mjbshaw I'm curious if you run into https://github.com/rust-lang/rust/issues/80019 and if so, how you get past it.

nvzqz avatar Dec 15 '20 15:12 nvzqz

Yes, I'm able to get past that issue. The problem is that symbol names are significant for selectors and their references, and you aren't using the special symbol names required. The Selectors must use the L_OBJC_METH_VAR_NAME_ prefix for its symbol name, and the selector reference must use the L_OBJC_SELECTOR_REFERENCES_ prefix for its symbol name. Symbols with these prefixes are given special treatment by the linker.

mjbshaw avatar Dec 15 '20 15:12 mjbshaw

@mjbshaw I didn't know that was the secret sauce to make it work. I just tested making a selector with those symbols and having an objc image info static. I'm pleased to say that I got it working in my project! Thanks so much for the pointer. I'll give you credit by adding you as a co-author to the commit. πŸ˜„

nvzqz avatar Dec 15 '20 17:12 nvzqz

Any updates on this? I'm running into selector problems with https://github.com/gfx-rs/metal-rs/pull/207, and I think having compile-time selectors would really help.

expenses avatar Apr 07 '21 12:04 expenses

I've been spending a lot of time the last few weeks thinking about and fiddling with this issue. I'd like to talk about what I've gotten done:

My progress

I've been able to reproduce the optimizations described previously in this thread. I've written a library that produces the linker instructions that create compile-time selector references and class references.

I've also done a lot of testing with my implementation:

  • I replaced all the msg_send! calls in cocoa-foundation with equivalent calls to my own msg! macro. It successfully builds in both debug and release modes.
  • I then did the same with cocoa. It successfully builds in both debug and release modes.
  • Modifying winit to depend on my modified cocoa-foundation and cocoa crates, winit also successfully builds in debug and release modes.
  • cargo test succeeds in my modified winit.
  • Every example in my modified winit that I tried worked with no obvious issues or errors in both debug and release modes.
  • My own test that included both a generic and non-generic function that both call the same selector on an object successfully builds and runs in both debug and release modes.
  • My own test that ports some old Objective-C code I found on the internet to make a window that says "Hello World" successfully builds and runs in both debug and release modes.
  • In a very quick and dirty look over the generated assembly from msg_send! vs my implementation, it appears that less instructions are generated for each message call in my implementation than in msg_send!β€”in both debug and release modes. (Though there is of course more static data which means the binaries usually end up larger anyways.)

I also hacked together a implementation within a local version of rust-objc that implements this process for class! and sel!:

  • cocoa-foundation and cocoa both successfully built in debug and release modes.
  • winit also successfully built in both debug and release modes and ran correctly after a one-line fix.
  • I modified metal-rs to use this implementation and the examples ran without any noticeable differences.

Hopefully that list of real-world crates like cocoa-foundation, cocoa, winit, and metal-rs working in my testing covers enough ground to say that my implementation works πŸŽ‰ (at least in some sense of the word). I highly doubt it is perfect and I expect there still to be holes (e.g. I haven't tried building for arm at all yet), but I'm satisfied with where I've gotten so far and I think that this exploration shows promise.

If it's still on the table, I'd love to start tinkering with getting this functionality into rust-objc (behind a feature flag) for everything that already depends on it! However, I have a few concerns about how the implementation might affect this project:

  1. A certain bug/quirk with how the Rust compiler works currently requires a workaround. This workaround seemed to be reliable enough during my testing, but there's a chance that a future compiler change could break it too. Is this crate is the right place to introduce that sort of possible instability?

  2. The implementation requires a proc macro to generate unique tokens for each class/selector reference. That currently means publishing a separate crate to house those proc macros. Would adding a helper crate for proc macros be a welcome change?

I don't think either of those are obvious deal-breakers, but I figure I should voice them now.

clavin avatar Apr 24 '21 08:04 clavin

A other small concern I have with using the special sections clang uses is how it impacts app startup time.

When used in clang, these sections are only included if the selector is used in the final app code or its libraries. When used with Rust, most apps will probably use crates like cocoa-foundation, that references a lot of selectors, most likely not used by the app. And last time I looked, these sections stay even if the selector is not used (they are no_dead_strip after all). That means that most Rust apps would likely contain many more unused selectors than a standard Objective-C (or Swift) app. As they are resolved at start time, that might impact startup time. Probably not by much but probably still worth checking.

vincentisambart avatar Apr 26 '21 00:04 vincentisambart

I personally am quite curious how you managed to merge the selectors from various rlib into a single section. It’s not obvious to me how that can be achieved even in a proc macro, would be really interesting to study it.

Startup time certainly could be an issue, but IMO could probably be resolved through feature flags at the class or framework level. I dunno if that’s acceptable for objc but I have support for it in another project and it performs well enough for my use.

drewcrawford avatar Apr 27 '21 03:04 drewcrawford

@vincentisambart Great thought! I didn't consider that yet, so here's what I found out looking into it:

First, I wrote up a super contrived example of what you were describing:

Contrived test case
unsafe fn not_called(obj: *mut objc::runtime::Object) {
    msg_void![obj, not_called];
}

unsafe fn can_be_optimized_out(obj: *mut objc::runtime::Object) {
    msg_void![obj, optimized_out];
}

fn main() {
    unsafe {
        let obj: *mut objc::runtime::Object = msg![class!(NSObject), new];

        if 1 != 1 {
            can_be_optimized_out(obj);
        }

        println!("{:016x}", obj as usize);
    }
}

There are two cases to note here: one where the function is never called at all, then another that is called but can be predictably optimized out.

In a debug build, all three selectors are present in the binary:

Disassembly of section __TEXT,__objc_methname:

10003b769: 6e 6f 74 5f 63 61 6c 6c 65 64 00              # "not_called\0"
10003b774: 6f 70 74 69 6d 69 7a 65 64 5f 6f 75 74 00     # "optimized_out\0"
10003b782: 6e 65 77 00                                   # "new\0"

That doesn't look very good. ❌

  • I find it very interesting that the not_called selector shows up here because the not_called function itself is completely absent from the binary output. I'll circle back on this later.

  • The optimized_out selector also shows up in the debug binary, however the can_be_optimized_out function was actually optimized out. That means that the optimized_out selector should have also been optimized out but wasn't. Again, I'll get back to this.

For comparison, a release build looks more like what you would expect:

Disassembly of section __TEXT,__objc_methname:

10003acb9: 6e 65 77 00     # "new\0"

The dead selectors are stripped from the binary in the release build. βœ…


To summarize the above section: I found that unused selectors were showing up in debug builds πŸ‘Ž, but were properly stripped out in release builds πŸ‘. That's a little bit of good news but still concerning.

Alright, so back to the debug build and its πŸ‘» ghost selectors. Looking at the assembly output from rustc, I did find some references to those selectors:

Assembly snippet
	.section	__TEXT,__objc_methname,cstring_literals
	.globl	L_OBJC_METH_VAR_NAME_not_called
L_OBJC_METH_VAR_NAME_not_called:
	.asciz	"not_called"

	.section	__DATA,__objc_selrefs,literal_pointers,no_dead_strip
	.globl	L_OBJC_SELECTOR_REFERENCES_not_called
	.p2align	3
L_OBJC_SELECTOR_REFERENCES_not_called:
	.quad	L_OBJC_METH_VAR_NAME_not_called

# ...

	.section	__DWARF,__debug_abbrev,regular,debug
    # ...
Ldebug_info_start0:
    # ...
	.quad	L_OBJC_SELECTOR_REFERENCES_not_called
    # ...

Rust's generated debug info includes the selector references, therefore causing those otherwise-unused selectors to show up in debug builds. This also explains why those selectors would disappear in release builds. I checked the debug info section for regular Objective-C code (with clang) and these symbols do not show up in the debug info sections.

A follow-up test also shows that Rust generates debug data for all statics in debug builds, even those that are never used1. However, even though this debug info is generated and shows up in the assembly output, the unused statics don't actually show up in the final binary2.

Not really sure what to do about this, but it does seem like this debug data could be a minor issue for debug builds.

Notes
  1. I literally inserted static SOMETHING: &str = "something staticky"; into the top scope and the string data as well as the static's symbol showed up in the assembly and debug data. I did not reference the static or do anything else.

  2. For example, if you had an unused 16MB static then your binary wouldn't be 16MB bigger because of it.

clavin avatar Apr 27 '21 05:04 clavin

@clavin can you put your implementation up some place so we can take a look at it?

jrmuizel avatar Apr 30 '21 15:04 jrmuizel

@jrmuizel Sure! Here's my quick hack (read: very dirty impl) in rust-objc: https://github.com/clavin/rust-objc/tree/static-refs. The interesting code is in proc_macros/src/lib.rs.

You can test it out on real-world projects that rely on the objc crate (e.g. winit or metal-rs) by changing out the dependency in Cargo.toml and enabling the static_references feature:

objc = { git = "https://github.com/clavin/rust-objc.git", branch = "static-refs", features = ["static_references"] }
# NOTE: the "static_references" feature must be enabled!

Note: you might have to also repeat this for upstream crates of whatever you're testing on (e.g. cocoa-foundation, cocoa, etc.) for it to work in those crates as well. You might also be able to use the [patch] section (if that supports enabling the static_references feature as well) but I haven't tried this since I usually had to also make small upstream patches as described below.

⚠️ Issues to beware of when testing with this prototype

When I was trying this prototype on other crates these were the issues I had to manually fix:

  • Any changes that have landed in the rust-objc repo but not yet published to crates.io, e.g. Encode
  • If the linker can't find a class symbol (OBJC_CLASS_$_MyClassName) then make sure the right framework is being linked in (with #[link(name = "something", kind = "framework"] extern {} or a build script) and that the class you're trying to use actually exists/is linkable (ran into this latter case once when testing metal-rs).
  • Static class references aren't exactly the same as the current implementation, namely that they aren't registered by the runtime (ish). This is fine for sending messages, but not fine if you want to do something like dynamically create a class that extends NSWindow (like winit does). In this case, simply send a + [class] message to get a registered class reference: msg_send![class!(NSWindow), class] instead of just class!(NSWindow).
    • I believe this more closely resembles real Objective-C code where you can't get an optimized static class reference without something like a + [class] message.
    • I know that I could make the class! macro return a different type (e.g. &'static objc::UnregisteredClass) to avoid this issue, I just don't want to get ahead of myself on this prototype.

If you run into an issue that isn't covered above, either I forgot to include it here or I haven't run into it. 🀞 I'm hoping this prototype isn't just a "it works on my machine" situation.


Just to add: I'm still exploring other methods of implementing this functionality, like trying to see if I can get Clang to do the magic instead of emulating it, or finding something better than proc macros for providing this functionality. This prototype is just the simplest implementation I have figured out so far (or might be the only way to do it currently, idk yet 🀷).

clavin avatar Apr 30 '21 22:04 clavin

So I can't comment on what you really want to know, which is "is this acceptable for the crate". However, I have been under-the-table working on my own objc crate (partly due to this issue). So I can vouch for the fact that it's a problem that may justify significant changes, and I have reasonably informed opinions on what a good implementation looks like.

This looks like a good implementation to me.

I'm still exploring other methods of implementing this functionality, like trying to see if I can get Clang to do the magic

I believe you are looking for this. However it's not immediately clear to me how to use this from Rust (does Rust even use clang? Or is it just llvm?)

On the other hand there appear to annotations for this in the LLVM IR so I might assume that if you could get something injected into the IR it would work. But it seems IR control is not supported in rustc.

From what I can see procmacros are the best solution for this at present.


One thing I will nitpick though is the hash implementation. If you disassemble a real objc executable, you will see one entry per selector, even if dozens of libs all use init, they don't need a unique symbol. However rustc rejects the idea of duplicate symbols across a link so we have some difficult choices for how to handle this. The solution used here is to hash the point-of-use to give it a unique symbol, however a) this increases codesize due to multiple symbols and b) this increases codesize due to the size of the hash, c) I'm a little uncertain about the stability of this approach, in spite of the fact that it doesn't use nightly APIs.

For b, I think it would be ok to go down from 16 to 10-11 hex characters, based on a quick birthday attack. But am I right in thinking we have the full 0-255 range here, in which case 5-6 full bytes should be sufficient?

However this still leaves a. The solution I have been toying with is having the user supply a prefix for the symbol rather than rely on a hash. As a result, crates that are interested in coordinating how to share selectors among each other can pick the same prefix, and get tighter code, whereas crates not wanting to bother with all that can use a prefix like crate-name and get unique symbols. I think this might be a better tradeoff between performance and flexibility, although it does require a bit more work on the user.


Overall though I like this patch, I'm glad there's momentum building behind getting this done.

drewcrawford avatar May 02 '21 21:05 drewcrawford

I don't think you should be trying to get Clang / LLVM to do the work - the fact that rustc is based on that is supposed to be an implementation detail, and reimplementations like mrustc should be able to coexist.

But overall, really nice work, if you throw a PR at some point when you get further, I'd be happy to review it!

madsmtm avatar May 26 '21 21:05 madsmtm

I've updated my version of this using some of the tricks https://github.com/clavin/rust-objc/tree/static-refs. It seems to work pretty well and doesn't seem to need any of the ident hashing. I also dropped the no_dead_strip section attribute from the __objc_selrefs section which helps avoid including unused selectors when linking.

The biggest remaining disadvantage that I can see is that the do_it helper function will not get inlined to generic functions used across crate boundaries unless LTO is turned on.

jrmuizel avatar Sep 30 '21 01:09 jrmuizel

I've included this in my fork objc2, see https://github.com/madsmtm/objc2/pull/104 and code objc2/src/macros.rs#L33-L329. I've tried to include git attribution where possible, really couldn't have done it without all of the amazing work in here(!), let me know if you're unhappy with any part of that / would like to be attributed differently.

Notable differences from @clavin's version:

  • Renamed feature flag to "unstable-static-sel" to signify the (in)stability of this. Will probably stay like this for a while, but downstream consumers would be able to try it out and see if it improves their performance!
  • Uses declarative macros for everything but the ident hashing (creating the selector name is possible with const now)!
  • Uses UnsafeCell::get instead of ptr::read_volatile, since that allows the compiler to strip more forms of unused access.
  • Added "unstable-static-sel-inlined" feature flag to allow bypassing the #[inline(never)] workaround (works if you're using LTO, or cranelift, and perhaps other codegen backends?).
  • Works on simulator targets (which needs proper image info) and x86 macOS (uses different link sections).
  • ~Doesn't include static classes (yet).~ Done since https://github.com/madsmtm/objc2/pull/185.

Haven't looked much at dead stripping of the final binaries, but I suspect it's the same as what @clavin ended up with (debug bad, release good). I included no_dead_strip on __objc_selrefs because some LLVM sources seemed to expect this, though I may be wrong here. Could be interesting to open a PR to change this in clang, and see how that goes!

madsmtm avatar Jun 25 '22 21:06 madsmtm

My three cents on some of the still open questions here.

Stability of ident hashing: This is also done in the defmt crate, so at least it's not unheard of, although it would be nice if spans implemented Hash directly to explicitly allow this. I've opened an internals thread to discuss getting something stable into Rust.

Stability of #[inline(never)] workaround: It is currently documented in the dev guide that non-generic functions and statics end up in the same codegen unit, but that's in no way normative and may stop working at any time.

Optimizing hashing: Maybe it would be possible to have an atomically increasing counter in the proc macro somehow? That would for sure give the smallest names! Though maybe has problems with no longer producing reproducible binaries?

Allowing selector sharing: Wouldn't that be possible by having a crate e.g. objc-selectors that just uses sel! for a bunch of different selectors, each use in a different function - then a custom msg_send! could use selectors from those functions instead? I don't think this needs special support from objc?

madsmtm avatar Jun 25 '22 21:06 madsmtm