ruby-vips OS X 10.13.3 exposes bad fork() issue

After updating to OS X 10.13.3, I received the following error trying to require 'vips':

objc[25602]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[25602]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

Some googling took me to this post, which explains that the error comes from “incorrect code that happened to work most of the time in the past.” I have no way of knowing whether the issue is in ruby-vips or ffior libvips itself, and I don't really have the time or knowledge to track it down. Using one of that page's suggested workarounds – setting OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES in the environment – allowed me to load ruby-vips without issue and get on with my work, though I realize I'm just asking OS X to hang later if it encounters a deadlock instead of crashing the process now.

I'm reporting it here so there's both an issue to track and a concise explanation and workaround for anyone else who happens to google this error in a vips context.

Feb 21 '18 17:02 mbklein

Thank you for reporting this!

I'll see if I can reproduce the problem.

Feb 21 '18 17:02 jcupitt

Specifics:

OS X High Sierra 10.13.3 (17D102)
Homebrew 1.5.4-73-g1811c77
Homebrew/homebrew-core (git revision 7679; last commit 2018-02-21)
libvips/vips 8.6.2-Wed Jan 31 20:39:14 UTC 2018 (installed via homebrew)
rbenv 1.1.1 (installed via homebrew)
ruby-build 20171226 (installed via homebrew)

I've recreated the behavior under the following rubies (both installed via ruby-build):

ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
ruby 2.3.5p376 (2017-09-14 revision 59905) [x86_64-darwin17]

If there's any other information that might help, please let me know.

Feb 21 '18 18:02 mbklein

I tried here and it seems to work for me. Perhaps some other setting is triggering this? I see:

john@katamata ~ $ vips --version
vips-8.6.2-Wed Jan 31 20:39:14 UTC 2018
john@katamata ~ $ irb --version
irb 0.9.6(09/06/30)
john@katamata ~ $ ruby --version
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
john@katamata ~ $ irb
irb(main):001:0> require 'vips'
=> true
irb(main):002:0> x = Vips::Image.black 10, 10
=> #<Image 10x10 uchar, 1 bands, b-w>
irb(main):003:0>

That's with brew update then brew upgrade.

Feb 21 '18 23:02 jcupitt

Also with 13.3, of course:

john@katamata ~ $ uname -a
Darwin katamata.local 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64

Feb 21 '18 23:02 jcupitt

OK. I'll keep digging around and see what else it might be. Thanks for looking into it!

Feb 22 '18 02:02 mbklein

After a full day of investigations, rebuilds, reboots, and tests, none of which fixed the issue, it seems to have gone away on its own. Sorry for the wild goose chase.

Feb 22 '18 15:02 mbklein

How frustrating! I'm glad you resolved it anyway.

Feb 22 '18 16:02 jcupitt

It looks like some people are still being bitten by this, eg.: https://github.com/libvips/ruby-vips/issues/314#issuecomment-951072441

I've not been able to reproduce this on my mac, so I can't really help. If anyone can make a simple test case that shows the problem, I'll try to investigate.

Oct 26 '21 08:10 jcupitt

Hi everyone,

I encounter some problem when I use ruby-vips with puma multi-workers (no problem with non-forked puma)

Those lines appears in my console when I load a page from a Rails controller/view and a concurrent variant processing (active storage default setup with vips)

objc[34825]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[34825]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

Use of env var OBJC_DISABLE_INITIALIZE_FORK_SAFETY=yes fix it

My stack :

Mac OS : 11.6
xcode-select version : 2384
rbenv : 1.2.0
ruby-build : 20211019
libvips : vips-8.11.4

Oct 26 '21 08:10 kevynlebouille

My only insight her is that +[__NSCFConstantString initialize] is an Objective-C-ism. It looks like the objc runtime was half-way through building a string when the process was forked, leaving the child process in an inconsistent state.

libvips does not use any Objective C directly, and does not fork, so I'm not certain this error is down to libvips. Perhaps one of the (many) libvips dependencies (perhaps fontconfig?) is bringing in the objc runtime indirectly?

The fork is presumably puma starting up the worker pool. It'll be running require "vips", then forking some time after that. when the require runs, it'll start up libvips. That should all be synchronous, ie. when the require returns, libvips ought to be inited. But perhaps there is some background processing in some dependency afterwards that then gets interrupted by the fork?

If that's the case, sleeping for a second or so after require "vips" might allow libvips startup to complete.

But this all sounds unlikely, and it's only a guess.

Oct 26 '21 08:10 jcupitt

I found this interesting article on the error:

https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/

tldr: It sounds like you might be trying to do some processing with ruby-vips before the puma fork. The fix is ... don't!

libvips won't make any threads on require. You need to not make any ruby-vips calls until the worker process has started.

Oct 26 '21 08:10 jcupitt

I have seen this error on MacOS when creating text images/labels to be applied to other images. I sidestepped it by creating a docker image to test the annotations, knowing that the production system is not a mac.

I can probably create a test case...

I read the article @jcupitt referred to at the time. Without delving too deeply into Puma/forks I don't think it was the timing of the library being loaded. The image annotation is one of the final things that might happen to an image.

This is a typical dragonfly job object:

[
  ["f","filename.JPG"],
  ["p","thumb", "200x200"],
  ["p","caption", {
       "copyright": "© example.com.au 2021",
       "photo_id":"Photo: 7591",
       "layout":"vertical", 
       "position":"bottom left"}
   ]
]

which sets up a sequence of actions:

get the file called "filename.JPG"
process the result with the processor called thumb
process the result with the processor called caption (and its options)

It's during step 3 that the error occurs. libvips has already been in use for step 2.

Caption itself:

calls :Vips::Image.new_from_file (output of step 2, above)
creates a text image from caption options .... 💥 +[UIFontDescriptor initialize] 💥
abandons the process..

Caveat: that is my mental model of how it works - but I've also followed it and debugged parts of caption and thumb

Nov 03 '21 02:11 anitagraham

I'm seeing this in a Rails 6.1 application that's using image_processing (0.12.1) and ruby-vips (2.1.4) with puma (5.6.2) . I've attached LLDB to the puma child worker process and some lightweight debugging shows...

Where is the crash coming from?

It looks like when simply loading libvips, before we even have a chance to call vips_init. Perhaps the ObjC runtime is activated as part of the dynamic loading process?

C

* frame #0: 0x00000001b5c97eb8 libsystem_kernel.dylib`__abort_with_payload + 8
    frame #1: 0x00000001b5c9a864 libsystem_kernel.dylib`abort_with_payload_wrapper_internal + 104
    frame #2: 0x00000001b5c9a7fc libsystem_kernel.dylib`abort_with_reason + 32
    frame #3: 0x00000001b5b59d5c libobjc.A.dylib`_objc_fatalv(unsigned long long, unsigned long long, char const*, char*) + 120
    frame #4: 0x00000001b5b59ce4 libobjc.A.dylib`_objc_fatal(char const*, ...) + 44
    frame #5: 0x00000001b5b4e42c libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 400
    frame #6: 0x00000001b5b355dc libobjc.A.dylib`initializeNonMetaClass + 672
    frame #7: 0x00000001b5b34f80 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 156
    frame #8: 0x00000001b5b34d3c libobjc.A.dylib`lookUpImpOrForward + 1016
    frame #9: 0x00000001b5b34664 libobjc.A.dylib`_objc_msgSend_uncached + 68
    frame #10: 0x00000001b6be6a68 Foundation`NSClassFromString + 64
    frame #11: 0x00000001b888a7f8 AppKit`+[NSColor(NSUIKitSupport) load] + 44
    frame #12: 0x00000001b5b37508 libobjc.A.dylib`load_images + 1088
    frame #13: 0x00000001009839a8 dyld`dyld4::RuntimeState::notifyObjCInit(dyld4::Loader const*) + 164
    frame #14: 0x0000000100989cf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 196
    frame #15: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
    frame #16: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
    frame #17: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
    frame #18: 0x0000000100989dcc dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
    frame #19: 0x0000000100999734 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 512
    frame #20: 0x000000010423a564 ffi_c.bundle`library_initialize + 220
    frame #21: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232
    frame #22: 0x00000001010b5844 libruby.3.0.dylib`vm_sendish + 1236
    frame #23: 0x000000010109cdd0 libruby.3.0.dylib`vm_exec_core + 11996
    frame #24: 0x00000001010b137c libruby.3.0.dylib`rb_vm_exec + 2764
    ... snipped ...

Ruby

... snipped ...
from ~/Projects/the-app/app/models/avatar.rb:4:in `<main>'
from ~/.rvm/gems/ruby-3.0.2/gems/zeitwerk-2.5.4/lib/zeitwerk/kernel.rb:35:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/ruby-vips-2.1.4/lib/vips.rb:572:in `<main>'
from ~/.rvm/gems/ruby-3.0.2/gems/ruby-vips-2.1.4/lib/vips.rb:579:in `<module:Vips>'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:99:in `ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:99:in `map'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:109:in `block in ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:109:in `each'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:112:in `block (2 levels) in ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:112:in `open'

Where is `fork` called?

It appears to be a part of opening a pipe in the rb-fsevent (0.11.0). I think, but am not sure, that this is related to watching for file modifications during development.

Possible experiment: try eager loading to see if that is a workaround.

C

* thread #23, name = 'listen-worker_thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001b5b93f74 libsystem_c.dylib`fork
    frame #1: 0x0000000100fda7ec libruby.3.0.dylib`fork_check_err + 664
    frame #2: 0x0000000100f4c180 libruby.3.0.dylib`pipe_open + 480
    frame #3: 0x0000000100f4bf20 libruby.3.0.dylib`rb_io_popen + 412
    frame #4: 0x0000000100f509bc libruby.3.0.dylib`rb_io_s_popen + 176
    frame #5: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232
    frame #6: 0x00000001010b5844 libruby.3.0.dylib`vm_sendish + 1236
    frame #7: 0x000000010109cdd0 libruby.3.0.dylib`vm_exec_core + 11996
    frame #8: 0x00000001010b137c libruby.3.0.dylib`rb_vm_exec + 2764
    frame #9: 0x00000001010af0f0 libruby.3.0.dylib`rb_vm_invoke_proc + 924
    frame #10: 0x000000010106dcb8 libruby.3.0.dylib`thread_do_start_proc + 732
    frame #11: 0x000000010106d570 libruby.3.0.dylib`thread_start_func_2 + 1180
    frame #12: 0x000000010106cf68 libruby.3.0.dylib`thread_start_func_1 + 272
    frame #13: 0x00000001b5ca9240 libsystem_pthread.dylib`_pthread_start + 148

Ruby

from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/thread.rb:18:in `block in new'
from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/thread.rb:26:in `rescue_and_log'
from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/adapter/darwin.rb:49:in `block in _run'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:39:in `run'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:140:in `open_pipe'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:140:in `popen'

Feb 21 '22 18:02 shepmaster

Hi @shepmaster ,

That C stacktrace is interesting:

abort()
...
    frame #17: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
    frame #18: 0x0000000100989dcc dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
    frame #19: 0x0000000100999734 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 512
    frame #20: 0x000000010423a564 ffi_c.bundle`library_initialize + 220
    frame #21: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232

So the dlopen() is triggering the initializers in the library being opened, and they in turn are (indirectly) starting up the NS classes.

I think I've found the line of code:

https://github.com/ffi/ffi/blob/master/ext/ffi_c/DynamicLibrary.c#L119

I would guess that's being called from here:

https://github.com/libvips/ruby-vips/blob/master/lib/vips.rb#L45

There are three lines like that in vips.rb, starting up libglib, libgobject and libvips. It'll probably be one of them.

Maybe it's possible to add some puts to try to see which of the three it is?

We can't do much about the first two, but if it's libvips, it's probably one of the optional packages that libvips depends on, and we could maybe remove it on macos.

By far the largest optional dependency for libvips is imagemagick. I'd be tempted to try building libvips without that, it could be the cause.

Feb 21 '22 19:02 jcupitt

So the dlopen() is triggering the initializers in the library being opened, and they in turn are (indirectly) starting up the NS classes.

Yes, I agree.

Maybe it's possible to add some puts to try to see which of the three it is?

Via printf-debugging, it appears to be:

https://github.com/libvips/ruby-vips/blob/3206143db61ac63f4689efc5a7cad83d92d4c93a/lib/vips.rb?rgh-link-date=2022-02-21T19%3A31%3A38Z#L573

This makes some sense — as you mentioned:

Perhaps one of the (many) libvips dependencies [...] is bringing in the objc runtime indirectly?

one of the optional packages that libvips depends on

Since the stack trace mentions AppKit`+[NSColor(NSUIKitSupport) load], I tried to find all the recursive dependencies of libvips that lead to that. My script is super hacky, but does point to libgio-2.0.0.dylib as requiring AppKit.

% otool -L /opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib
/opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib:
	/opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib (compatibility version 7001.0.0, current version 7001.4.0)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1856.105.0)
	/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1856.105.0)
	/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit (compatibility version 45.0.0, current version 2113.20.111)

libgio is used in many many places, as I'm sure you know, but a few:

libgdk_pixbuf
libopenslide
libpangocairo
librsvg

Hacky Ruby Dependency Script

$visited = {}

def step(fname)
  output = `otool -L #{fname}`
  libraries = output.lines.map { |line| line.chomp.split(' ').first }.reject { |lib| lib.include? ':' }.sort.uniq

  libraries.each do |l|
    unless $visited.key? l
      $visited[l] = nil
      $visited[l] = step(l)
    end
  end
end


def findem(lib)
  $visited.each do |k, v|
    if v.any? { |v| v.include? lib }
      puts k
      unless k == lib
        findem(k)
      end
    end
  end
end

step(ARGV[0])
findem('AppKit')

Feb 21 '22 21:02 shepmaster

Ahh good detective work. I found this:

https://gitlab.gnome.org/GNOME/glib/-/blob/main/gio/meson.build#L384-392

So it seems that GIO pulls in appkit so it can use the macos settings framework, and I expect some other things too.

I'm not sure what the best solution is. At the moment, ruby-vips does some basic start up on require. I suppose this needs to be delayed until first use.

Maybe the best thing to do would be a very, very quick and dirty hack to delay startup, just to see if this does indeed fix the problem. Any volunteers?

Feb 21 '22 22:02 jcupitt

I have a "fix" for our application: directly adding ruby-vips to our Gemfile.

My working theory is:

Puma starts the coordinator thread
Puma starts the worker threads
Each worker thread requires everything in the Gemfile (we are not using application preloading)
Rails -> Zeitwerk -> listen -> rb-fsevent opens a pipe which invokes fork
A request that uses image_processing comes in

When we started out, we only added image_processing to our Gemfile. This will lazily load ruby-vips later on, when it's needed. However, that will occur after the fork and trigger the problematic ObjC initialization via the dlopen.

By explicitly adding ruby-vips to our Gemfile, it's loaded before the fork, so the check in the ObjC code won't be triggered.

This may be good enough for my team to move forward for now, but it certainly feels brittle. We have to hope that the forked process on the other end of the pipe never attempts to use libvips (unlikely in this case, thankfully).

At the moment, ruby-vips does some basic start up on require. I suppose this needs to be delayed until first use.

Amusingly in my case, that start up needs to be executed before the first use, before the fork for the pipe, but after the fork for the Puma worker process.

Feb 22 '22 02:02 shepmaster

Oh what a mess. So perhaps image_processing should eager-load ruby-vips on macos? Though (as you say) this sounds like it could break again easily.

@janko can I ping you on this issue (sorry)? Do you have an opinion?

Feb 22 '22 08:02 jcupitt

The image_processing gem cannot load ruby-vips automatically, as this gem can be used with mini_magick as well. What I can recommend in the README is adding the following in the Gemfile (which should work in Rails):

gem "image_processing", require: "image_processing/vips"

I think it's unlikely that a Rails application will fork itself on boot, though it might when using Spring. It does seem like a safer default.

Feb 28 '22 17:02 janko

Thanks @janko! That sounds like a reasonable solution. Can anyone test this?

Feb 28 '22 18:02 jcupitt

@jcupitt yeah @janko's solution fixed the issue for me. I don't use spring, but I do use foreman which may affect it (?)

macOS 12.3.1 (Monterey)
Ruby 3.1.0
Rails 7.0.1
image_processing gem 1.12.2
vips 8.12.2 (installed via brew)

May 11 '22 03:05 developius

Hmmm I just checked, and I have had

gem "image_processing", require: "image_processing/vips"

for some time, and I am still getting the error. However, I think that the processing I am doing (overlaying captions) isn't supplied by image_processing, so I can't expect it to help.

I am on MacOS 11.6.5 (20G527) (Big Sur), vips-8.12.2 Ruby 2.7.6, Rails 6.0.48

May 13 '22 09:05 anitagraham

I submitted a PR to ffi to support DYLD_LIBRARY_{,FALLBACK_}PATH on mac. If a particular gem library author with vendored dependencies or a user knows where the correct dylib(s) live exactly, they could update one of those env vars, ':'-delimited. ffi-1.15.5 my machine was using would need a backported patch to work similarly.

May 16 '23 00:05 skull-squadron

Gemfile yakshaving to fix ActiveStorage on mac:

if RUBY_PLATFORM =~ /darwin/
  normal_homebrew = (RUBY_PLATFORM =~ /\Aarm64-/ ? '/opt/homebrew' : '/usr/local')
  if `brew --prefix`.chop != normal_homebrew
    libs = `brew --prefix glib vips`.each_line.map { _1.chop + '/lib' }.join(':')
    if ENV.has_key?('DYLD_FALLBACK_LIBRARY_PATH')
      ENV['DYLD_FALLBACK_LIBRARY_PATH'] += ":#{libs}"
    else
      ENV['DYLD_FALLBACK_LIBRARY_PATH'] = libs
    end
    gem 'ffi', github: 'steakknife/ffi', branch: 'fix-155-backport', submodules: true
  end
end

# The following wasn't needed
# gem "image_processing", "~> 1.2", require: "image_processing/vips"

Homebrew can be used without sudo by setting HOMEBREW_PREFIX and HOMEBREW_CELLAR, but the problem lies in gems and other assumptions about library locations.

May 16 '23 00:05 skull-squadron

I was hitting this issue with puma workers (fork) and was able to resolve it with the following in my puma.rb:

before_fork do
  # ensure native libraries are initialized before forking
  if RUBY_PLATFORM =~ /darwin/
    require 'vips'
  end
end

Mar 20 '24 18:03 mikehale

ruby-vips ruby-vips copied to clipboard

OS X 10.13.3 exposes bad fork() issue

Where is the crash coming from?

C

Ruby

Where is fork called?

C

Ruby

ruby-vips
ruby-vips copied to clipboard

Where is `fork` called?