ruby-vips
ruby-vips copied to clipboard
OS X 10.13.3 exposes bad fork() issue
After updating to OS X 10.13.3, I received the following error trying to require 'vips':
objc[25602]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[25602]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Some googling took me to this post, which explains that the error comes from “incorrect code that happened to work most of the time in the past.” I have no way of knowing whether the issue is in ruby-vips or ffior libvips itself, and I don't really have the time or knowledge to track it down. Using one of that page's suggested workarounds – setting OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES in the environment – allowed me to load ruby-vips without issue and get on with my work, though I realize I'm just asking OS X to hang later if it encounters a deadlock instead of crashing the process now.
I'm reporting it here so there's both an issue to track and a concise explanation and workaround for anyone else who happens to google this error in a vips context.
Thank you for reporting this!
I'll see if I can reproduce the problem.
Specifics:
OS X High Sierra 10.13.3 (17D102)
Homebrew 1.5.4-73-g1811c77
Homebrew/homebrew-core (git revision 7679; last commit 2018-02-21)
libvips/vips 8.6.2-Wed Jan 31 20:39:14 UTC 2018 (installed via homebrew)
rbenv 1.1.1 (installed via homebrew)
ruby-build 20171226 (installed via homebrew)
I've recreated the behavior under the following rubies (both installed via ruby-build):
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
ruby 2.3.5p376 (2017-09-14 revision 59905) [x86_64-darwin17]
If there's any other information that might help, please let me know.
I tried here and it seems to work for me. Perhaps some other setting is triggering this? I see:
john@katamata ~ $ vips --version
vips-8.6.2-Wed Jan 31 20:39:14 UTC 2018
john@katamata ~ $ irb --version
irb 0.9.6(09/06/30)
john@katamata ~ $ ruby --version
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin17]
john@katamata ~ $ irb
irb(main):001:0> require 'vips'
=> true
irb(main):002:0> x = Vips::Image.black 10, 10
=> #<Image 10x10 uchar, 1 bands, b-w>
irb(main):003:0>
That's with brew update then brew upgrade.
Also with 13.3, of course:
john@katamata ~ $ uname -a
Darwin katamata.local 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64
OK. I'll keep digging around and see what else it might be. Thanks for looking into it!
After a full day of investigations, rebuilds, reboots, and tests, none of which fixed the issue, it seems to have gone away on its own. Sorry for the wild goose chase.
How frustrating! I'm glad you resolved it anyway.
It looks like some people are still being bitten by this, eg.: https://github.com/libvips/ruby-vips/issues/314#issuecomment-951072441
I've not been able to reproduce this on my mac, so I can't really help. If anyone can make a simple test case that shows the problem, I'll try to investigate.
Hi everyone,
I encounter some problem when I use ruby-vips with puma multi-workers (no problem with non-forked puma)
Those lines appears in my console when I load a page from a Rails controller/view and a concurrent variant processing (active storage default setup with vips)
objc[34825]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[34825]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Use of env var OBJC_DISABLE_INITIALIZE_FORK_SAFETY=yes fix it
My stack :
- Mac OS : 11.6
- xcode-select version : 2384
- rbenv : 1.2.0
- ruby-build : 20211019
- libvips : vips-8.11.4
My only insight her is that +[__NSCFConstantString initialize] is an Objective-C-ism. It looks like the objc runtime was half-way through building a string when the process was forked, leaving the child process in an inconsistent state.
libvips does not use any Objective C directly, and does not fork, so I'm not certain this error is down to libvips. Perhaps one of the (many) libvips dependencies (perhaps fontconfig?) is bringing in the objc runtime indirectly?
The fork is presumably puma starting up the worker pool. It'll be running require "vips", then forking some time after that. when the require runs, it'll start up libvips. That should all be synchronous, ie. when the require returns, libvips ought to be inited. But perhaps there is some background processing in some dependency afterwards that then gets interrupted by the fork?
If that's the case, sleeping for a second or so after require "vips" might allow libvips startup to complete.
But this all sounds unlikely, and it's only a guess.
I found this interesting article on the error:
https://blog.phusion.nl/2017/10/13/why-ruby-app-servers-break-on-macos-high-sierra-and-what-can-be-done-about-it/
tldr: It sounds like you might be trying to do some processing with ruby-vips before the puma fork. The fix is ... don't!
libvips won't make any threads on require. You need to not make any ruby-vips calls until the worker process has started.
I have seen this error on MacOS when creating text images/labels to be applied to other images. I sidestepped it by creating a docker image to test the annotations, knowing that the production system is not a mac.
I can probably create a test case...
I read the article @jcupitt referred to at the time. Without delving too deeply into Puma/forks I don't think it was the timing of the library being loaded. The image annotation is one of the final things that might happen to an image.
This is a typical dragonfly job object:
[
["f","filename.JPG"],
["p","thumb", "200x200"],
["p","caption", {
"copyright": "© example.com.au 2021",
"photo_id":"Photo: 7591",
"layout":"vertical",
"position":"bottom left"}
]
]
which sets up a sequence of actions:
- get the file called "filename.JPG"
- process the result with the processor called thumb
- process the result with the processor called caption (and its options)
It's during step 3 that the error occurs. libvips has already been in use for step 2.
Caption itself:
- calls :Vips::Image.new_from_file (output of step 2, above)
- creates a text image from caption options .... 💥 +[UIFontDescriptor initialize] 💥
- abandons the process..
Caveat: that is my mental model of how it works - but I've also followed it and debugged parts of caption and thumb
I'm seeing this in a Rails 6.1 application that's using image_processing (0.12.1) and ruby-vips (2.1.4) with puma (5.6.2) . I've attached LLDB to the puma child worker process and some lightweight debugging shows...
Where is the crash coming from?
It looks like when simply loading libvips, before we even have a chance to call vips_init. Perhaps the ObjC runtime is activated as part of the dynamic loading process?
C
* frame #0: 0x00000001b5c97eb8 libsystem_kernel.dylib`__abort_with_payload + 8
frame #1: 0x00000001b5c9a864 libsystem_kernel.dylib`abort_with_payload_wrapper_internal + 104
frame #2: 0x00000001b5c9a7fc libsystem_kernel.dylib`abort_with_reason + 32
frame #3: 0x00000001b5b59d5c libobjc.A.dylib`_objc_fatalv(unsigned long long, unsigned long long, char const*, char*) + 120
frame #4: 0x00000001b5b59ce4 libobjc.A.dylib`_objc_fatal(char const*, ...) + 44
frame #5: 0x00000001b5b4e42c libobjc.A.dylib`performForkChildInitialize(objc_class*, objc_class*) + 400
frame #6: 0x00000001b5b355dc libobjc.A.dylib`initializeNonMetaClass + 672
frame #7: 0x00000001b5b34f80 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 156
frame #8: 0x00000001b5b34d3c libobjc.A.dylib`lookUpImpOrForward + 1016
frame #9: 0x00000001b5b34664 libobjc.A.dylib`_objc_msgSend_uncached + 68
frame #10: 0x00000001b6be6a68 Foundation`NSClassFromString + 64
frame #11: 0x00000001b888a7f8 AppKit`+[NSColor(NSUIKitSupport) load] + 44
frame #12: 0x00000001b5b37508 libobjc.A.dylib`load_images + 1088
frame #13: 0x00000001009839a8 dyld`dyld4::RuntimeState::notifyObjCInit(dyld4::Loader const*) + 164
frame #14: 0x0000000100989cf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 196
frame #15: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
frame #16: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
frame #17: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
frame #18: 0x0000000100989dcc dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
frame #19: 0x0000000100999734 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 512
frame #20: 0x000000010423a564 ffi_c.bundle`library_initialize + 220
frame #21: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232
frame #22: 0x00000001010b5844 libruby.3.0.dylib`vm_sendish + 1236
frame #23: 0x000000010109cdd0 libruby.3.0.dylib`vm_exec_core + 11996
frame #24: 0x00000001010b137c libruby.3.0.dylib`rb_vm_exec + 2764
... snipped ...
Ruby
... snipped ...
from ~/Projects/the-app/app/models/avatar.rb:4:in `<main>'
from ~/.rvm/gems/ruby-3.0.2/gems/zeitwerk-2.5.4/lib/zeitwerk/kernel.rb:35:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'
from ~/.rvm/gems/ruby-3.0.2/gems/ruby-vips-2.1.4/lib/vips.rb:572:in `<main>'
from ~/.rvm/gems/ruby-3.0.2/gems/ruby-vips-2.1.4/lib/vips.rb:579:in `<module:Vips>'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:99:in `ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:99:in `map'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:109:in `block in ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:109:in `each'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:112:in `block (2 levels) in ffi_lib'
from ~/.rvm/gems/ruby-3.0.2/gems/ffi-1.15.5/lib/ffi/library.rb:112:in `open'
Where is fork called?
It appears to be a part of opening a pipe in the rb-fsevent (0.11.0). I think, but am not sure, that this is related to watching for file modifications during development.
Possible experiment: try eager loading to see if that is a workaround.
C
* thread #23, name = 'listen-worker_thread', stop reason = breakpoint 1.1
* frame #0: 0x00000001b5b93f74 libsystem_c.dylib`fork
frame #1: 0x0000000100fda7ec libruby.3.0.dylib`fork_check_err + 664
frame #2: 0x0000000100f4c180 libruby.3.0.dylib`pipe_open + 480
frame #3: 0x0000000100f4bf20 libruby.3.0.dylib`rb_io_popen + 412
frame #4: 0x0000000100f509bc libruby.3.0.dylib`rb_io_s_popen + 176
frame #5: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232
frame #6: 0x00000001010b5844 libruby.3.0.dylib`vm_sendish + 1236
frame #7: 0x000000010109cdd0 libruby.3.0.dylib`vm_exec_core + 11996
frame #8: 0x00000001010b137c libruby.3.0.dylib`rb_vm_exec + 2764
frame #9: 0x00000001010af0f0 libruby.3.0.dylib`rb_vm_invoke_proc + 924
frame #10: 0x000000010106dcb8 libruby.3.0.dylib`thread_do_start_proc + 732
frame #11: 0x000000010106d570 libruby.3.0.dylib`thread_start_func_2 + 1180
frame #12: 0x000000010106cf68 libruby.3.0.dylib`thread_start_func_1 + 272
frame #13: 0x00000001b5ca9240 libsystem_pthread.dylib`_pthread_start + 148
Ruby
from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/thread.rb:18:in `block in new'
from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/thread.rb:26:in `rescue_and_log'
from ~/.rvm/gems/ruby-3.0.2/gems/listen-3.7.1/lib/listen/adapter/darwin.rb:49:in `block in _run'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:39:in `run'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:140:in `open_pipe'
from ~/.rvm/gems/ruby-3.0.2/gems/rb-fsevent-0.11.0/lib/rb-fsevent/fsevent.rb:140:in `popen'
Hi @shepmaster ,
That C stacktrace is interesting:
abort()
...
frame #17: 0x0000000100989cdc dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 172
frame #18: 0x0000000100989dcc dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 124
frame #19: 0x0000000100999734 dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 512
frame #20: 0x000000010423a564 ffi_c.bundle`library_initialize + 220
frame #21: 0x00000001010bc404 libruby.3.0.dylib`vm_call_cfunc_with_frame + 232
So the dlopen() is triggering the initializers in the library being opened, and they in turn are (indirectly) starting up the NS classes.
I think I've found the line of code:
https://github.com/ffi/ffi/blob/master/ext/ffi_c/DynamicLibrary.c#L119
I would guess that's being called from here:
https://github.com/libvips/ruby-vips/blob/master/lib/vips.rb#L45
There are three lines like that in vips.rb, starting up libglib, libgobject and libvips. It'll probably be one of them.
Maybe it's possible to add some puts to try to see which of the three it is?
We can't do much about the first two, but if it's libvips, it's probably one of the optional packages that libvips depends on, and we could maybe remove it on macos.
By far the largest optional dependency for libvips is imagemagick. I'd be tempted to try building libvips without that, it could be the cause.
So the
dlopen()is triggering the initializers in the library being opened, and they in turn are (indirectly) starting up the NS classes.
Yes, I agree.
Maybe it's possible to add some puts to try to see which of the three it is?
Via printf-debugging, it appears to be:
https://github.com/libvips/ruby-vips/blob/3206143db61ac63f4689efc5a7cad83d92d4c93a/lib/vips.rb?rgh-link-date=2022-02-21T19%3A31%3A38Z#L573
This makes some sense — as you mentioned:
Perhaps one of the (many) libvips dependencies [...] is bringing in the objc runtime indirectly?
one of the optional packages that libvips depends on
Since the stack trace mentions AppKit`+[NSColor(NSUIKitSupport) load], I tried to find all the recursive dependencies of libvips that lead to that. My script is super hacky, but does point to libgio-2.0.0.dylib as requiring AppKit.
% otool -L /opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib
/opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib:
/opt/homebrew/opt/glib/lib/libgio-2.0.0.dylib (compatibility version 7001.0.0, current version 7001.4.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1856.105.0)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1856.105.0)
/System/Library/Frameworks/AppKit.framework/Versions/C/AppKit (compatibility version 45.0.0, current version 2113.20.111)
libgio is used in many many places, as I'm sure you know, but a few:
- libgdk_pixbuf
- libopenslide
- libpangocairo
- librsvg
Hacky Ruby Dependency Script
$visited = {}
def step(fname)
output = `otool -L #{fname}`
libraries = output.lines.map { |line| line.chomp.split(' ').first }.reject { |lib| lib.include? ':' }.sort.uniq
libraries.each do |l|
unless $visited.key? l
$visited[l] = nil
$visited[l] = step(l)
end
end
end
def findem(lib)
$visited.each do |k, v|
if v.any? { |v| v.include? lib }
puts k
unless k == lib
findem(k)
end
end
end
end
step(ARGV[0])
findem('AppKit')
Ahh good detective work. I found this:
https://gitlab.gnome.org/GNOME/glib/-/blob/main/gio/meson.build#L384-392
So it seems that GIO pulls in appkit so it can use the macos settings framework, and I expect some other things too.
I'm not sure what the best solution is. At the moment, ruby-vips does some basic start up on require. I suppose this needs to be delayed until first use.
Maybe the best thing to do would be a very, very quick and dirty hack to delay startup, just to see if this does indeed fix the problem. Any volunteers?
I have a "fix" for our application: directly adding ruby-vips to our Gemfile.
My working theory is:
- Puma starts the coordinator thread
- Puma starts the worker threads
- Each worker thread requires everything in the Gemfile (we are not using application preloading)
- Rails -> Zeitwerk -> listen -> rb-fsevent opens a pipe which invokes
fork - A request that uses
image_processingcomes in
When we started out, we only added image_processing to our Gemfile. This will lazily load ruby-vips later on, when it's needed. However, that will occur after the fork and trigger the problematic ObjC initialization via the dlopen.
By explicitly adding ruby-vips to our Gemfile, it's loaded before the fork, so the check in the ObjC code won't be triggered.
This may be good enough for my team to move forward for now, but it certainly feels brittle. We have to hope that the forked process on the other end of the pipe never attempts to use libvips (unlikely in this case, thankfully).
At the moment, ruby-vips does some basic start up on
require. I suppose this needs to be delayed until first use.
Amusingly in my case, that start up needs to be executed before the first use, before the fork for the pipe, but after the fork for the Puma worker process.
Oh what a mess. So perhaps image_processing should eager-load ruby-vips on macos? Though (as you say) this sounds like it could break again easily.
@janko can I ping you on this issue (sorry)? Do you have an opinion?
The image_processing gem cannot load ruby-vips automatically, as this gem can be used with mini_magick as well. What I can recommend in the README is adding the following in the Gemfile (which should work in Rails):
gem "image_processing", require: "image_processing/vips"
I think it's unlikely that a Rails application will fork itself on boot, though it might when using Spring. It does seem like a safer default.
Thanks @janko! That sounds like a reasonable solution. Can anyone test this?
@jcupitt yeah @janko's solution fixed the issue for me. I don't use spring, but I do use foreman which may affect it (?)
- macOS 12.3.1 (Monterey)
- Ruby 3.1.0
- Rails 7.0.1
- image_processing gem 1.12.2
- vips 8.12.2 (installed via
brew)
Hmmm I just checked, and I have had
gem "image_processing", require: "image_processing/vips"
for some time, and I am still getting the error. However, I think that the processing I am doing (overlaying captions) isn't supplied by image_processing, so I can't expect it to help.
I am on MacOS 11.6.5 (20G527) (Big Sur), vips-8.12.2 Ruby 2.7.6, Rails 6.0.48
I submitted a PR to ffi to support DYLD_LIBRARY_{,FALLBACK_}PATH on mac. If a particular gem library author with vendored dependencies or a user knows where the correct dylib(s) live exactly, they could update one of those env vars, ':'-delimited. ffi-1.15.5 my machine was using would need a backported patch to work similarly.
Gemfile yakshaving to fix ActiveStorage on mac:
if RUBY_PLATFORM =~ /darwin/
normal_homebrew = (RUBY_PLATFORM =~ /\Aarm64-/ ? '/opt/homebrew' : '/usr/local')
if `brew --prefix`.chop != normal_homebrew
libs = `brew --prefix glib vips`.each_line.map { _1.chop + '/lib' }.join(':')
if ENV.has_key?('DYLD_FALLBACK_LIBRARY_PATH')
ENV['DYLD_FALLBACK_LIBRARY_PATH'] += ":#{libs}"
else
ENV['DYLD_FALLBACK_LIBRARY_PATH'] = libs
end
gem 'ffi', github: 'steakknife/ffi', branch: 'fix-155-backport', submodules: true
end
end
# The following wasn't needed
# gem "image_processing", "~> 1.2", require: "image_processing/vips"
Homebrew can be used without sudo by setting HOMEBREW_PREFIX and HOMEBREW_CELLAR, but the problem lies in gems and other assumptions about library locations.
I was hitting this issue with puma workers (fork) and was able to resolve it with the following in my puma.rb:
before_fork do
# ensure native libraries are initialized before forking
if RUBY_PLATFORM =~ /darwin/
require 'vips'
end
end