grpc
grpc copied to clipboard
grpc crash in a forked process (ruby)
What version of gRPC and what language are you using?
version: 1.37.1 language: ruby
What operating system (Linux, Windows,...) and version?
macos 11.3.1
What runtime / compiler are you using (e.g. python version or version of gcc)
ruby 2.7.3p183 (2021-04-05 revision 6847ee089d) [x86_64-darwin20]
What did you do?
Gemfile
# frozen_string_literal: true
source 'https://rubygems.org'
gem 'google-ads-googleads', require: 'google/ads/google_ads'
gem 'byebug'
main.rb
# frozen_string_literal: true
require 'google/ads/google_ads'
require 'byebug'
fork do
api = Google::Ads::GoogleAds::GoogleAdsClient.new do |config|
config.client_id = '<REDACTED>'
config.client_secret = '<REDACTED>'
config.developer_token = '<REDACTED>'
config.refresh_token = '<REDACTED>'
end
customer = api.service.customer
byebug
customer.list_accessible_customers
end
Process.waitall
What did you expect to see?
A successful response.
What did you see instead?
thomas@Thomass-MBP reproduce-google-ads-crash % bundle exec ruby main.rb
Could not create default logger. Check your config file.
[12, 21] in /Users/thomas/work/reproduce-google-ads-crash/main.rb
12: end
13:
14: customer = api.service.customer
15:
16: byebug
17:
=> 18: customer.list_accessible_customers
19: end
20:
21: Process.waitall
(byebug) break /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/active_call.rb:367
Created breakpoint 1 at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/active_call.rb:367
(byebug) continue
Stopped by breakpoint 1 at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/active_call.rb:367
[362, 371] in /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/active_call.rb
362: end
363: @metadata_sent = true
364: end
365:
366: begin
=> 367: batch_result = @call.run_batch(ops)
368: # no need to check for cancellation after a CallError because this
369: # batch contains a RECV_STATUS op
370: ensure
371: set_input_stream_done
(byebug) bt
--> #0 GRPC::ActiveCall.request_response(req#Google::Ads::GoogleAds::V7::Services::ListAccessibleCustomersRequest, metadata#Hash) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/active_call.rb:367
#1 block (2 levels) in #<Class:#<GRPC::ActiveCall::Operation:0x00007fc2df9659e8>>.block (2 levels) in request_response at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/client_stub.rb:174
#2 block in GRPC::InterceptionContext.block in intercept!(type#Symbol, args#Hash) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/interceptors.rb:181
#3 Google::Ads::GoogleAds::Interceptors::ErrorInterceptor.request_response(request#Google::Ads::GoogleAds::V7::Services::ListAccessibleCustomersRequest, call#GRPC::ActiveCall::InterceptableView, method#String, metadata#Hash) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/google-ads-googleads-11.0.0/lib/google/ads/google_ads/interceptors/error_interceptor.rb:65
#4 GRPC::InterceptionContext.intercept!(type#Symbol, args#Hash) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/interceptors.rb:175
#5 block in #<Class:#<GRPC::ActiveCall::Operation:0x00007fc2df9659e8>>.block in request_response at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/grpc-1.37.1-universal-darwin/src/ruby/lib/grpc/generic/client_stub.rb:173
#6 Gapic::ServiceStub::RpcCall.call(request#Google::Ads::GoogleAds::V7::Services::ListAccessibleCustomersRequest, options#Gapic::CallOptions) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/gapic-common-0.4.1/lib/gapic/grpc/service_stub/rpc_call.rb:121
#7 Gapic::ServiceStub.call_rpc(method_name#Symbol, request#Google::Ads::GoogleAds::V7::Services::ListAccessibleCustomersRequest, options#Gapic::CallOptions, &block#Proc) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/gapic-common-0.4.1/lib/gapic/grpc/service_stub.rb:156
#8 Google::Ads::GoogleAds::V7::Services::CustomerService::Client.list_accessible_customers(request#Google::Ads::GoogleAds::V7::Services::ListAccessibleCustomersRequest, options#Gapic::CallOptions) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/google-ads-googleads-11.0.0/lib/google/ads/google_ads/v7/services/customer_service/client.rb:370
ͱ-- #9 Kernel.public_send(*args) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/google-ads-googleads-11.0.0/lib/google/ads/google_ads/service_wrapper.rb:53
#10 Google::Ads::GoogleAds::ServiceWrapper.method_missing(name#Symbol, *args#Array, kwargs#Hash, &blk#NilClass) at /Users/thomas/.rbenv/versions/2.7.3/lib/ruby/gems/2.7.0/gems/google-ads-googleads-11.0.0/lib/google/ads/google_ads/service_wrapper.rb:53
#11 block in block in <main> at /Users/thomas/work/reproduce-google-ads-crash/main.rb:18
ͱ-- #12 Kernel.fork at /Users/thomas/work/reproduce-google-ads-crash/main.rb:6
#13 <main> at /Users/thomas/work/reproduce-google-ads-crash/main.rb:6
(byebug) step
objc[17108]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[17108]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Anything else we should know about your project / environment?
When it doesn't run in a forked process, the code works.
I originally opened a bug with the google-ads-googleads ruby gem at https://github.com/googleads/google-ads-ruby/issues/310 but since this looks like a grpc issue, I opened another bug here.
workaround
OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES bundle exec ruby main.rb
doesn't crash, but I don't think this is a good long term solution.
crash log
This cannot be reproduced with the extension built on macOS 11.4 and run in 11.4. It’s possible that there are fixes in both the compiler/library and the macOS. No idea the versions in CI environment, it however may help if we upgrade the CI macOS version to have this fixed for least newer macOS.
Crash log with debug info:
Application Specific Information:
objc[44251]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
crashed on child side of fork pre-exec
...
Thread 6 Crashed:
0 libsystem_kernel.dylib
0x00007fff5f096016 __abort_with_payload + 10
1 libsystem_kernel.dylib
0x00007fff5f0915d7 abort_with_payload_wrapper_internal + 82
2 libsystem_kernel.dylib
0x00007fff5f091585 abort_with_reason + 22
3 libobjc.A.dylib
0x00007fff5d7768dd _objc_fatalv(unsigned long long, unsigned long long, char const*, __va_list_tag*) + 108
4 libobjc.A.dylib
0x00007fff5d77678f _objc_fatal(char const*, ...) + 135
5 libobjc.A.dylib
0x00007fff5d77760f performForkChildInitialize(objc_class*, objc_class*) + 341
6 libobjc.A.dylib
0x00007fff5d77862f initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt
0x00007fff5d767690 lookUpImpOrForward + 228
8 libobjc.A.dylib
0x00007fff5d767114 _objc_msgSend_uncached + 68
9 com.apple.CoreFoundation
0x00007fff32fcb4a9 _createUniqueStringWithUTF8Bytes + 179
10 com.apple.CoreFoundation
0x00007fff32f8e27f parseStringTag + 1531
11 com.apple.CoreFoundation
0x00007fff32f8c041 parseXMLElement + 846
12 com.apple.CoreFoundation
0x00007fff32f8c6ee parseXMLElement + 2555
13 com.apple.CoreFoundation
0x00007fff32f8c156 parseXMLElement + 1123
14 com.apple.CoreFoundation
0x00007fff32f8b686 _CFPropertyListCreateFromUTF8Data + 1884
15 com.apple.CoreFoundation
0x00007fff33075157 _CFPropertyListCreateWithData + 590
16 com.apple.CoreFoundation
0x00007fff32f8aa12 CFPropertyListCreateWithData + 42
17 grpc_c.bundle
0x000000010e894ed0 parseSystemVersionPList + 528
18 ???
0x48d23145c88948c1 0 + 5247310691626993857
The crash happens in parseSystemVersionPList which is LLVM runtime function.
official package | macOS 10.14.6LLVM 10.0.1 | macOS 11.4LLVM 12.0.5 | |
---|---|---|---|
macOS 10.14.6 | y | y | y |
macOS 11.4 | y | y | n |
macOS 11.3.1 | y (user report) | no env | no env |
Having this same issue trying to run any example of google-ads
under Rails. Already tried to add the "workaround" on my ~/.zshrc
but the problem persists.
My output:
CID: 6808369803, Host: googleads.googleapis.com:443, Method: /google.ads.googleads.v8.services.GoogleAdsService/SearchStream, IsFault: no, Request ID: N/A
objc[22477]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[22477]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
My environment:
google-ads-googleads (13.0.0)
grpc (1.38.0)
google-protobuf (~> 3.15)
googleapis-common-protos-types (~> 1.0)
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-darwin20]
Rails 6.1.4
System Version: macOS 11.4 (20F71)
Kernel Version: Darwin 20.5.0
UPDATE Just updated to latest mac os available version and fixed my problem
System Version: macOS 11.5.1 (20G80)
Kernel Version: Darwin 20.6.0
great! Thanks for the update
I can still reproduce this issue:
thomas@Thomass-MBP reproduce-google-ads-crash % bundle exec ruby main.rb
Could not create default logger. Check your config file.
objc[73930]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[73930]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Using the following gem versions
google-ads-googleads (13.0.0)
google-protobuf (3.17.3)
grpc (1.38.0)
Using this ruby:
% ruby -v
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x86_64-darwin20]
Using macos 11.5.1 (20G80)
I even tried to match your ruby version and it's still crashing:
% ruby -v
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-darwin20]
I can still reproduce this issue:
thomas@Thomass-MBP reproduce-google-ads-crash % bundle exec ruby main.rb Could not create default logger. Check your config file. objc[73930]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. objc[73930]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Using the following gem versions
google-ads-googleads (13.0.0) google-protobuf (3.17.3) grpc (1.38.0)
Using this ruby:
% ruby -v ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [x86_64-darwin20]
Using macos
11.5.1 (20G80)
I even tried to match your ruby version and it's still crashing:
% ruby -v ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-darwin20]
could you please try compiling grpc from the source? There are some libs need to be updated too.
could you please try compiling grpc from the source?
I added
gem 'grpc', github: 'grpc/grpc', submodules: true
to my Gemfile and it didn't crash!
I'm seeing this as well. Seems to occur when the gRPC server is down.
[1] pry(main)> require_relative './lib/grpc/health_services_pb'
=> true
[2] pry(main)> stub = Grpc::Health::V1::Health::Stub.new(Gruf.server_binding_url, :this_channel_is_insecure)
=> #<Grpc::Health::V1::Health::Stub:0x0000000105491cf8
@ch=#<GRPC::Core::Channel:0x0000000105491758>,
@host="127.0.0.1:9001",
@interceptors=#<GRPC::InterceptorRegistry:0x0000000105491690 @interceptors=[]>,
@propagate_mask=nil,
@timeout=1969-12-31 18:59:59 -0500>
[3] pry(main)> stub.check Grpc::Health::V1::HealthCheckRequest.new
=> <Grpc::Health::V1::HealthCheckResponse: status: :SERVING>
[4] pry(main)> stub.check Grpc::Health::V1::HealthCheckRequest.new
objc[44682]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called.
objc[44682]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Between 3 and 4 I shut the gRPC server down.
- Ruby: ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [arm64-darwin21]
- grpc gem: 1.48.0
- macOS: 12.3.1
I'm seeing this as well. Seems to occur when the gRPC server is down.
[1] pry(main)> require_relative './lib/grpc/health_services_pb' => true [2] pry(main)> stub = Grpc::Health::V1::Health::Stub.new(Gruf.server_binding_url, :this_channel_is_insecure) => #<Grpc::Health::V1::Health::Stub:0x0000000105491cf8 @ch=#<GRPC::Core::Channel:0x0000000105491758>, @host="127.0.0.1:9001", @interceptors=#<GRPC::InterceptorRegistry:0x0000000105491690 @interceptors=[]>, @propagate_mask=nil, @timeout=1969-12-31 18:59:59 -0500> [3] pry(main)> stub.check Grpc::Health::V1::HealthCheckRequest.new => <Grpc::Health::V1::HealthCheckResponse: status: :SERVING> [4] pry(main)> stub.check Grpc::Health::V1::HealthCheckRequest.new objc[44682]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[44682]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Between 3 and 4 I shut the gRPC server down.
- Ruby: ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [arm64-darwin21]
- grpc gem: 1.48.0
- macOS: 12.3.1
This seems like something new. Could you open a new issue, and post the code and how to reproduce please?
2022-09-20 13:13:16.196 INFO ai.backend.manager.server [19094] Loading module: etcd 2022-09-20 13:13:16.184 INFO ai.backend.manager.server [19093] Backend.AI Manager 22.09.0b6 2022-09-20 13:13:16.184 INFO ai.backend.manager.server [19093] runtime: Python 3.10.7 (env: /Users/kyujin/Projects/lablup/backend.ai/dist/export/python/virtualenvs/python-default/3.10.7) 2022-09-20 13:13:16.184 DEBUG ai.backend.manager.config [19093] debug mode enabled. 2022-09-20 13:13:16.503 INFO ai.backend.manager.server [19094] Loading module: events 2022-09-20 13:13:16.505 INFO ai.backend.manager.server [19094] Loading module: auth 2022-09-20 13:13:16.505 INFO ai.backend.manager.server [19094] Loading module: ratelimit 2022-09-20 13:13:16.505 INFO ai.backend.manager.server [19094] Loading module: vfolder 2022-09-20 13:13:16.512 INFO ai.backend.manager.server [19094] Loading module: admin 2022-09-20 13:13:16.522 INFO ai.backend.manager.server [19094] Loading module: session 2022-09-20 13:13:16.527 INFO ai.backend.manager.server [19094] Loading module: stream 2022-09-20 13:13:16.530 INFO ai.backend.manager.server [19094] Loading module: manager 2022-09-20 13:13:16.530 INFO ai.backend.manager.server [19094] Loading module: resource 2022-09-20 13:13:16.530 INFO ai.backend.manager.server [19094] Loading module: scaling_group 2022-09-20 13:13:16.530 INFO ai.backend.manager.server [19094] Loading module: cluster_template 2022-09-20 13:13:16.531 INFO ai.backend.manager.server [19094] Loading module: session_template 2022-09-20 13:13:16.532 INFO ai.backend.manager.server [19094] Loading module: image 2022-09-20 13:13:16.552 INFO ai.backend.manager.server [19094] Loading module: userconfig 2022-09-20 13:13:16.553 INFO ai.backend.manager.server [19094] Loading module: domainconfig 2022-09-20 13:13:16.553 INFO ai.backend.manager.server [19094] Loading module: groupconfig 2022-09-20 13:13:16.554 INFO ai.backend.manager.server [19094] Loading module: logs 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19098] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19101] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19094] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19100] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19099] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19095] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19097] using etcd cluster from 127.0.0.1:8121 with namespace "local" 2022-09-20 13:13:16.555 INFO ai.backend.common.etcd [19096] using etcd cluster from 127.0.0.1:8121 with namespace "local" objc[19097]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19097]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19094]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19100]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19096]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19094]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19100]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19096]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19095]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19099]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19095]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19099]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug. objc[19101]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19098]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. objc[19101]: +[__NSTimeZone initialize] may have been in progress in another throbjc[19098]: +[__NSTimeZone initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Same here. Suffering from this issue after updating grpcio to 1.48.1. Managed to prevent grpcio from crashing by adding environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
when starting python process.
I worked around this problem by calling CFTimeZoneCopyDefault
in a pre-fork block:
# frozen_string_literal: true
if RUBY_PLATFORM.include?('darwin')
require 'fiddle'
require 'ffi'
module CFTimeZone
extend FFI::Library
ffi_lib '/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation'
attach_function :CFTimeZoneCopyDefault, [], :pointer
attach_function :CFRelease, [:pointer], :void
end
# Dynamically load Foundation.framework, ~implicitly~ initializing
# the Objective-C runtime before any forking happens in webserver
#
# From https://bugs.ruby-lang.org/issues/14009
Fiddle.dlopen '/System/Library/Frameworks/Foundation.framework/Foundation'
# grpc uses abseil-cpp to retrieve the local time zone via macOS APIs:
# https://github.com/abseil/abseil-cpp/blob/20230125.rc3/absl/time/internal/cctz/src/time_zone_lookup.cc#L139
#
# To ensure these APIs are not called in a forked process (https://github.com/grpc/grpc/issues/26257),
# load the required framework, retrieve the default time zone, and then release the resource.
default_time_zone = CFTimeZone.CFTimeZoneCopyDefault
CFTimeZone.CFRelease(default_time_zone)
end
I suspect if absl::InitializeLog()
were called in a pre-fork block, we'd be able to avoid this.
I've created https://github.com/grpc/grpc/issues/33281 for https://github.com/grpc/grpc/issues/26257#issuecomment-1213348552.
original issue solved by updating environment, new issue tracked in https://github.com/grpc/grpc/issues/33281, closing.