packing into struct timeval should be architecture-dependent, not just pack('l_2')
Hi,
In Debian we ran into a regression on armel and armhf architectures. A MWE is:
require 'dalli'
dc = Dalli::Client.new('localhost:11211', { namespace: "app_v1" })
dc.flush_all
Which fails with:
/usr/share/rubygems-integration/all/gems/dalli-3.2.8/lib/dalli/socket.rb:133:in `setsockopt': Invalid argument - setsockopt(2) (Errno::EINVAL)
^Ifrom /usr/share/rubygems-integration/all/gems/dalli-3.2.8/lib/dalli/socket.rb:133:in `init_socket_options'
^Ifrom /usr/share/rubygems-integration/all/gems/dalli-3.2.8/lib/dalli/socket.rb:94:in `block in open'
^Ifrom /usr/share/rubygems-integration/all/gems/dalli-3.2.8/lib/dalli/socket.rb:110:in `create_socket_with_timeout'
^Ifrom /usr/share/rubygems-integration/all/gems/dalli-3.2.8/lib/dalli/socket.rb:92:in `open'
[...]
Or with strace:
[pid 59410] setsockopt(5, SOL_SOCKET, SO_RCVTIMEO_NEW, "\1\0\0\0\0\0\0\0", 8) = -1 EINVAL (Invalid argument)
It boils down to this code in lib/dalli/socket.rb, introduced in https://github.com/petergoldstein/dalli/pull/1025 (cc @petergoldstein ) :
seconds, fractional = options[:socket_timeout].divmod(1)
timeval = [seconds, fractional * 1_000_000].pack('l_2')
sock.setsockopt(::Socket::SOL_SOCKET, ::Socket::SO_RCVTIMEO, timeval)
sock.setsockopt(::Socket::SOL_SOCKET, ::Socket::SO_SNDTIMEO, timeval)
The problem here is that SO_RCVTIMEO expects a struct timeval, but packing the time values into it depends on the architecture, and 'l_2' is incorrect on Debian's armel and armhf (probably because they transitioned to 64b time_t, see https://wiki.debian.org/ReleaseGoals/64bit-time).
I'm not sure of how to properly fix this in pure ruby, sorry.
The Debian bug is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1095247
The line you're referencing wasn't introduced in https://github.com/petergoldstein/dalli/pull/1025 since that code hasn't been released yet. If you're on a Ruby version >3.0, there's a good chance those timeouts don't even work. @petergoldstein could we cut a release that includes the timeout change? My patch should fix @lnussbaum's issue
The line you're referencing wasn't introduced in #1025 since that code hasn't been released yet. If you're on a Ruby version >3.0, there's a good chance those timeouts don't even work. @petergoldstein could we cut a release that includes the timeout change? My patch should fix @lnussbaum's issue
Ah sorry I got the history wrong indeed.
The line was introduced earlier, and was in v3.2.8 (which we ship in Debian 'testing' branch): https://github.com/petergoldstein/dalli/blob/v3.2.8/lib/dalli/socket.rb#L131
It was introduced in https://github.com/petergoldstein/dalli/pull/977
After thinking about it, I see two ways of addressing this in pure ruby: 1/ first use getsockopt, then try the various options for generating a struct timeval until one of those has the same bit length as the one received via getsockopt 2/ try the various options with setsockopt, rescueing the exception, until one is accepted
None of those options is particularly nice, but both should do the job... Any other ideas?
Hi,
In Debian I fixed it with the following patch.
Index: ruby-dalli/lib/dalli/socket.rb
===================================================================
--- ruby-dalli.orig/lib/dalli/socket.rb
+++ ruby-dalli/lib/dalli/socket.rb
@@ -126,8 +126,26 @@ module Dalli
seconds, fractional = options[:socket_timeout].divmod(1)
microseconds = fractional * 1_000_000
- timeval = [seconds, microseconds].pack('l_2')
+ # struct timeval is (time_t, suseconds_t), which translates to either (int32_t, long) or
+ # (int64_t, long) depending on the architecture. For example, on Debian, all 64b
+ # architectures have int64_t for time_t, but for 32b architectures, it depends:
+ # armel and armhf use int64_t, while i386 stayed with int32_t.
+ # Unfortunately Array::pack does not know about time_t. So we generate both candidates,
+ # use getsockopt to get a timeval, and compare sizes.
+ # Supported timeval formats:
+ timeval_formats = [
+ 'q l_', # 64-bit time_t, 64-bit long (e.g., amd64, arm64)
+ 'l l_', # 32-bit time_t, 32-bit long (e.g., i386)
+ 'q l_ x4' # 64-bit time_t, 32-bit long + padding (e.g., armel on Debian)
+ ]
+ timeval_args = [seconds, microseconds]
+ expected_length = sock.getsockopt(::Socket::SOL_SOCKET, ::Socket::SO_RCVTIMEO).data.length
+ timeval_format = timeval_formats.find do |fmt|
+ timeval_args.pack(fmt).length == expected_length
+ end
+ raise Dalli::DalliError,"Unable to determine appropriate timeval format" unless timeval_format
+ timeval = timeval_args.pack(timeval_format)
sock.setsockopt(::Socket::SOL_SOCKET, ::Socket::SO_RCVTIMEO, timeval)
sock.setsockopt(::Socket::SOL_SOCKET, ::Socket::SO_SNDTIMEO, timeval)
end