dry-types
dry-types copied to clipboard
fast coercion date and time
I benchmarked dry-types and activemodel and noticed that dry-types has low perfomance, and I dip inside code rails
https://github.com/rails/rails/blob/main/activemodel/lib/active_model/type/helpers/time_value.rb#L72 https://github.com/rails/rails/blob/main/activemodel/lib/active_model/type/date.rb#L30
rails use regexp for frequent case
@ermolaev why isn't it a patch to Ruby?
I mean it sounds strange we invent "performance-optimized" code and copy it between line libraries... What if there's another, even faster set of regex? :)
Just what I said lol https://github.com/rails/rails/commit/9ea15f1927393a88c0a5dbe250420b7afdfa6aed
Another thought (sorry for spamming): we should provide a way to fix the format to ISO8601. Mb it should be a different key, e.g. :iso_time and :iso_date. It'll be 💯 faster.
why isn't it a patch to Ruby?
ruby has other methods for parsing time in strict formats
Time.iso8601("2011-10-05T22:26:12-04:00")
Time.strptime("2000-10-31", "%Y-%m-%d")
this methods is fast, but this is a very strict format logic, this methods throw exeptions, and for fallback we need handle it, but exeptions is very slow
Dry::Types["params.date"] Dry::Types["json.date"] apply string in customs format and it more robust code for parsing times
Another thought (sorry for spamming): we should provide a way to fix the format to ISO8601. Mb it should be a different key, e.g. :iso_time and :iso_date. It'll be 100 faster.
I think new keys will be confuse developers, and nobody will use them, also better to encapsulate the parsing details in the library
this methods is fast, but this is a very strict format logic, this methods throw exeptions, and for fallback we need handle it, but exeptions is very slow
But we don't define Date.parse and Time.parse either. Shouldn't they be improved in Ruby instead? Currently, what we discuss here started as a custom parsing method for parsing time in the mysql adapter (!) in rails (https://github.com/rails/rails/commit/bd087e6716f8b48d48386c3d42e529849d7b68cd). It looks like nobody tried to backport it to Ruby itself.
I think new keys will be confuse developers, and nobody will use them, also better to encapsulate the parsing details in the library
These are strong assumptions and I don't share them. It's true that being faster by default is better but we also need to consider the price (support, bugs, etc). In my opinion, parsing dates without a predefined format is a potential source of errors. Delegating format detection to Time.parse you create an implicit dependency on how it operates. Not only it's slower, but it can also change quite randomly accepting more/fewer formats. This is likely not a problem for params but I'd expect json types to be more strict.
This is what I use personally
TypeContainer.register(
'json.iso_time',
::Types::Time.constrained(
gteq: Time.new(2014, 1, 1),
lteq: Time.new(2038, 1, 1)
).constructor { |value, &failed|
Try[::ArgumentError] { ::Time.iso8601(value.sub(' ', 'T')) }.value_or {
failed.(value)
}
}
)
I'm not against the change (though I'm not sure about the implementation) but I think we should try adding it to Ruby first.
Side note: is there a safe strptime in Ruby?
I looked it up, seems like using Date._strptime or Date._iso8601 would be faster and cleaner. I checked, they are even defined in JRuby. WDYT?
I mean it sounds strange we invent "performance-optimized" code and copy it between line libraries... What if there's another, even faster set of regex? :)
Ruby has regexp for Time.iso8601 https://github.com/ruby/ruby/blob/v3_0_0/lib/time.rb#L614 :scream:
I looked it up, seems like using Date._strptime or Date._iso8601 would be faster and cleaner. I checked, they are even defined in JRuby. WDYT?
Date benchmark
ISO_DATE = /\A(\d{4})-(\d\d)-(\d\d)\z/
string = '2011-12-03'
Benchmark.ips do |x|
x.report('parse') do |n|
while n > 0
Date.parse(string)
n-=1
end
end
x.report('_parse') do |n|
while n > 0
::Date.new(*::Date._parse(string, false).values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_iso8601') do |n|
while n > 0
::Date.new(*::Date._iso8601(string).values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_strptime') do |n|
while n > 0
::Date.new(*Date._strptime(string, '%Y-%m-%d').values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_strptime.values') do |n|
while n > 0
::Date.new(*Date._strptime(string, '%Y-%m-%d').values)
n-=1
end
end
x.report('regexp') do |n|
while n > 0
if string =~ ISO_DATE
::Date.new($1.to_i, $2.to_i, $3.to_i)
end
n-=1
end
end
x.compare!
end
# Comparison:
# _strptime.values: 1282303.3 i/s
# _strptime: 1194136.2 i/s - same-ish: difference falls within error
# regexp: 997567.6 i/s - 1.29x (± 0.00) slower
# _iso8601: 529221.1 i/s - 2.42x (± 0.00) slower
# parse: 282310.3 i/s - 4.54x (± 0.00) slower
# _parse: 270369.9 i/s - 4.74x (± 0.00) slower
Time benchmark
ISO_DATETIME = /
\A
(\d{4})-(\d\d)-(\d\d)(?:T|\s) # 2020-06-20T
(\d\d):(\d\d):(\d\d)(?:\.(\d{1,6})\d*)? # 10:20:30.123456
(?:(Z(?=\z)|[+-]\d\d)(?::?(\d\d))?)? # +09:00
\z
/x.freeze
def fast_string_to_time(string)
return unless ISO_DATETIME =~ string
usec = $7.to_i
usec_len = $7&.length
if usec_len&.< 6
usec *= 10**(6 - usec_len)
end
if $8
offset = $8 == "Z" ? 0 : $8.to_i * 3600 + $9.to_i * 60
end
::Time.local($1.to_i, $2.to_i, $3.to_i, $4.to_i, $5.to_i, $6.to_i, usec, offset)
end
string = '2021-02-04T01:48:47.551+03:00'
Benchmark.ips do |x|
x.report('parse') do |n|
while n > 0
::Time.parse(string)
n-=1
end
end
x.report('iso8601') do |n|
while n > 0
::Time.iso8601(string)
n-=1
end
end
x.report('_parse') do |n|
while n > 0
::Time.new(*::Date._parse(string, false).values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('_iso8601') do |n|
while n > 0
::Time.new(*::Date._iso8601(string).values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('_strptime') do |n|
while n > 0
::Time.new(*Date._strptime(string, '%Y-%m-%dT%H:%M:%S.%N%z').values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('regexp') do |n|
while n > 0
fast_string_to_time(string)
n-=1
end
end
x.compare!
end
# Comparison:
# regexp: 201472.4 i/s
# _strptime: 159658.5 i/s - 1.26x (± 0.00) slower
# _iso8601: 140898.1 i/s - 1.43x (± 0.00) slower
# iso8601: 138998.2 i/s - 1.45x (± 0.00) slower
# _parse: 66379.7 i/s - 3.04x (± 0.00) slower
# parse: 57152.7 i/s - 3.53x (± 0.00) slower
Thanks for this! A minor heads up. I benchmarked various options and looked into the ruby sources. It turns out that ruby itself relies on regular expressions internally. My benchmarking showed approx 20% difference between ._sptrptime and regexes though ._strptime is more accurate (doesn't parse invalid dates while regexes aren't so strict). I'm not yet sure which way is better, I'll get back to it sometime soon.