timeliness
timeliness copied to clipboard
Fix the month detection for lowercase and long abbreviated month names. Fix ddd and mmm formats to make them fit more locales.
Let's try the usual case:
>> I18n.locale = :en
:en
>> "18 October 2013".to_date
Fri, 18 Oct 2013
>> "18 october 2013".to_date
Fri, 18 Oct 2013
>> "18 oct 2013".to_date
Fri, 18 Oct 2013
>> "18 Oct 2013".to_date
Fri, 18 Oct 2013
So far everything is ok. But in fact there are a lot of locales, which have lowercase and long abbreviated month names, like 'octobre' and 'oct.' in French.
>> I18n.locale = :fr
:fr
>> "18 Octobre 2013".to_date
ArgumentError: invalid date
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `new'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `to_date'
from (irb):39
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
from /home/exoth/projects/jetradar/script/rails:6:in `require'
from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
>> "18 octobre 2013".to_date
ArgumentError: invalid date
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `new'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `to_date'
from (irb):41
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
from /home/exoth/projects/jetradar/script/rails:6:in `require'
from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
>> "18 oct. 2013".to_date
NoMethodError: undefined method `map' for nil:NilClass
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:11:in `to_date'
from (irb):43
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
from /home/exoth/projects/jetradar/script/rails:6:in `require'
from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
>> "18 Oct. 2013".to_date
NoMethodError: undefined method `map' for nil:NilClass
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:11:in `to_date'
from (irb):45
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
from /home/exoth/projects/jetradar/script/rails:6:in `require'
from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
So that pull request fixes the month detection behavior to properly handle such situations.
And the second part is about regexp used for ddd and mmm formats.
Here's a simple code, checking how well the regexps perform against translations from all the locales found in the rails-i18n gem:
def check key, regexp
bad = []
locales = Dir['/home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/rails-i18n-0.7.4/rails/locale/*']
locales.each do |file|
q = YAML.load(File.read(file))
bad << q.keys.first if q.values.first["date"][key].compact.any?{|i| i !~ regexp}
end
puts "#{bad.count}/#{locales.count} miss the regexp: #{bad.join(', ')}."
end
>> check 'month_names', /^\w{3,9}$/
check 'month_names', /^\w{3,9}$/
59/80 miss the regexp: es-AR, th, es-VE, el, is, az, de, tr, he, cy, zh-CN, fr-CH, lv, ca, hi-IN, pl, et, zh-HK, fr, mk, hi, hr, sk, uk, lt, es-PE, ro, pt, fa, pt-BR, lo, ne, es-419, kn, gl, ar, hu, fi, cs, es, sw, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, wo, fr-CA, ko, vi, bn, de-CH, mn.
>> check 'abbr_month_names', /^\w{3,9}$/
43/80 miss the regexp: th, el, is, az, de, tr, he, zh-CN, fr-CH, lv, ca, uz, pl, et, zh-HK, fr, mk, hi, hr, sk, uk, fa, lo, ne, gl, ar, hu, fi, cs, zh-TW, ja, eo, bg, sr, or, de-AT, ru, fr-CA, ko, vi, bn, de-CH, mn.
>> check 'day_names', /^\w{3,9}$/
66/80 miss the regexp: es-AR, th, es-VE, el, sl, is, az, de, tr, he, cy, sv, zh-CN, id, lv, nb, uz, hi-IN, pl, et, zh-HK, mk, hi, hr, sk, uk, lt, es-PE, it, ro, nn, bs, pt, it-CH, fa, pt-BR, da, lo, ne, tl, es-419, kn, gl, ar, hu, fi, eu, cs, es, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, wo, ko, vi, bn, de-CH, mn.
>> check 'abbr_day_names', /^\w{3,9}$/
check 'abbr_day_names', /^\w{3,9}$/
61/80 miss the regexp: es-AR, th, es-VE, el, sl, is, az, de, tr, he, sv, zh-CN, rm, lv, ca, nb, uz, hi-IN, pl, et, zh-HK, mk, hi, hr, sk, uk, lt, es-PE, ro, nn, bs, pt, fa, pt-BR, da, lo, ne, es-419, kn, ar, hu, fi, cs, es, sw, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, ko, vi, bn, de-CH, mn.
So there are quite a lot of misses. So far these are the best results I've managed to get by making minor changes to the regexps:
>> check 'month_names', /^[[[:alnum:]]\.]{2,11}$/
11/80 miss the regexp: th, hi-IN, hi, ne, kn, sw, or, wo, vi, bn, mn.
>> check 'abbr_month_names', /^[[[:alnum:]]\.]{2,11}$/
8/80 miss the regexp: ca, hi, lo, ne, or, vi, bn, mn.
>> check 'day_names', /^[[[:alnum:]]\.]{1,14}$/
13/80 miss the regexp: th, az, cy, id, hi-IN, hi, uk, fa, ne, kn, or, vi, bn.
>> check 'abbr_day_names', /^[[[:alnum:]]\.]{1,14}$/
8/80 miss the regexp: hi-IN, hi, lo, ne, kn, or, vi, bn.
Much better already, though there are still some locales left, but they mostly involve multiword names, and this probably can't be fixed just by changing the format regexps. The way I see is to use the list of month/day names of the current locale at the stage of format processing.
Anyway, the pull request includes the changes in formats as specified above.
Now when this solution couldn't match the Thai February month name translation I decided to go even further and defined mmm like this:
months = I18n.available_locales.flat_map do |locale|
I18n.t('date.month_names', locale: locale) + I18n.t('date.abbr_month_names', locale: locale)
end
months.compact.uniq.map { |month| Regexp.escape(month) }.join('|')
and enabled case insensitive search. This way I'm guaranteed to match any possible month translation. But regexp is much bigger now and I don't know how it affects the performance. Anyway tell me if you find this solution suitable for the timeliness gem and I'll make a pull request.