timeshape
Initialize library filtering by Continent/Country
In timeshape today, there are two methods for initializing the library:
- with the data for the whole world
- with some bounding box only, to reduce memory usage
It would be interesting to have a third option: initializing with the data for a continent or a country, because the main scopes of use are political and follow these patterns:
- whole world (international usage)
- my country (USA, Brazil, etc...)
- my region (European Union, South America, Asia, etc...)
This division could be made according to IANA existing divisions.
I was thinking about something similar as well. These cases, while sounding somewhat similar, have quite different implementation complexity, so let's take them one by one:
- Initialize for a given continent/region, like Europe, Asia, etc. This could be done relatively easily, if we could use the prefix from the time zone id. E.g. given prefix Europe, time zones like Europe/Berlin or Europe/Stockholm will be included, but e.g. Asia/Tomsk will not.
- Initialize based on country. This is a slippery slope. Given the fact that there are a lot of disputed territories in the world (see e.g. https://github.com/RomanIakovlev/timeshape/issues/27), I'd rather avoid making any decision to which country any given timezone belongs. Besides that, it would require adding more dependencies to the Timeshape library, which I'd rather avoid, because I want to minimize usage of popular dependencies, and keep the library small and tidy.
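The prefix idea from the first case can be sketched in a few lines. This is only an illustration: ZonePrefixFilter and withRegionPrefix are hypothetical names, not part of Timeshape, and the helper works on plain id strings rather than on Timeshape's data.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Timeshape.
final class ZonePrefixFilter {
    private ZonePrefixFilter() {}

    // Keeps only the ids that start with "<region>/", so "Europe" keeps
    // "Europe/Berlin" and "Europe/Stockholm" but drops "Asia/Tomsk".
    static Set<String> withRegionPrefix(Set<String> candidates, String region) {
        String prefix = region + "/";
        return candidates.stream()
                .filter(id -> id.startsWith(prefix))
                .collect(Collectors.toCollection(TreeSet::new));
    }
}
```

The candidate ids could come from java.time.ZoneId.getAvailableZoneIds(), for example ZonePrefixFilter.withRegionPrefix(ZoneId.getAvailableZoneIds(), "Europe").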
But I think more ways to limit the loaded time zones would be a useful addition to the Timeshape, and therefore I have a couple of proposals. Firstly, we can introduce one more way to initialize, which would accept Set<java.time.ZoneId>. Only those time zones will be considered for search. The users of Timeshape will obtain the list of interesting time zones in whatever way they find appropriate, e.g. by using Time4j.
Secondly, we can implement an option which accepts a regular expression. Only time zones whose ids match this regex will be considered for search. This is a generalization of point 1 mentioned above.
Both of these options are relatively easy to implement. Do you think any of these will be useful?
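To make the two proposals concrete, here is a rough sketch of both filters expressed as predicates over time zone ids. ZoneSelection, fromZoneIds and fromRegex are hypothetical names used for illustration only; how such a predicate would actually be wired into Timeshape's initialization is left open.

```java
import java.time.ZoneId;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Timeshape.
final class ZoneSelection {
    private ZoneSelection() {}

    // Proposal 1: accept only the zones from an explicit set.
    static Predicate<String> fromZoneIds(Set<ZoneId> allowed) {
        Set<String> ids = allowed.stream()
                .map(ZoneId::getId)
                .collect(Collectors.toSet());
        return ids::contains;
    }

    // Proposal 2: accept only zones whose whole id matches the regex.
    // Example patterns:
    //   "Europe/.*"              every zone under Europe
    //   "(Europe|Africa)/.*"     two regions at once
    //   "America/Argentina/.*"   a nested group of zones
    static Predicate<String> fromRegex(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return id -> pattern.matcher(id).matches();
    }
}
```

Using matches() rather than find() means the regex has to cover the whole id, so a pattern like "Europe" alone would match nothing. That is a design choice of this sketch, not something decided in this thread.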
I am thinking that the above proposals are good enough and should cover the needs quite well.
- Regular expressions are powerful and flexible, and an excellent choice for this case. I'm not sure, however, whether all devs are comfortable using regexes. Maybe this will need more detailed documentation with some examples.
- Filtering by Set<java.time.ZoneId> should be equivalent to filtering by a Set<String> of time zone names. Not sure it's worth it...
- I agree with you that it's not worth adding more dependencies.
- It seems to me that it's better not to invest too much time in this feature and to redirect the effort to other things, like library evolution, ease of maintenance, etc...
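On the Set<java.time.ZoneId> versus Set<String> point, a small sketch of the conversion in both directions, using only java.time. ZoneIdConversions is a hypothetical name. The one practical difference is that ZoneId.of validates each name and throws for an unknown one, while a Set<String> accepts anything.

```java
import java.time.ZoneId;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Timeshape.
final class ZoneIdConversions {
    private ZoneIdConversions() {}

    // Throws java.time.DateTimeException if a name is not a valid zone id.
    static Set<ZoneId> toZoneIds(Set<String> names) {
        return names.stream().map(ZoneId::of).collect(Collectors.toSet());
    }

    static Set<String> toNames(Set<ZoneId> zones) {
        return zones.stream().map(ZoneId::getId).collect(Collectors.toSet());
    }
}
```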
Ok then, let's implement the regex option.
Instead of limiting, could you provide lazy loading by splitting into region-specific indexes? As in, if you know the bounding box for a continent, then you can determine which Index to delegate a query to. Then only the sub-indexes being used would be in memory, reducing the footprint for development and localized applications.
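A minimal sketch of what that delegation could look like, under heavy assumptions: RegionIndex is a hypothetical stand-in for a per-region index (it is not Timeshape's actual Index type), the bounding boxes are supplied by the caller, and each loader is whatever code would build the sub-index for that region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch, not part of Timeshape.
final class LazyRegionIndex {
    // Stand-in for a per-region time zone index.
    interface RegionIndex {
        Optional<String> query(double lat, double lon);
    }

    private static final class Region {
        final double minLat, maxLat, minLon, maxLon;
        final Supplier<RegionIndex> loader;
        RegionIndex loaded; // stays null until the region is first queried

        Region(double minLat, double maxLat, double minLon, double maxLon,
               Supplier<RegionIndex> loader) {
            this.minLat = minLat;
            this.maxLat = maxLat;
            this.minLon = minLon;
            this.maxLon = maxLon;
            this.loader = loader;
        }

        boolean contains(double lat, double lon) {
            return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
        }
    }

    private final List<Region> regions = new ArrayList<>();

    LazyRegionIndex add(double minLat, double maxLat, double minLon, double maxLon,
                        Supplier<RegionIndex> loader) {
        regions.add(new Region(minLat, maxLat, minLon, maxLon, loader));
        return this;
    }

    // Bounding boxes may overlap, so every matching region is tried in order;
    // a sub-index is built only the first time its box contains a queried point.
    synchronized Optional<String> query(double lat, double lon) {
        for (Region region : regions) {
            if (!region.contains(lat, lon)) {
                continue;
            }
            if (region.loaded == null) {
                region.loaded = region.loader.get();
            }
            Optional<String> hit = region.loaded.query(lat, lon);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }
}
```

The first query that lands in a region pays the loading cost for that region; later queries in the same region reuse the loaded sub-index, and regions that are never queried are never loaded.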
@ben-manes right, I've been thinking about implementing some sort of lazy loading as well. I'm not sure what's the best way to implement this, but I think it might be a useful addition to the Timeshape, since, although the lazy loading will always incur some sort of performance penalty, for some use cases it might be negligible. Lazy loading might even be preferable, if initialization time will be cut down. E.g. if you use some FaaS (e.g. AWS Lambda), where you only need to query one position (of unpredictable origin), you shouldn't be paying the price of full initialization.
I guess, I'll just open another issue to keep track of and discuss the lazy loading.
What would be the impact on size/performance when the shapes are simplified? Are there any numbers on this?
Closing due to lack of activity. If there's still interest in working on this, anyone should feel free to pick up this work, and I'll provide support with code reviews and releases.