mimirsbrunn
Support ElasticSearch 5
- [ ] The geo_point mapping no longer supports the "geohash_*" and "lat_lon" options.
- [ ] The scan search type has been removed and has to be replaced by a scroll. rs-es doesn't support it yet.
- [ ] Queries have to be updated. geocoder-tester reports a lot more failures with ES 5, which might be caused by changes in the scoring algorithm.
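For the scan replacement, the scroll loop could be sketched as follows. This is only an illustration under stated assumptions: the `ScrollClient` trait, `Page`, and `FakeCluster` are hypothetical names that exist in neither rs-es nor elastic-rs, and the fake cluster merely stands in for the two HTTP calls named in the comments.

```rust
/// One page of scroll results (hypothetical type, not from any crate).
struct Page {
    scroll_id: String,
    hits: Vec<String>,
}

/// Hypothetical client interface covering the two scroll calls.
trait ScrollClient {
    /// Opens a scroll context (ES: `POST /<index>/_search?scroll=1m`).
    fn start_scroll(&mut self, index: &str, page_size: usize) -> Page;
    /// Fetches the next page (ES: `POST /_search/scroll` with the scroll id).
    fn next_page(&mut self, scroll_id: &str) -> Page;
}

/// Drains an index page by page; an empty page signals the end of the scroll.
fn scroll_all<C: ScrollClient>(client: &mut C, index: &str, page_size: usize) -> Vec<String> {
    let mut docs = Vec::new();
    let mut page = client.start_scroll(index, page_size);
    while !page.hits.is_empty() {
        docs.append(&mut page.hits);
        page = client.next_page(&page.scroll_id);
    }
    docs
}

/// In-memory stand-in for a cluster, only here to make the sketch runnable.
struct FakeCluster {
    docs: Vec<String>,
    cursor: usize,
    page_size: usize,
}

impl FakeCluster {
    fn take_page(&mut self) -> Page {
        let end = (self.cursor + self.page_size).min(self.docs.len());
        let hits = self.docs[self.cursor..end].to_vec();
        self.cursor = end;
        Page { scroll_id: "fake-scroll-id".to_string(), hits }
    }
}

impl ScrollClient for FakeCluster {
    fn start_scroll(&mut self, _index: &str, page_size: usize) -> Page {
        self.cursor = 0;
        self.page_size = page_size;
        self.take_page()
    }

    fn next_page(&mut self, _scroll_id: &str) -> Page {
        self.take_page()
    }
}
```

Only `scroll_all` would survive in real code. Whichever client is chosen would need to expose the two calls behind `ScrollClient`.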
FYI, ES 2.4 end of life is end of February :cry:
https://www.elastic.co/support/eol
I created an issue in rs-es for the migration since I don't really know how to proceed on the changes while maintaining the ES2 compatibility.
Another possibility would be to migrate from rs-es to elastic-rs. This crate is already ES5 compatible (and its authors are thinking about ES6).
The crate is quite different from rs-es in that it provides strongly typed responses (like rs-es) but weakly typed queries.
This means that queries cannot be checked at compile time, but they are also easier to write (copy/paste an existing curl query, with no additional DSL to learn). It also seems easier for compatibility with future ES versions, since the client cares less about the content of the query.
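To illustrate what a weakly typed query means in practice, here is a minimal sketch that builds a `geo_distance` filter as plain JSON text, much as one would paste it from a curl command. The function name and the `coord` field name are assumptions made for the example, and it uses only the standard library; real code would more likely build the body with something like `serde_json`'s `json!` macro, which also takes care of escaping.

```rust
/// Builds an Elasticsearch `geo_distance` filter as plain JSON text.
/// `coord` is a hypothetical field name; `distance` is inserted as-is,
/// so it must not contain characters that need JSON escaping.
fn geo_distance_query(lat: f64, lon: f64, distance: &str) -> String {
    // Doubled braces are literal braces in `format!`.
    format!(
        r#"{{
  "query": {{
    "bool": {{
      "filter": {{
        "geo_distance": {{
          "distance": "{distance}",
          "coord": {{ "lat": {lat}, "lon": {lon} }}
        }}
      }}
    }}
  }}
}}"#,
        distance = distance,
        lat = lat,
        lon = lon
    )
}
```

Nothing here is checked by the compiler beyond the argument types, which is the trade-off described above: a typo in a key only shows up when Elasticsearch rejects the request, but any query the server accepts can be expressed.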
Another difference is that the strongly typed documents need to be adapted to derive an ElasticType (and not only serde like rs-es). This makes it possible to auto-generate the mapping from the document (this is nice since the mapping is always synced, but I don't know if everything we use in the mapping is available) (seen here).
The crate seems maintained (well, by only 2 main contributors, but it's Rust :sweat_smile:), and the crates.io stats are not bad compared to rs-es.
As a (nice) bonus this crate can be used asynchronously :tada:
The BIG downside is that the migration will be quite painful, and I don't know whether it will work out-of-the-box (or even be possible at all) on ES2.
Do you think it's worth exploring this possibility?
IMHO that's an interesting possibility.
WRT migrating to elastic-rs: that is a large undertaking, since the rs-es API is used all over the place with lots of hardcoding and no abstraction at all.
I still think this is by far the best option, but it will require a large amount of work. It will also touch nearly all files, libs and functions, so this needs to be coordinated.
Instead of elastic-rs, the official Rust client by Elastic can be considered as well.
I see four possibilities:
1. Create a branch and accept PRs that move pieces of code from rs_es to elastic-rs. Once all code is migrated, this branch can be merged into master.
2. Allow both to exist in one project. New or touched code should slowly move from rs_es to elastic-rs.
3. Refactor all current rs_es usage into an abstraction (I'd suggest the adapter pattern) and, once done, start with 1. or 2.
4. Halt all new features, bugfixes and changes until the migration is finished.
Option 1. has the downside that a lot of work has to be kept in sync, probably causing a lot of rebasing and conflict resolution.
Option 2. has the downside that for a while, it becomes (even more) messy, with a risk of never really finishing in some hardly used or dusty corners of the package. It also requires the new library to support the end-of-life ES2 version.
Option 3. requires a lot of up-front work, without seeing any improvements. After that, all the hard work to migrate still needs to be done, but will be slightly easier.
Option 4. has the downside that every contribution will be outdated and incompatible by the time those few working on the large refactoring are done.
I'd prefer Option 2., since that allows new features and bugfixes to migrate small pieces over and reap the benefit immediately.
Personally, I'm not happy with how rs-es works. It promises strong typing and compile-time error checking, but the geo features in particular are often simply not working (if they ever worked at all). E.g. bounding-box filters and distance sorting create the wrong JSON, causing panics because ES cannot parse it. Yet, being strongly typed, it makes it impossible to craft your own, correct JSON to work around such bugs.
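To make the adapter idea from option 3 a bit more concrete, here is a minimal sketch. Every name in it (`SearchBackend`, `RsEsAdapter`, `ElasticAdapter`, `import_stops`) is hypothetical, and both adapters store documents in memory instead of calling a real client, so it only shows the shape of the abstraction, not an actual integration with rs-es or elastic-rs.

```rust
use std::collections::HashMap;

/// Hypothetical interface the rest of the code base would depend on,
/// instead of calling a client crate directly.
trait SearchBackend {
    fn index_document(&mut self, index: &str, id: &str, json: &str) -> Result<(), String>;
    fn count(&self, index: &str) -> usize;
}

/// Stand-in for an adapter wrapping rs-es (kept for ES2).
#[derive(Default)]
struct RsEsAdapter {
    store: HashMap<(String, String), String>,
}

/// Stand-in for an adapter wrapping the new client.
#[derive(Default)]
struct ElasticAdapter {
    store: HashMap<(String, String), String>,
}

impl SearchBackend for RsEsAdapter {
    fn index_document(&mut self, index: &str, id: &str, json: &str) -> Result<(), String> {
        // A real adapter would translate this call into rs-es operations.
        self.store.insert((index.to_string(), id.to_string()), json.to_string());
        Ok(())
    }

    fn count(&self, index: &str) -> usize {
        self.store.keys().filter(|k| k.0 == index).count()
    }
}

impl SearchBackend for ElasticAdapter {
    fn index_document(&mut self, index: &str, id: &str, json: &str) -> Result<(), String> {
        // A real adapter would send the raw JSON through the new client.
        self.store.insert((index.to_string(), id.to_string()), json.to_string());
        Ok(())
    }

    fn count(&self, index: &str) -> usize {
        self.store.keys().filter(|k| k.0 == index).count()
    }
}

/// Caller code only sees the trait, so the backend can be swapped freely.
fn import_stops<B: SearchBackend>(backend: &mut B, stops: &[(&str, &str)]) -> Result<usize, String> {
    for (id, json) in stops {
        backend.index_document("stops", id, json)?;
    }
    Ok(backend.count("stops"))
}
```

The up-front cost mentioned for option 3 is moving every existing rs-es call site behind such a trait. Once that is done, switching or mixing backends becomes a local change.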
Thanks for updating this issue. The decision will probably depend mostly on what the plans of the CanalTP / Kisio team are for the coming months. In particular, can we consider dropping support for Elasticsearch 2 completely in a near-future release, or must new releases support both ES2 and ES7?
My two cents about this:
- Building an abstraction to support both ES2 and ES7 could be challenging because of subtle differences in their indexing behavior, and may not be worth the effort if ES2 support ends up being dropped soon.
- I agree Option 2. seems to be the best option for a graceful migration. But neither the `elastic` nor the `elasticsearch` crate supports Elasticsearch 2, so this option would still require keeping `rs-es` for as long as we want to support ES2.
- Development of mimirsbrunn has not been very active recently, as far as I can see. If there are not too many changes ongoing, Option 1. could be reasonable.