query-translator icon indicating copy to clipboard operation
query-translator copied to clipboard

WIP: Add ranges support

Open thePanz opened this issue 6 years ago • 16 comments

  • [x] Support for [a TO b] ranges
  • [x] Support for {a TO b} ranges
  • [x] Support for asymmetric ranges (such as [a TO b})
  • [x] Support for ranges with dates xxxx-xx-xx format
  • [x] Support for * in ranges, example: [123 TO *]
  • [ ] Support for quoted string in ranges, example: ["2008-07-28T14:47:31Z" TO NOW]

thePanz avatar Apr 04 '18 18:04 thePanz

Codecov Report

Merging #14 into master will not change coverage. The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##             master    #14   +/-   ##
=======================================
  Coverage       100%   100%           
- Complexity      260    279   +19     
=======================================
  Files            47     49    +2     
  Lines           689    743   +54     
=======================================
+ Hits            689    743   +54
Flag Coverage Δ Complexity Δ
#all 100% <100%> (ø) 279 <17> (+19) :arrow_up:
Impacted Files Coverage Δ Complexity Δ
lib/Languages/Galach/Tokenizer.php 100% <ø> (ø) 3 <0> (ø) :arrow_down:
lib/Languages/Galach/TokenExtractor/Full.php 100% <100%> (ø) 10 <2> (+4) :arrow_up:
lib/Languages/Galach/Values/Token/Range.php 100% <100%> (ø) 3 <3> (?)
lib/Languages/Galach/Generators/Native/Range.php 100% <100%> (ø) 12 <12> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update a06fe70...e9ca9e8. Read the comment docs.

codecov-io avatar Apr 04 '18 18:04 codecov-io

This looks great :) Do you plan to go for the exclusive ranges here as well?

pspanja avatar Apr 05 '18 07:04 pspanja

@pspanja the exclusive should be "easy" to achive, but there is something not clear to me in the regexp you're using in the phrase token: What's the (?<!\\\\) needed for? should it be used in the range too?

And for now, the "208-04-05" format in the range is not supported yet

thePanz avatar Apr 05 '18 08:04 thePanz

@pspanja inclusive/exclusive range added, I'll add later the support for the date ranges

thePanz avatar Apr 05 '18 08:04 thePanz

What's the (?<!\\\\) needed for? should it be used in the range too?

It's negative lookbehind, see here: http://www.rexegg.com/regex-lookarounds.html Basically it forbids preceding backslash, useful in phrase matching to allow everything until unescaped quote.

About left and right value in range matching - I think we should somehow reuse the pattern for matching a word. Will take a closer look in the evening.

pspanja avatar Apr 05 '18 12:04 pspanja

Thanks for the explanation @pspanja! Should the range implementation be split into inclusive/exclusive (as this patch) and an additional MR for the other cases?

thePanz avatar Apr 05 '18 18:04 thePanz

Should the range implementation be split into inclusive/exclusive (as this patch) and an additional MR for the other cases?

It can be split, I'm OK with that.

Thinking about the implementation a bit, handling both two-sided and one-sided ranges seems like too much both for the regex and token, so should probably be handled with a separate pattern and a separate token.

pspanja avatar Apr 05 '18 21:04 pspanja

Hi guys, thank you for this cool PR! @pspanja Can it be merged?

TomasPilar avatar Sep 08 '18 14:09 TomasPilar

Hi @TomasPilar, I'll try to find time in the next few days to see how to push this further.

pspanja avatar Oct 09 '18 17:10 pspanja

FYI: this MR has been used in a pre-production system from April 6 :)

thePanz avatar Oct 09 '18 20:10 thePanz

This looks like a very useful extension — any chance it will be merged/released in the near future?

j13k avatar Jun 18 '19 06:06 j13k

Hi @j13k, I've neglected this because I was too busy, but I'll find time in the next weeks to take care of it. It's obviously needed :)

pspanja avatar Jun 19 '19 09:06 pspanja

Hi @j13k, I've neglected this because I was to busy, but I'll find time in the next weeks to take care of it. It's obviously needed :)

That's fine @pspanja, I know what it's like! I'm struggling to find time for an outstanding PR on a small project that I maintain. ;)

Just so you know, it's a very useful library and really well-executed. I've implemented a set of generator classes that convert queries to Elasticsearch DSL.

j13k avatar Jun 20 '19 02:06 j13k

I agree with @j13k , this library was very useful for me to:

  1. validate the user-query (parsing the query from a string and building the corresponding tree)
  2. re-write the query-tree with the specific field names in SOLR (the user doesn't know the real fields the query will be executed on, like the specific SOLR pre- or sub- fixes)

thePanz avatar Jun 20 '19 07:06 thePanz

@thePanz @j13k feel free to PR whatever you think is generally useful :)

ilukac avatar Jun 21 '19 09:06 ilukac

Any plan to merge this PR? Thank you.

drigolin avatar Apr 22 '22 13:04 drigolin