Duke icon indicating copy to clipboard operation
Duke copied to clipboard

Extract generic Estimator interface instead of a specific of WeightEstimator nested in WeightLevenstein comparator

Open YannBrrd opened this issue 8 years ago • 5 comments

Hi,

What do you think of extracting a generic Estimator interface so that we could handle this just as other objects (parsing, instanciating, ...) ? Having it nested in WeightLevenstein comparator makes it a specialized code where a generic approach would look cleaner (WeightEstimator as an implementation of public Estilatior interface, to feed WeightLevenstein comparator).

What do you think ?

Cheers, Yann

YannBrrd avatar May 01 '16 06:05 YannBrrd

For sure I think thats a great idea. This was very much a hack-n-slash so we could start using WeightedLevenshtein right away.

I am also working on a Levenshtein-Damarau estimator for Duke which I will also need to use later on as an estimator so cleanup is probably a good idea.

ghost avatar May 02 '16 16:05 ghost

Hi Yann,

What does a generic estimator do? I mean, what's the interface and how is it used?

Is this about coming up with a more general concept, or just moving the estimator out of WeightedLevenshtein?

@AbsoluteResults Are you thinking you want to be able to implement a WeightedLevenshteinDamerau? If so, we can move the WeightEstimator out, and allow you to extend it for WLD.

larsga avatar May 02 '16 17:05 larsga

It's about moving it out and providing a way to instantiate it like comparator and cleaners, plus setting parameters.

Le lun. 2 mai 2016 19:08, Lars Marius Garshol [email protected] a écrit :

Hi Yann,

What does a generic estimator do? I mean, what's the interface and how is it used?

Is this about coming up with a more general concept, or just moving the estimator out of WeightedLevenshtein?

@AbsoluteResults https://github.com/AbsoluteResults Are you thinking you want to be able to implement a WeightedLevenshteinDamerau? If so, we can move the WeightEstimator out, and allow you to extend it for WLD.

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/larsga/Duke/issues/225#issuecomment-216294656

Cordialement, Yann Barraud

YannBrrd avatar May 02 '16 21:05 YannBrrd

Well, instantiating and configuring it in the config is already easy:

  <object class="no.priv.garshol.duke.comparators.WeightedLevenshtein$DefaultWeightEstimator"
          name="estimator">
    <param name="digit-weight" value="10.0"/>
  </object>
  <object class="no.priv.garshol.duke.comparators.WeightedLevenshtein"
          name="address-comp">
    <param name="estimator" value="estimator"/>
  </object>

So there's no reason to move it out just for that.

larsga avatar May 03 '16 05:05 larsga

Sorry everyone this email got caught in Clutter....

Yes I would like to implement WeightedLevenshteinDamerau as well as another weighted function for QWERTY keyboard distances, both to be eventually used in the @YannBrrd project in Elasticsearch.

I don't have a timeline of when I can start coding it though so don't abstract it out as a high priority on our behalf. If its on the schedule or a quick update I'll take advantage of it of course, but I might be able to stick it in the Duke code without it.

Thanks for the suggestions!

ghost avatar May 31 '16 22:05 ghost