Lars Marius Garshol comments

Results: 290 comments by Lars Marius Garshol

Upgrade to Lucene 4.8

_From [[email protected]](https://code.google.com/u/106380900043315593284/) on March 16, 2013 04:40:01_ **Blocking:** -duke:26

Upgrade to Lucene 4.8

This is becoming more urgent as new Lucene versions are released. Need to take a serious new look at this.

Genetic algorithm doesn't close JDBC connections

That's odd. I use it with the JDBC data source all the time, with no problem. Could you post your configuration, so we can see if there's something unusual there?

Let genetic algorithm use custom comparators

The trouble with implementing this one is that the config loading is so generic we really have no idea what's declared as s. It may be that reading the config...

Add a longest common subsequence comparator

_From [[email protected]](https://code.google.com/u/106380900043315593284/) on January 25, 2013 05:41:48_ Algorithm reference: http://www.algorithmist.com/index.php/Longest_Common_Subsequence

Add a longest common subsequence comparator

_From [[email protected]](https://code.google.com/u/106380900043315593284/) on October 23, 2013 10:15:45_ Longest common subSTRING has been implemented, but longest common subsequence is actually a different comparator, so that's still not done.
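To make the substring/subsequence distinction concrete, here is a minimal sketch in Java. It is not Duke's implementation: the class name `LcsSketch`, the method names, and the choice to normalize by the longer string's length are all assumptions made for illustration. A common subsequence keeps characters in order but allows gaps, whereas a common substring must be one contiguous run.

```java
// Illustrative sketch only. LcsSketch and its methods are hypothetical names,
// not part of Duke's API.
final class LcsSketch {

    // Length of the longest common subsequence: matching characters must occur
    // in the same order in both strings, but need not be adjacent.
    static int subsequenceLength(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.length()][b.length()];
    }

    // Length of the longest common substring: the match must be contiguous,
    // so a mismatch resets the running length to zero.
    static int substringLength(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        int best = 0;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                    best = Math.max(best, dp[i][j]);
                }
            }
        }
        return best;
    }

    // Similarity in [0, 1]. Dividing by the longer string's length is an
    // assumed normalization; other choices are equally plausible.
    static double similarity(String a, String b) {
        int longer = Math.max(a.length(), b.length());
        if (longer == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return (double) subsequenceLength(a, b) / longer;
    }
}
```

For example, `"abcde"` and `"axcye"` share the subsequence `"ace"` (length 3), but their longest common substring is only a single character, which is why the two comparators can score the same pair very differently.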

Support for deduplicating data directly from Lucene/Solr/ElasticSearch

_From [[email protected]](https://code.google.com/u/116193730723037190676/) on September 25, 2013 05:37:20_ It would be a great feature; the deduplication functionality could be integrated with Apache Solr and work as yet another REST API.

Support for deduplicating data directly from Lucene/Solr/ElasticSearch

_From [[email protected]](https://code.google.com/u/106380900043315593284/) on September 25, 2013 11:22:35_ For ElasticSearch there is actually a module for this: https://github.com/YannBrrd/elasticsearch-entity-resolution . It might be an idea to make something similar for Solr. Or one...

Try alternative evolutionary algorithms

Doing this, but unfortunately it turns out that we need _many_ experiments before we can conclude anything with any certainty.

Few questions about Duke.

Hi there, 1) I could put up some benchmarks, but IMHO they would be useless. Accuracy varies with the data available and the amount of noise in the data. 2)...