Marc Ubaldino
Could be language specific, genre specific, etc. I don't think the tagger can deal with this, unless you are willing to take as an input the set of sentence spans....
FYI -- Evan, David, greetings. http://opensextant.org/downloads.html -- The "Merged.txt" file likely has a new name, but it is there under "OpenSextant Gazetteer data", the latestGazetteer.zip file.
Hi Evan Smith, we can talk at my other email address [email protected]; Please include David, I don't have his contact info. Hm.... We're in Bedford, next door. Would like to...
Exactly! I've been struggling with this one as well! My experience: Docker invocation for `pelias import` works for `all` and all other routines, EXCEPT `wof`. I tried `bash -x pelias...
Add "text_norm" to the indexer to review common false positives that still appear.
Addressed in part by NonSenseFilter -- removing lowercase matches.
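For illustration only, here is a rough sketch of what "removing lowercase matches" could look like. This is not the NonSenseFilter code; the `drop_lowercase_matches` name and the match dictionaries with a `text` key are assumptions made for the example.

```python
def drop_lowercase_matches(matches):
    """Keep only matches whose text is not entirely lowercase.

    Hypothetical helper: each match is assumed to be a dict with a
    "text" key holding the matched span from the document.
    """
    kept = []
    for match in matches:
        text = match["text"]
        # str.islower() is True only when every cased character is lowercase,
        # so "bedford" is dropped while "Bedford" and "USA" are kept.
        if not text.islower():
            kept.append(match)
    return kept


candidates = [{"text": "bedford"}, {"text": "Bedford"}, {"text": "USA"}]
filtered = drop_lowercase_matches(candidates)
```

The idea is that a lowercase span matching a place name is more likely an ordinary word than a location mention, so it is discarded before further tagging.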
Seems more like gazetteer ETL fixes than a pattern generalization. If such trivial gazetteer entries should never be tagged, then we mark them `search_only=1`.
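A minimal sketch of that ETL step, assuming gazetteer rows are plain dicts with a `name` field. The `mark_search_only` function and the rule for what counts as "trivial" (very short or purely numeric names) are hypothetical, not the actual gazetteer pipeline; only the `search_only=1` flag comes from the comment above.

```python
def mark_search_only(entries, min_length=3):
    """Return copies of gazetteer rows with search_only=1 set on trivial names.

    Assumed rule for "trivial": the name is shorter than min_length
    characters or consists only of digits.
    """
    marked = []
    for entry in entries:
        row = dict(entry)  # copy so the input rows are not modified
        name = row.get("name", "").strip()
        trivial = len(name) < min_length or name.isdigit()
        # Preserve an existing search_only value on non-trivial rows.
        row["search_only"] = 1 if trivial else row.get("search_only", 0)
        marked.append(row)
    return marked


rows = [{"name": "To"}, {"name": "Bedford"}, {"name": "2020"}]
flagged = mark_search_only(rows)
```

Entries flagged this way stay in the index for lookup but would be skipped by the tagger, which keeps the fix in the data rather than in a tagging pattern.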
Proof of concept:

Download: jython.org
Learn: https://wiki.python.org/jython/LearningJython

Configure: after Jython install, run from the Xponents build:

```
export JAVA_OPTIONS="-Xmx2g -Xms2g -Dopensextant.solr=./solr/solr7"
export CLASSPATH=./dist/Xponents-3.2/etc:./dist/Xponents-3.2/lib/*
```

Run:

* `jython27/bin/jython`

Example integration: ```...
Needs review with upcoming changes for Solr core setup in 4.6.1... agreed -- needs better configuration.
JVM property `opensextant.solr` is used to point to the `solr_home`, which should only be one per JVM. Use of these extraction libraries with a remote Solr server via URL instead...