Jan Niestadt

154 comments by Jan Niestadt

BTW I had a look at CWB and Sketch Engine to see what the most compatible syntax would be. Sketch Engine has a [`meet`](https://www.sketchengine.eu/documentation/cql-meet-union/) function that does something similar, e.g.:...

(More or less) "pluggable" extension functions have been implemented in the feature/relations branch, so this should probably be done there as well. We need to add support for `list()` to...

Doesn't the fact that globalOccurrences is a ConcurrentHashMap mean this is actually safe? Not 100% sure, so using Atomic* is probably a good idea. Hopefully this doesn't slow things down...

But the [documentation](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#compute-K-java.util.function.BiFunction-) for `compute()` states "The entire method invocation is performed atomically.", and no other operations can occur at this time, right? (`docIds.parallelStream()` only calls `compute()` on `globalOccurrences`). Again,...
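To make the question concrete, here is a minimal sketch of the pattern being discussed. It is not the project's actual code: the class name, the method `countOccurrences`, and the "one occurrence per term per document" workload are all hypothetical stand-ins. It only illustrates that concurrent `compute()` calls on the same key of a `ConcurrentHashMap` do not lose updates when the stored value is immutable (here an `Integer`).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ComputeAtomicityDemo {

    // Hypothetical workload: every document contributes exactly one
    // occurrence of every term. Real per-document counting would go here.
    static Map<String, Integer> countOccurrences(List<Integer> docIds, List<String> terms) {
        ConcurrentHashMap<String, Integer> globalOccurrences = new ConcurrentHashMap<>();
        docIds.parallelStream().forEach(docId -> {
            for (String term : terms) {
                // The read-modify-write happens inside compute(), which the
                // ConcurrentHashMap Javadoc describes as atomic per invocation.
                globalOccurrences.compute(term, (key, old) -> old == null ? 1 : old + 1);
            }
        });
        return globalOccurrences;
    }

    public static void main(String[] args) {
        List<Integer> docIds = IntStream.range(0, 100_000).boxed().collect(Collectors.toList());
        Map<String, Integer> counts = countOccurrences(docIds, Arrays.asList("cow", "hare"));
        // Each count should equal docIds.size() if no update was lost.
        System.out.println(counts);
    }
}
```

One caveat under the same assumptions: the guarantee covers only what happens inside the remapping function. If the map held a mutable value that was also modified outside `compute()`, that value would need its own synchronization, which is where `Atomic*` types could still be useful.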

I understand your confusion; because JSON objects are unordered, the results aren't sorted in any logical way (alphabetical or by frequency). An alternative to using `termfreq` is to find all...
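If an ordered view is needed on the client side, one option is to sort the entries after parsing the response. The sketch below assumes the term frequencies have already been parsed into a `Map<String, Long>`; the class and method names are hypothetical, and the actual response format is not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TermFreqSortDemo {

    // Returns the entries ordered by descending frequency,
    // with ties broken alphabetically by term.
    static List<Map.Entry<String, Long>> sortByFrequency(Map<String, Long> termFreq) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(termFreq.entrySet());
        entries.sort((a, b) -> {
            int byFreq = Long.compare(b.getValue(), a.getValue());
            return byFreq != 0 ? byFreq : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Made-up frequencies, purely for illustration.
        Map<String, Long> termFreq = new HashMap<>();
        termFreq.put("hare", 3L);
        termFreq.put("cow", 7L);
        termFreq.put("ant", 3L);
        for (Map.Entry<String, Long> e : sortByFrequency(termFreq)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```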

Multiple values are now supported, see #393 and #394. Using processing steps on annotations or `standoffAnnotations` produces an error. Those can likely be done in XPath 3, and therefore wouldn't...

Testing on the current dev version, the only problem left appeared to be that regular term (range) queries were produced for all field types, even numeric ones. I've now subclassed...

Essentially the same problem, but with a different type of query: `containing A:("cow"|"hare")` or `containing (A:"cow"|B:"hare")`. This query will return hits capturing either **cow** or **hare**. But a sentence containing...

We're now integrating capture groups (and other match info) into regular hits, which will allow us to be stricter about uniqueness. This should solve this issue. See branch experiment/unify-captures-relations (or...