BlackLab

Linguistic search for large annotated text corpora, based on Apache Lucene

Results 103 BlackLab issues

Bumps [browserify-sign](https://github.com/crypto-browserify/browserify-sign) from 4.2.1 to 4.2.2. Changelog Sourced from browserify-sign's changelog. v4.2.2 - 2023-10-25 Fixed [Tests] log when openssl doesn't support cipher [#37](https://github.com/crypto-browserify/browserify-sign/issues/37) Commits Only apps should have lockfiles 09a8995...

dependencies
javascript

Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.21.2 to 7.23.2. Release notes Sourced from @babel/traverse's releases. v7.23.2 (2023-10-11) NOTE: This release also re-publishes @babel/core, even if it does not appear in the linked release...

dependencies
javascript

In chn-intern, running the TermSerialization tool finds terms that don't correctly "round-trip" (i.e. get the id for the term, then get the term for that id again), although not too...

bug
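The round-trip check described in this issue can be sketched roughly as follows. This is not BlackLab's TermSerialization tool; `find_round_trip_failures` and `ToyTermIndex` are hypothetical names made up for illustration, and the case-folding collision is only one assumed way a round trip could fail (the issue text is truncated and does not state the actual cause).

```python
from typing import Callable, Dict, Iterable, List, Tuple


def find_round_trip_failures(
    terms: Iterable[str],
    term_to_id: Callable[[str], int],
    id_to_term: Callable[[int], str],
) -> List[Tuple[str, int, str]]:
    """Return (term, id, recovered_term) for every term that does not round-trip."""
    failures = []
    for term in terms:
        term_id = term_to_id(term)        # term -> id
        recovered = id_to_term(term_id)   # id -> term again
        if recovered != term:
            failures.append((term, term_id, recovered))
    return failures


class ToyTermIndex:
    """Hypothetical term dictionary with a deliberately lossy lookup key."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._terms: List[str] = []

    def term_to_id(self, term: str) -> int:
        key = term.lower()  # assumption: lookup folds case, so "Dog" and "dog" collide
        if key not in self._ids:
            self._ids[key] = len(self._terms)
            self._terms.append(term)
        return self._ids[key]

    def id_to_term(self, term_id: int) -> str:
        return self._terms[term_id]
```

With this toy index, checking `["Dog", "cat", "dog"]` reports only `"dog"`, because its id resolves back to the first-seen spelling `"Dog"`.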

See e.g. https://lucene.apache.org/core/8_7_0/core/org/apache/lucene/util/automaton/RegExp.html#COMPLEMENT :+1:
> The reserved characters used in the (enabled) syntax must be escaped with backslash (\) or double-quotes ("..."). (In contrast to other regexp syntaxes, this is...

bug
refactor
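As a rough illustration of the escaping rule quoted above, the sketch below backslash-escapes a literal string before it is embedded in a Lucene-style regular expression. The helper name `escape_lucene_regexp` is hypothetical, and the reserved set is an assumption based on a reading of the Lucene `RegExp` syntax with all optional features enabled; it should be checked against the Lucene version actually in use.

```python
# Assumed reserved characters: the core operators plus the optional-syntax
# characters (# @ & < > ~). Verify against the RegExp docs for your Lucene version.
LUCENE_REGEXP_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def escape_lucene_regexp(literal: str) -> str:
    """Prefix every assumed-reserved character with a backslash."""
    return "".join(
        "\\" + ch if ch in LUCENE_REGEXP_RESERVED else ch
        for ch in literal
    )
```

Escaping every character in the set is the conservative choice: a character such as `~` is only special when the corresponding optional syntax is enabled, but an escaped `~` stays a literal either way.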

If you try to access an index that BlackLab cannot read because it was indexed with an older version, it crashes instead of returning a clean error response. (e.g. try...

bug

The experimental Solr module doesn't yet have support for creating/deleting private user indexes and formats and adding documents. This functionality should be added eventually.

proxy
solr

For supporting custom hit-level annotations added by individual users (stored in a separate database), as well as parallel corpora, it would be very useful to be able to give a...

enhancement

BlackLab (and CQL) don't currently support ordinary "near searches", e.g. "find _dog_, _cat_ and _hamster_ within 20 words of each other". Lucene does support these kinds of searches though, even...

enhancement
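To make the idea of a "near search" concrete, here is a small self-contained sketch over a plain token list. It is neither BlackLab's nor Lucene's implementation; the function name `near_matches` is hypothetical, and it assumes that "within N words of each other" means the distance between the first and last matched token is at most N, with exact token matching.

```python
from typing import Dict, Iterable, List, Tuple


def near_matches(tokens: List[str], terms: Iterable[str], max_span: int) -> List[Tuple[int, int]]:
    """Return (start, end) token positions of minimal windows that contain
    every term and whose width (end - start) is at most max_span."""
    wanted = set(terms)
    # Positions of all tokens that are one of the wanted terms, in text order.
    hits = [(i, tok) for i, tok in enumerate(tokens) if tok in wanted]
    results: List[Tuple[int, int]] = []
    counts: Dict[str, int] = {}
    left = 0
    for pos, term in hits:
        counts[term] = counts.get(term, 0) + 1
        if len(counts) < len(wanted):
            continue  # not all terms seen in the current window yet
        # Drop surplus occurrences on the left so the window is minimal.
        while counts[hits[left][1]] > 1:
            counts[hits[left][1]] -= 1
            left += 1
        start = hits[left][0]
        if pos - start <= max_span:
            results.append((start, pos))
        # Remove the leftmost term so the next window found is a different one.
        counts.pop(hits[left][1])
        left += 1
    return results
```

For example, in the tokens of "the dog chased a cat while the hamster slept", the terms dog, cat and hamster span positions 1 to 7, so they match with `max_span=20` but not with `max_span=5`.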

It seems clients cannot reliably reconstruct the punct from XML responses, as it's joined with the indentation. https://github.com/INL/BlackLab/blob/72194b794e03e87c10d406b0e3e37ba8373a6aa3/server/src/main/java/nl/inl/blacklab/server/datastream/DataStreamXml.java#L232-L235

webservice

It would be interesting to see how easy (or not) it would be to implement scoring on hits, including things like term boosting, norms, etc. We've pretty much ignored this...

enhancement