Benjamin Geer

Results 225 comments of Benjamin Geer

To make the import faster, I added a config option to prevent Knora from verifying the data after it's written.

I did some tests with the books split into 1000-word fragments, and it doesn't make `knora-api:matchInStandoff` any faster, because if you search for a common word, the query still has...
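The fragmenting step in that test can be pictured roughly as follows. This is only a sketch in Python, not the code actually used with Knora: the function name `split_into_fragments` is hypothetical, and plain whitespace tokenisation is an assumption.

```python
def split_into_fragments(text, words_per_fragment=1000):
    """Split text into consecutive fragments of at most words_per_fragment words.

    Assumption: words are separated by whitespace. The last fragment may be shorter.
    """
    words = text.split()
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(0, len(words), words_per_fragment)
    ]


# Small fragment size so the behaviour is easy to see.
fragments = split_into_fragments("a b c d e", words_per_fragment=2)
print(fragments)
```

Smaller fragments shrink each indexed string, but a common word still matches in many fragments, so the number of matches to process does not go down.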

It looks like the GraphDB Lucene connector can't do this. It just indexes whatever strings you give it, but in this case we would want to index substrings at specific...

A drawback of XQuery is that I don't see any way to search within overlapping hierarchies. For example, given this document from one of our tests:

```xml
Scorn not the...
```

I think I found an eXist-db function for this: http://exist-db.org/exist/apps/fundocs/view.html?uri=http://exist-db.org/xquery/util&location=java:org.exist.xquery.functions.util.UtilModule&details=true#get-fragment-between.4

That function seems to be buggy: https://github.com/eXist-db/exist/issues/2316

More implementations here: https://wiki.tei-c.org/index.php/Milestone-chunk.xquery

In any case, I would expect this to be very slow, because you can't make a Lucene index for the content between two milestones, only for the content of an...
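To illustrate why the content between two milestones is awkward to index, here is a minimal sketch in Python (not XQuery, and not the eXist-db function mentioned above). It assumes empty `milestone` elements identified by an `id` attribute; both names, and the helper `text_between`, are hypothetical. The text between the milestones does not belong to any single element, so it has to be gathered by walking the document in order.

```python
import xml.etree.ElementTree as ET


def text_between(root, start_id, end_id, tag="milestone", attr="id"):
    """Collect the text that appears, in document order, between two milestones."""
    pieces = []
    state = {"inside": False, "done": False}

    def walk(el):
        if state["done"]:
            return
        if el.tag == tag and el.get(attr) == start_id:
            state["inside"] = True      # start collecting after this milestone
        elif el.tag == tag and el.get(attr) == end_id:
            state["inside"] = False     # stop at the closing milestone
            state["done"] = True
            return
        elif state["inside"] and el.text:
            pieces.append(el.text)
        for child in el:
            walk(child)
            if state["done"]:
                return
            if state["inside"] and child.tail:
                pieces.append(child.tail)

    walk(root)
    return "".join(pieces)


doc = ET.fromstring(
    '<text><p>One <milestone id="a"/>two</p>'
    '<p>three <milestone id="b"/>four</p></text>'
)
print(text_between(doc, "a", "b"))
```

In this example the collected text spans two `p` elements, which is the situation an element-based Lucene index cannot cover directly.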

Full-text search in different open-source XML databases:

| XML database | Full-text search |
|----------------------------------|--------------------------------------------------------------------------------------------------------------------------|
| [eXist-db](https://exist-db.org) | [implementation-specific full-text search feature](https://exist-db.org/exist/apps/doc/xquery#full-text) based on Lucene |
| [BaseX](http://basex.org) | W3C... |