Benjamin Geer comments

Results: 225 comments by Benjamin Geer

Experiment with supporting eXist-db

To make the import faster, I added a config option to prevent Knora from verifying the data after it's written.

Experiment with supporting eXist-db

I did some tests with the books split into 1000-word fragments, and it doesn't make `knora-api:matchInStandoff` any faster, because if you search for a common word, the query still has...
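For illustration, splitting a text into fixed-size word fragments could look roughly like the sketch below. This is not the code used in the tests; the function name `split_into_fragments` and its whitespace-based definition of a "word" are assumptions made for the example.

```python
def split_into_fragments(text, words_per_fragment=1000):
    """Split text into fragments of at most `words_per_fragment` words.

    Hypothetical helper: words are whitespace-separated tokens, and the
    original whitespace between them is normalized to single spaces.
    """
    if words_per_fragment < 1:
        raise ValueError("words_per_fragment must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(0, len(words), words_per_fragment)
    ]


if __name__ == "__main__":
    # Small fragment size so the behavior is easy to see.
    for fragment in split_into_fragments("a b c d e", words_per_fragment=2):
        print(fragment)
```

Smaller fragments reduce the size of each indexed string, but as noted above they do not by themselves reduce the number of matches a common word produces.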

Experiment with supporting eXist-db

Looking at that now.

Experiment with supporting eXist-db

It looks like the GraphDB Lucene connector can't do this. It just indexes whatever strings you give it, but in this case we would want to index substrings at specific...

Experiment with supporting eXist-db

A drawback of XQuery is that I don't see any way to search within overlapping hierarchies. For example, given this document from one of our tests: ```xml Scorn not the...

Experiment with supporting eXist-db

I think I found an eXist-db function for this: http://exist-db.org/exist/apps/fundocs/view.html?uri=http://exist-db.org/xquery/util&location=java:org.exist.xquery.functions.util.UtilModule&details=true#get-fragment-between.4

Experiment with supporting eXist-db

That function seems to be buggy: https://github.com/eXist-db/exist/issues/2316

Experiment with supporting eXist-db

More implementations here: https://wiki.tei-c.org/index.php/Milestone-chunk.xquery

Experiment with supporting eXist-db

In any case, I would expect this to be very slow, because you can't make a Lucene index for the content between two milestones, only for the content of an...

Experiment with supporting eXist-db

Full-text search in different open-source XML databases:

| XML database | Full-text search |
|----------------------------------|--------------------------------------------------------------------------------------------------------------------------|
| [eXist-db](https://exist-db.org) | [implementation-specific full-text search feature](https://exist-db.org/exist/apps/doc/xquery#full-text) based on Lucene |
| [BaseX](http://basex.org) | W3C...