Benjamin Geer
Or I guess I could just make an index on the root XML element, which is in effect what Knora does.
Ah, wait a minute, these configuration "files" are actually XML documents. So it wouldn't be a problem to create one per project. http://exist-db.org/exist/apps/doc/indexing#idxconf
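For example, a per-project configuration document might look something like this (a minimal sketch based on the linked documentation; the collection path and the indexed qnames are assumptions, inferred from the queries below):

```xml
<!-- Hypothetical collection.xconf for one project's collection,
     e.g. stored under /db/system/config/db/books/project1/.
     The qnames are assumptions based on the markup queried below. -->
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <lucene>
            <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
            <text qname="p"/>
            <text qname="adj"/>
            <text qname="noun"/>
        </lucene>
    </index>
</collection>
```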
With Lucene indexes on `p`, `adj`, and `noun`, uploading is somewhat slower:

```
[info] Uploaded wealth-of-nations (6135800 bytes, 9534 ms)
[info] Uploaded the-count-of-monte-cristo (8071112 bytes, 13063 ms)
[info] Uploaded sherlock-holmes...
```
## Searches optimised with project-specific Lucene indexes

Search for `Paphlagonian`: 1 result in 0.05 seconds:

```xquery
for $par in collection("/db/books")//p[.//adj[ft:query(., "paphlagonian")]]
group by $doc := util:document-name(root($par))
return <result doc="{$doc}">{$par}</result>
```

Search...
Search for `full` and `Euchenor` in the same paragraph: the query never terminates, uses 100% CPU, and results in an OutOfMemoryError:

```xquery
for $par in collection("/db/books")//p[.//adj[ft:query(., "full")] and ..//noun[ft:query(., "Euchenor")]]
group...
```
eXist then eventually restarts. Aha, now I see that there is a bug in my query (`..` instead of `.`): `..//noun` searches from each paragraph's parent rather than from the paragraph itself, so it matches nouns in every sibling paragraph as well, which presumably defeats the index optimisation.
Search for `full` and `Euchenor` in the same paragraph (fixed query): 1 result in 1.6 seconds:

```xquery
for $par in collection("/db/books")//p[.//adj[ft:query(., "full")] and .//noun[ft:query(., "Euchenor")]]
group by $doc := util:document-name(root($par))...
```
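As a side note, if one also wants to see where the hits fall, eXist can materialise the Lucene match positions as `exist:match` elements via `util:expand()` (a small sketch, not part of the timings above):

```xquery
(: Sketch: return each matching paragraph with <exist:match> markers
   around the Lucene hits, by expanding the node with util:expand() :)
for $par in collection("/db/books")//p[.//adj[ft:query(., "full")] and .//noun[ft:query(., "Euchenor")]]
return util:expand($par)
```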
> So is it more efficient than Knora?

Yes, in this use case:

* Very large texts (up to 13 MB of XML per document) with very dense markup
* ...
If I split each book into small fragments (e.g. 1000 words each), I can make a structure where each `Book` resource has hundreds of links to `BookFragment` resources (via a...
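For illustration, a fragmentation step along those lines could be sketched in XQuery roughly like this (the fragment size, element names, and document path are all hypothetical; the actual splitting code isn't shown here):

```xquery
(: Hypothetical sketch: group a book's paragraphs into fixed-size
   fragments as a rough stand-in for ~1000-word chunks :)
let $pars := doc("/db/books/iliad.xml")//p
let $size := 50  (: paragraphs per fragment; chosen arbitrarily :)
for $i in 1 to xs:integer(ceiling(count($pars) div $size))
return
    <fragment n="{$i}">{ subsequence($pars, ($i - 1) * $size + 1, $size) }</fragment>
```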
Also, it takes a lot longer to import all this data into Knora than to import it into eXist. I'm importing the same books as in https://github.com/dasch-swiss/knora-api/issues/1570#issuecomment-571533227 into Knora (fragmented),...