neo4j
Expose db.index.fulltext index to Lucene API
The Lucene fulltext index capabilities are really great, but they're also limited. I'd love to get at the guts of the index, to pull out term frequency, document frequency, etc. as generated through the indexing process and accessible (in the general case) through Lucene APIs.
Extensions exposing more of the Lucene functionality through db.index.fulltext would be ideal.
I have tried to connect to the full-text indices via the Lucene 5.5 API, but search does not seem to work. If this can be done, it would be great to have some guidance on how to do it. If it's not possible to use the Lucene code directly against this index, modifying the index to make it possible would be a great help. Thanks.
Can you pin down more precisely what features you would like to see? We can't expose database internals on principle, so every feature we add will need to have an API designed for it.
Fair question.
Ideally, I'd like to be able to take my own applications written against the Lucene API and connect them directly to the Lucene index created by db.index.fulltext. This would let me do what I want, without requiring a full-blown pass-through layer.
I assume this is what you mean when you say that you "can't expose database internals on principle." If that's the case, then I'd really love to get at the details of TopScore, TopScoreDocCollector, TermFreqVector, TermPositionVector, TermVectorOffsetInfo, etc.
Piling on here - yes it would be great to at least get the array of term positions in the string for highlighting etc :)
I second the highlighting use case. The Lucene index has the metadata; it would be really useful to be able to access it with Cypher. Currently I have to get back the results and determine the match positions manually. It seems like search term highlighting wouldn't be an uncommon need for full-text search.
+1 on the highlighting use case. Has there been any progress on this feature or is there a common workaround that you could recommend?
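For anyone landing here, the "determine manually" approach mentioned above can be sketched roughly as follows. This is only a client-side illustration, not anything provided by db.index.fulltext: the function names `find_term_offsets` and `highlight` are made up for this example, and the matching is a plain case-insensitive whole-word regex. It does not reproduce what the Lucene analyzer does (stemming, stop words, custom tokenization), so the positions may differ from what the index actually matched.

```python
import re


def find_term_offsets(text, terms):
    """Return sorted (start, end, matched_text) tuples for each query term.

    Hypothetical helper: matches whole words, case-insensitively.
    It approximates, but does not replicate, Lucene's analysis chain.
    """
    hits = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0)))
    return sorted(hits)


def highlight(text, terms, pre="<b>", post="</b>"):
    """Wrap every matched term in pre/post markers, skipping overlapping hits."""
    pieces = []
    cursor = 0
    for start, end, _ in find_term_offsets(text, terms):
        if start < cursor:
            continue  # this hit overlaps one that was already wrapped
        pieces.append(text[cursor:start])
        pieces.append(pre + text[start:end] + post)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

You would run the full-text query as usual, then pass the returned property value and the query terms you searched for to `highlight` on the client side. It works for simple term queries; phrase, fuzzy, or wildcard queries would need more than this.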