strus
big token positions
2016-08-09 10:52:22; strusWebService, error: Token positions of document 693-2009 are out or range (document too big, only 76263 token positions were assigned, maximum allowed position is %65535) (master.cpp:96)
One idea is to support small, big, and very big positions in the index. Simply dropping the positions beyond the limit is not a good solution. The document is a big PDF, but splitting it creates a clustering problem and a "too small retrieval item" problem.
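To make the "small, big, very big" idea concrete, here is a minimal sketch of a tiered position encoding. It is not how strus stores positions: the tag byte, the three widths, and the names `TIERS`, `encode_positions` and `decode_positions` are all assumptions made up for illustration.

```python
import struct

# Hypothetical width tiers: (tag byte, struct format, largest value that fits).
TIERS = [
    (0, "<H", 0xFFFF),              # small: 2 bytes per position
    (1, "<I", 0xFFFFFFFF),          # big: 4 bytes per position
    (2, "<Q", 0xFFFFFFFFFFFFFFFF),  # very big: 8 bytes per position
]


def encode_positions(positions):
    """Pack positions using the narrowest tier that fits the largest one."""
    if any(p < 0 for p in positions):
        raise ValueError("positions must be non-negative")
    top = max(positions, default=0)
    for tag, fmt, limit in TIERS:
        if top <= limit:
            body = b"".join(struct.pack(fmt, p) for p in positions)
            # One leading tag byte records which width was used.
            return bytes([tag]) + body
    raise ValueError("position does not fit in 64 bits")


def decode_positions(blob):
    """Read the tag byte, then unpack fixed-width positions of that tier."""
    fmt = next(f for t, f, _ in TIERS if t == blob[0])
    width = struct.calcsize(fmt)
    return [struct.unpack_from(fmt, blob, off)[0]
            for off in range(1, len(blob), width)]
```

With this layout, documents that stay below 65536 positions keep the compact 2-byte form, and only oversized documents such as the PDF above pay for the wider encoding.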
The problem is caused by a limit in the blocks that store positions in the storage. I agree that this must be fixed.
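The maximum of 65535 in the error message is the largest value of an unsigned 16-bit integer, which suggests 2-byte position fields. The sketch below only illustrates that limit under this assumption; `MAX_POSITION`, `fits_in_block` and `pack_position` are hypothetical names, not part of the strus code.

```python
import struct

# Largest position named in the error message (2**16 - 1).
MAX_POSITION = 0xFFFF


def fits_in_block(position):
    """True if the position can be stored in an unsigned 16-bit field."""
    return 0 <= position <= MAX_POSITION


def pack_position(position):
    # Assumption: a position occupies exactly 2 bytes, little-endian.
    # struct.pack raises struct.error for values outside 0..65535.
    return struct.pack("<H", position)
```

Under that assumption, position 76263 from the log above cannot be packed at all, so any fix has to widen the field or change the block format.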