tantivy
Document index file formats - Issue/981
related to #981
This is a work in progress, opened early to get feedback on the documentation structure.
I like the Lucene way of describing file formats (it is not specific to Lucene; I have seen it elsewhere), e.g.
https://lucene.apache.org/core/3_0_3/fileformats.html#Segments%20File
label ---> <B>^N means that B is repeated N times.
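To make the repetition notation concrete, here is a minimal Rust sketch of a reader for a made-up record. The layout (Record --> Count, <Value>^Count, with Count and Value as little-endian u32) and the names read_u32_le and read_record are assumptions chosen for illustration; they are not taken from tantivy or Lucene.

```rust
// Hypothetical layout, written in the notation discussed above:
//   Record       --> Count, <Value>^Count
//   Count, Value --> u32 (little-endian)
// This is an illustration only, not an actual tantivy or Lucene format.

/// Reads one little-endian u32 at `offset`, or returns None if the buffer is too short.
fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let b = bytes.get(offset..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Reads `Count`, then the `Value` entry repeated `Count` times.
fn read_record(bytes: &[u8]) -> Option<Vec<u32>> {
    let count = read_u32_le(bytes, 0)? as usize;
    // No pre-allocation from `count`: a corrupt header must not trigger a huge allocation.
    let mut values = Vec::new();
    for i in 0..count {
        // The i-th Value starts right after the 4-byte Count field.
        values.push(read_u32_le(bytes, 4 + 4 * i)?);
    }
    Some(values)
}
```

With this layout, one line of notation maps to one loop in the reader, which is part of what makes this style of format description easy to check against code.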
Yes, I finally understand it and will use it. I really did not want to keep the current format!
Hi, thanks for your effort! Would it be possible to strictly separate the "data structure (data type)" from its "description"? The relatively recent Lucene format documentation is written this way: https://lucene.apache.org/core/8_8_1/core/org/apache/lucene/codecs/lucene84/Lucene84PostingsFormat.html#Termdictionary
FYI, I also wanted to show one bad example: the old Lucene file format documentation mixes various kinds of information together, and unfortunately it has become really difficult to understand as it has grown. https://lucene.apache.org/core/8_8_1/core/org/apache/lucene/codecs/lucene50/Lucene50TermVectorsFormat.html
I think this is a great way of describing a format: https://github.com/mocobeta/lucene-postings-format, well done @mocobeta