Explore using a trie to speed up both the json_match and json_extract_index transform functions for mutable JSON index segments
I noticed that the postingListMap was changed from a HashMap to a TreeMap in this PR: https://github.com/apache/pinot/pull/12568
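jsonMatch performs a point search on _postingListMap, whereas jsonExtractIndex performs a prefix search on it. The TreeMap speeds up the prefix search (a sorted range scan instead of a pass over every key) but slows down the point search (O(log n) comparisons instead of an expected O(1) hash lookup).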
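To make the trade-off concrete, here is a minimal sketch of the two access patterns on plain java.util maps. The class name `PostingLookupSketch`, the `int[]` posting lists, and the `path=value` key format are assumptions made for illustration; they are not Pinot's actual key encoding or types.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for the posting list map; doc ids are simplified to int[].
class PostingLookupSketch {
  // Point search (the json_match pattern): one exact key.
  // Expected O(1) on a HashMap, O(log n) on a TreeMap.
  static int[] pointSearch(Map<String, int[]> map, String key) {
    return map.get(key);
  }

  // Prefix search (the json_extract_index pattern): every key starting with the prefix.
  // On a TreeMap this is a range view; on a HashMap it would need a full scan.
  static SortedMap<String, int[]> prefixSearch(TreeMap<String, int[]> map, String prefix) {
    // Upper bound is the prefix followed by the largest char value. Keys that
    // continue with '\uffff' itself would be missed, which is fine for a sketch.
    return map.subMap(prefix, prefix + Character.MAX_VALUE);
  }
}
```

With keys such as `person.name=adam` and `person.name=bob`, `prefixSearch(map, "person.name=")` returns a view over just those two entries without touching the rest of the map.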
Ideally, we could use a trie to speed up both functions. Moreover, we would no longer have to store the literal “$index” inside the keys of postingListMap. Last but not least, we could use latch crabbing to increase concurrency: the existing read/write lock covers the whole map, which is a throughput bottleneck.
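For readers unfamiliar with latch crabbing (also called lock coupling or hand-over-hand locking), the sketch below shows one possible reading of the idea on a trie: each node carries its own latch, and a thread releases the parent's latch as soon as it holds the child's. The class `CrabbingTrieSketch`, its `count` payload, and the choice of `ReentrantReadWriteLock` are all assumptions for illustration, not an existing Pinot design.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical trie with one latch per node instead of one lock for the whole map.
class CrabbingTrieSketch {
  static final class Node {
    final ReentrantReadWriteLock latch = new ReentrantReadWriteLock();
    final Map<String, Node> kids = new HashMap<>();
    int count; // hypothetical payload: number of entries ending at this path
  }

  private final Node root = new Node();

  // Writer: hold the parent's write latch only until the child's latch is acquired.
  void add(String[] path) {
    Node node = root;
    node.latch.writeLock().lock();
    try {
      for (String segment : path) {
        Node child = node.kids.computeIfAbsent(segment, k -> new Node());
        child.latch.writeLock().lock();   // grab the child first...
        node.latch.writeLock().unlock();  // ...then let go of the parent
        node = child;
      }
      node.count++;
    } finally {
      node.latch.writeLock().unlock();    // releases whichever node is currently held
    }
  }

  // Reader: same hand-over-hand walk, but with shared read latches.
  int count(String[] path) {
    Node node = root;
    node.latch.readLock().lock();
    try {
      for (String segment : path) {
        Node child = node.kids.get(segment);
        if (child == null) {
          return 0;
        }
        child.latch.readLock().lock();
        node.latch.readLock().unlock();
        node = child;
      }
      return node.count;
    } finally {
      node.latch.readLock().unlock();
    }
  }
}
```

Because every thread acquires latches strictly top-down, the walk cannot deadlock. In this simplified version writers still take an exclusive latch at the root for a moment; a fuller design would probably descend with read latches and take a write latch only on the node it actually modifies.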
A rough idea of the data structure is below:
class TrieNode {
  boolean arrayIndex;
  String subPath;
  Map<String, TrieNode> kids;
}
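As a rough illustration of how such a trie could serve both access patterns, here is a small sketch. It assumes the JSON path has already been split into segments and that each node keeps a list of doc ids; the class `JsonPathTrieSketch`, the `docIds` field, and the method names are hypothetical, and the `arrayIndex` and `subPath` fields are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical trie keyed by path segment; not thread-safe.
class JsonPathTrieSketch {
  static final class Node {
    final Map<String, Node> kids = new HashMap<>();
    final List<Integer> docIds = new ArrayList<>(); // stand-in for a posting list
  }

  private final Node root = new Node();

  // Insert a doc id under the given path, creating nodes as needed.
  void add(String[] path, int docId) {
    Node node = root;
    for (String segment : path) {
      node = node.kids.computeIfAbsent(segment, k -> new Node());
    }
    node.docIds.add(docId);
  }

  // Point search: one hash lookup per path segment.
  List<Integer> get(String[] path) {
    Node node = find(path);
    return node == null ? Collections.<Integer>emptyList() : node.docIds;
  }

  // Prefix search: walk to the prefix node, then collect its whole subtree.
  List<Integer> getByPrefix(String[] prefix) {
    List<Integer> out = new ArrayList<>();
    Node node = find(prefix);
    if (node != null) {
      collect(node, out);
    }
    return out;
  }

  private Node find(String[] path) {
    Node node = root;
    for (String segment : path) {
      node = node.kids.get(segment);
      if (node == null) {
        return null;
      }
    }
    return node;
  }

  private void collect(Node node, List<Integer> out) {
    out.addAll(node.docIds);
    for (Node kid : node.kids.values()) {
      collect(kid, out);
    }
  }
}
```

Both lookups cost one hash probe per path segment to reach the target node, so the point search no longer depends on the total number of keys, and the prefix search only visits the matching subtree.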