timecraft
Log indexing
Some notes I gathered relevant to this feature while working on #232
- The only relevant fields to index are `(*Record).Time`, `(*Record).FunctionID` and `(*Record).FunctionCall`. I don't see any merit in indexing `FunctionCall`: will someone query logs by the value of a syscall argument?
- The actual log data in `(*Record).FunctionCall` is not structured and uses a custom encoding that differs between features. Indexing should be context aware (features should be responsible for indexing their own data).
- `(*Record).Offset` is not stored on the segment. It is set dynamically while reading. This limits how much you can skip when querying batches. If you create an inverted index that finds batches with relevant logs, you will potentially be forced to read the full batch and filter the relevant logs in memory.
- `Record` is coupled with syscall.
Potentially, as the current API stands, indexing timestamps (`(*Record).Time`) would make sense and would allow commands to accept `--start-ts` and `--end-ts`. When reading logs we can then skip batches that have no records in the requested time range.
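A rough sketch of the batch-skipping idea, to make it concrete. Everything here is hypothetical and not part of the current timecraft API: `BatchTimeRange`, `SegmentOffset`, and `SelectBatches` are names I made up, and I am assuming we could record the first and last record timestamps of each batch when it is written.

```go
package main

import (
	"fmt"
	"time"
)

// BatchTimeRange is a hypothetical index entry (not an existing timecraft
// type): the smallest and largest record timestamps in one batch, plus an
// assumed position of that batch within the segment.
type BatchTimeRange struct {
	SegmentOffset int64
	First, Last   time.Time
}

// Overlaps reports whether the batch may contain records whose time falls
// in the inclusive range [start, end].
func (b BatchTimeRange) Overlaps(start, end time.Time) bool {
	return !b.Last.Before(start) && !b.First.After(end)
}

// SelectBatches returns the index entries whose time range intersects
// [start, end]. Batches that are not returned can be skipped entirely.
func SelectBatches(index []BatchTimeRange, start, end time.Time) []BatchTimeRange {
	var selected []BatchTimeRange
	for _, b := range index {
		if b.Overlaps(start, end) {
			selected = append(selected, b)
		}
	}
	return selected
}

func main() {
	t0 := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	index := []BatchTimeRange{
		{SegmentOffset: 0, First: t0, Last: t0.Add(10 * time.Second)},
		{SegmentOffset: 4096, First: t0.Add(10 * time.Second), Last: t0.Add(20 * time.Second)},
		{SegmentOffset: 8192, First: t0.Add(20 * time.Second), Last: t0.Add(30 * time.Second)},
	}

	// Equivalent of a query with a start of t0+12s and an end of t0+18s.
	start, end := t0.Add(12*time.Second), t0.Add(18*time.Second)
	for _, b := range SelectBatches(index, start, end) {
		fmt.Println("read batch at offset", b.SegmentOffset)
	}
}
```

Only the middle batch is selected here. The index works at batch granularity, so records inside a selected batch would still need to be filtered by `Time` in memory.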
Note: these notes may be incorrect; they come from my limited time hacking on something different.