Ujval Misra
Garbage-collect (GC) data log buckets after archiving them in durable and durable relaxed modes so that we don't use 2x the disk space.
* Implement count-min-sketch (supposedly more space-efficient than count-sketch)
* Time-adaptive sketch: decay accuracy & storage for older values (using inflation)
* Integrate with atomic multilog for offline/online approximate queries. -...
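
For readers unfamiliar with the data structure named in the first bullet, here is a minimal count-min sketch, offered only as an illustration. The class name `count_min_sketch`, its methods, and the hash-mixing scheme are assumptions made for this example and are not taken from the project's code; a production version would likely use pairwise-independent hash functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Illustrative count-min sketch. All names here are hypothetical and are
// not the project's actual implementation.
class count_min_sketch {
 public:
  // depth = number of hash rows, width = counters per row (both > 0).
  count_min_sketch(std::size_t depth, std::size_t width)
      : depth_(depth), width_(width), counters_(depth * width, 0) {}

  // Record `count` occurrences of `key` by bumping one counter per row.
  void update(const std::string& key, std::uint64_t count = 1) {
    for (std::size_t row = 0; row < depth_; ++row)
      counters_[row * width_ + bucket(key, row)] += count;
  }

  // Estimate is the minimum over rows. Collisions can only inflate a
  // counter, so the estimate never falls below the true frequency.
  std::uint64_t estimate(const std::string& key) const {
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (std::size_t row = 0; row < depth_; ++row)
      best = std::min(best, counters_[row * width_ + bucket(key, row)]);
    return best;
  }

 private:
  // Derive a per-row bucket by mixing the row index into std::hash's
  // output (an assumption for brevity, not a pairwise-independent family).
  std::size_t bucket(const std::string& key, std::size_t row) const {
    std::uint64_t h = std::hash<std::string>{}(key);
    h ^= (static_cast<std::uint64_t>(row) + 1) * 0x9e3779b97f4a7c15ULL;
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    return static_cast<std::size_t>(h % width_);
  }

  std::size_t depth_;
  std::size_t width_;
  std::vector<std::uint64_t> counters_;  // row-major depth_ x width_ table
};
```

Memory use is fixed at `depth * width` counters regardless of how many distinct keys are seen; the trade-off is that estimates may overcount when keys collide in every row.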
Initial sketch-related API additions to `atomic_multilog`:
```C++
void add_sketch(sketch_name : str, field_name : str, filter_name : Optional[str]);
void remove_sketch(sketch_name : str);
size_t estimate_frequency(sketch_name : str, value : str);
map get_heavy_hitters(sketch_name...
```
Support nullable types for records that may have missing values.
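
One possible way to model a nullable field, assuming C++17 is available, is `std::optional`. The `record` struct, its field names, and `mean_cpu_util` below are hypothetical and only illustrate how a missing value can be represented and skipped during aggregation; the project's actual record layout may call for a different encoding (for example, a null bitmap).

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record with one nullable column.
struct record {
  std::string host;
  std::optional<double> cpu_util;  // empty when the value is missing
};

// Mean over the values that are present; std::nullopt if all are missing.
std::optional<double> mean_cpu_util(const std::vector<record>& records) {
  double sum = 0.0;
  std::size_t present = 0;
  for (const auto& r : records) {
    if (r.cpu_util) {  // skip nulls instead of treating them as zero
      sum += *r.cpu_util;
      ++present;
    }
  }
  if (present == 0) return std::nullopt;
  return sum / static_cast<double>(present);
}
```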
Support projections for cases where not all columns are required in the output of a query.
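
As a rough sketch of what a projection does, the helper below keeps only the requested columns of each row. It assumes rows are plain vectors of strings with a separate schema of column names; the `row` alias and the `project` function are hypothetical and do not reflect the project's record or query types.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using row = std::vector<std::string>;  // hypothetical row representation

// Return, for every input row, only the values of `columns`, in the order
// requested. Throws std::invalid_argument for a column not in `schema`.
std::vector<row> project(const std::vector<std::string>& schema,
                         const std::vector<row>& rows,
                         const std::vector<std::string>& columns) {
  // Resolve column names to positions once, before touching any row.
  std::vector<std::size_t> idx;
  for (const auto& c : columns) {
    auto it = std::find(schema.begin(), schema.end(), c);
    if (it == schema.end()) throw std::invalid_argument("unknown column: " + c);
    idx.push_back(static_cast<std::size_t>(it - schema.begin()));
  }
  std::vector<row> out;
  out.reserve(rows.size());
  for (const auto& r : rows) {
    row projected;
    projected.reserve(idx.size());
    for (std::size_t i : idx) projected.push_back(r.at(i));
    out.push_back(std::move(projected));
  }
  return out;
}
```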
Expose Sketch API to be able to:
* Add/remove a sketch on a field
* Execute ad-hoc queries on frequencies
* Evaluate approximate triggers in real-time

Expose corresponding remote API.
Currently the client must read from and write to a table using a contiguous string buffer. Add an interface to the C++ and Python RPC clients to read and write...
If a filter or index was invalidated, it should not become valid again on recovery/restart.
On Linux machines, the `vm.max_map_count` limit is hit very quickly (default on an EC2 `c4.8xlarge` instance is 65536). This needs to be manually increased for archival to be successful (since...
Improve allocation using pools and file consolidation.