Vivek Iyer Vaidyanathan

Results 9 issues of Vivek Iyer Vaidyanathan

To make our query processing pipeline resilient to failures (such as server slowness, crashes, etc.), we can make some improvements in the broker and server components. I am working on a...

feature
In Progress

In PR https://github.com/apache/pinot/pull/8907, `RequesterIdentity` is assumed to be of type `HttpRequesterIdentity`. Using `GrpcRequesterIdentity` results in an error such as "cannot be cast to class `org.apache.pinot.broker.api.HttpRequesterIdentity`". Potential fix is to have a...
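The failure mode described above can be illustrated with a minimal sketch. All types and method names here are hypothetical stand-ins, not the real Pinot classes, and the guarded variant is only one generic way to avoid the exception; it is not necessarily the fix the issue proposes.

```java
// Hypothetical stand-ins for the identity types named in the issue; not the real Pinot classes.
interface RequesterIdentitySketch {}

final class HttpIdentitySketch implements RequesterIdentitySketch {
  final String endpointUrl;

  HttpIdentitySketch(String endpointUrl) {
    this.endpointUrl = endpointUrl;
  }
}

final class GrpcIdentitySketch implements RequesterIdentitySketch {}

final class IdentityCastSketch {
  // Unchecked downcast: throws ClassCastException for any non-HTTP identity.
  static String unsafeEndpoint(RequesterIdentitySketch identity) {
    return ((HttpIdentitySketch) identity).endpointUrl;
  }

  // Guarded variant: checks the runtime type first and falls back to a default value.
  static String safeEndpoint(RequesterIdentitySketch identity) {
    if (identity instanceof HttpIdentitySketch) {
      return ((HttpIdentitySketch) identity).endpointUrl;
    }
    return "unknown";
  }
}
```

Calling `unsafeEndpoint` with a `GrpcIdentitySketch` reproduces the class of error quoted in the issue, while `safeEndpoint` returns the fallback value instead.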

Currently, in the segment reload path, we support the following operations: 1. Adding a new column, removing/updating an autogenerated column 2. Adding/removing various indexes like inverted index, json...

feature
In Progress

The `INTERVAL` datatype will allow users to manipulate a period of time in years, months, days, hours, minutes, seconds, etc. Example of use: ``` SELECT now(), now() - INTERVAL '1...

feature

Currently, Pinot supports only the `COUNT` aggregation function with `DISTINCT`. This is supported in two ways: 1. `DISTINCTCOUNT` 2. `COUNT(DISTINCT colA)` https://github.com/apache/pinot/blob/e813867985746e916c8e898a530002551b661496/pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java#L788-L789 The ask in this issue is to make Pinot...

Currently, we support a number of preprocessing operations for a segment in response to schema/tableConfig changes. Some of them are: 1. Add a new column. Remove/Modify an autogenerated column. 2....

* Applies interning for `OnHeapByteDictionary`. Please refer to https://github.com/apache/pinot/pull/12223 for more details.
* Addresses some pending review comments from https://github.com/apache/pinot/pull/12223

documentation
Configuration
performance

Currently, when new columns are added or indexes are added/removed, the segment reloads happen on the server. There are a number of issues with this approach: 1. Increased startup times...

feature

Our `OnHeapStringDictionary` implementation can waste a lot of heap if a column contains enough duplicate values. Below is JXray analysis of the heap dump for one use case...

enhancement
performance
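
The duplicate-string heap waste described for `OnHeapStringDictionary` is the kind of problem interning addresses. Below is a minimal sketch of the general technique, assuming a simple map-backed interner; the class and method names are hypothetical, and this is not the Pinot implementation, which may use a different interner and different sizing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical map-backed interner; illustrative only, not Pinot code.
final class StringInternerSketch {
  // Maps each distinct value to the single canonical instance retained for it.
  private final Map<String, String> canonical = new HashMap<>();

  // Returns the canonical instance equal to the given value, registering it on first sight.
  String intern(String value) {
    String existing = canonical.putIfAbsent(value, value);
    return existing != null ? existing : value;
  }

  // Number of distinct values retained so far.
  int uniqueCount() {
    return canonical.size();
  }
}
```

With such an interner, equal values read from a column resolve to one shared `String` object, so duplicate copies become eligible for garbage collection. The trade-off is the memory and lookup cost of the map itself, which only pays off when duplicates are common.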