Gian Merlino

Results: 33 issues by Gian Merlino

There's no need to create it upfront, because it will be created on demand if we actually need it. And, if we don't end up needing it (because the sort...

Area - Batch Ingestion
Area - MSQ

Motivation: Druid 24 included a task-based multi-stage query engine proposed in #12262. This has proved useful for [DML (REPLACE, INSERT)](https://druid.apache.org/docs/latest/multi-stage-query/) and [querying directly from deep storage](https://druid.apache.org/docs/latest/querying/query-deep-storage). This proposal is...

Design Review
Proposal
Area - MSQ

In BroadcastJoinSegmentMapFnProcessor, use FrameBasedInlineDataSource and FrameBasedIndexedTable to back broadcast joinables, rather than a regular InlineDataSource (which would use Java object arrays). Reduces memory usage and eliminates a copy while building...

Area - Batch Ingestion
Area - Segment Format and Ser/De
Area - MSQ

Treats SQL NULL types as strings at the native layer. This is consistent with how we treat unknown-type nulls in other contexts.

Area - Querying

It's useful for the fault message to have some description of the underlying error: why did the worker fail, or why did the RPC call fail? This patch updates the...

Area - Batch Ingestion
Area - MSQ

There are a couple of CVEs out for 3.9.0.

Area - Dependencies

We've been seeing a high degree of segfaults recently in the `processing` unit tests under JDK 21. Here's an example: https://github.com/apache/druid/actions/runs/11528594002/job/32166241444?pr=17414. The common thread is an error like this: ```...

Bug

The SQL planner now avoids using extractionFns unless "sqlUseExtractionFns" is set to "true". This promotes more usage of query vectorization. This affects the SQL functions LOOKUP, REGEXP_EXTRACT, and...

Area - Documentation
Area - Querying

For understanding and optimizing data footprint, it's valuable to know the footprint (in bytes) and capabilities (dictionary-encoded, has index, etc.) of each column on a per-segment basis and also...

Feature/Change Description

This patch updates things so the `query/bytes` metric, and the field in logged requests, are both included for failed requests. This makes it possible to see how many bytes were written before a...

Area - Querying