
Invalid position 2 in block with 2 positions with multiple filters on array(varchar)

Open shk3 opened this issue 9 months ago • 4 comments

Hi folks,

We are seeing java.lang.IllegalArgumentException: Invalid position 2 in block with 2 positions for queries that have multiple filters on an array(varchar) column.

To reproduce, create the table as follows:

create table tab26483 (
    id bigint,
    varchar_array array(varchar)
)
with (format = 'parquet');

insert into tab26483 values (1, array['1']), (2, array['1','2']), (3, array['1','a']), (4, array['2']), (5, array['1','b']);

Each of the following queries then fails with the error:

select * from tab26483
where varchar_array = array['1'] or varchar_array = array['1','b'];

select * from tab26483
where varchar_array = array['1','a'] or varchar_array = array['1','b'];

Stacktrace:

java.lang.IllegalArgumentException: Invalid position 2 in block with 2 positions
	at io.trino.spi.block.BlockUtil.checkValidPosition(BlockUtil.java:72)
	at io.trino.spi.block.DictionaryBlock.getId(DictionaryBlock.java:573)
	at io.trino.spi.type.ArrayType.indeterminateOperator(ArrayType.java:674)
	at io.trino.$gen.PageFilter_20240510_080012_322.filter(Unknown Source)
	at io.trino.$gen.PageFilter_20240510_080012_322.filter(Unknown Source)
	at io.trino.operator.project.DictionaryAwarePageFilter.filter(DictionaryAwarePageFilter.java:82)
	at io.trino.operator.project.PageProcessor.createWorkProcessor(PageProcessor.java:119)
	at io.trino.operator.ScanFilterAndProjectOperator$SplitToPages.lambda$processPageSource$1(ScanFilterAndProjectOperator.java:284)
	at io.trino.operator.WorkProcessorUtils.lambda$flatMap$4(WorkProcessorUtils.java:285)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:359)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils$BlockingProcess.process(WorkProcessorUtils.java:207)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.lambda$flatten$6(WorkProcessorUtils.java:317)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:359)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:240)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:255)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:133)
	at io.trino.operator.Driver.processInternal(Driver.java:403)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:306)
	at io.trino.operator.Driver.tryWithLock(Driver.java:709)
	at io.trino.operator.Driver.process(Driver.java:298)
	at io.trino.operator.Driver.processForDuration(Driver.java:269)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:890)
	at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:76)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:191)
	at io.trino.$gen.Trino_445____20240507_223138_2.run(Unknown Source)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:192)
	at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:174)
	at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:161)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)

We see it on both Trino 445 and 447.

Does anyone know how to debug or fix it? Thanks!

shk3, May 10 '24 08:05

I can reproduce this on current master with the Iceberg connector. The actual cause relates to IN: the OR gets rewritten as IN. The query select * from db.tab26483 where varchar_array IN (array['1'], array['1','b']); reproduces the error as well.

chenjian2664, May 10 '24 09:05
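
For anyone narrowing this down, the statements below sketch the rewrite described above against the tab26483 table from the original report. The IN form of the second query and the single-predicate control are assumptions added for comparison; neither was run in this thread, so no outcome is claimed for them.

```sql
-- OR form from the original report (reported to fail).
select * from tab26483
where varchar_array = array['1','a'] or varchar_array = array['1','b'];

-- Assumed equivalent IN form of the same query; if the OR is rewritten
-- as IN, this would be expected to follow the same code path.
select * from tab26483
where varchar_array in (array['1','a'], array['1','b']);

-- Control with a single equality predicate, to check whether the failure
-- needs more than one array value in the filter.
select * from tab26483
where varchar_array = array['1','b'];
```

Comparing the control against the two multi-value forms on both an affected and an unaffected version should show whether the problem is tied to the multi-value filter.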

cc @dain

findepi, May 10 '24 20:05

A small update from our side: we found that version 429 does not have this bug, while 436 does. We haven't tested other versions, but hopefully this helps narrow it down.

shk3, May 13 '24 16:05

Hi @findepi @dain, are there any updates on this bug? We hope it can be addressed before the next Trino release.

rohanag12, May 21 '24 00:05