
Support for S3A Connector

Open chrajeshbabu opened this issue 4 months ago • 0 comments

Currently the controller and servers are able to start with an s3a path, but segment creation during ingestion fails with the following error. The reason is that, while preparing file names, we prefix the s3 scheme instead of s3a.

Supporting s3a will make it possible to use S3-compatible storage systems as a deep store.

Working on it.

Caused by: java.lang.IllegalStateException: Unable to extract out the relative path for input file 's3://testhadoop/pinot/batch/airlineStats/rawdata/2014/01/28/airlineStats_data_2014-01-28.avro', based on base input path: s3a://testhadoop/pinot/batch/airlineStats/rawdata/
    at org.apache.pinot.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:515) ~[pinot-all-1.2.0-jar-with-dependencies.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
    at org.apache.pinot.common.segment.generation.SegmentGenerationUtils.getRelativeOutputPath(SegmentGenerationUtils.java:162) ~[pinot-all-1.2.0-jar-with-dependencies.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$submitSegmentGenTask$1(SegmentGenerationJobRunner.java:278) ~[pinot-batch-ingestion-standalone-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.base/java.lang.Thread.run(Thread.java:840) ~[?:?]
java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:152)
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:125)
    at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:132)
    at org.apache.pinot.tools.Command.call(Command.java:33)
    at org.apache.pinot.tools.Command.call(Command.java:29)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
    at picocli.CommandLine.access$1500(CommandLine.java:148)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
    at picocli.CommandLine.execute(CommandLine.java:2174)
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:173)
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:204)
Caused by: java.lang.RuntimeException: Failed to generate Pinot segment for file - s3://testhadoop/pinot/batch/airlineStats/rawdata/2014/01/28/airlineStats_data_2014-01-28.avro
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$submitSegmentGenTask$1(SegmentGenerationJobRunner.java:287)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.IllegalStateException: Unable to extract out the relative path for input file 's3://testhadoop/pinot/batch/airlineStats/rawdata/2014/01/28/airlineStats_data_2014-01-28.avro', based on base input path: s3a://testhadoop/pinot/batch/airlineStats/rawdata/
    at org.apache.pinot.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:515)
    at org.apache.pinot.common.segment.generation.SegmentGenerationUtils.getRelativeOutputPath(SegmentGenerationUtils.java:162)
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$submitSegmentGenTask$1(SegmentGenerationJobRunner.java:278)
    ... 5 more
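
The scheme mismatch in the trace can be reproduced outside Pinot with plain `java.net.URI`. The sketch below is not the Pinot code: it assumes the relative-path check is built on `URI.relativize`, and the class `SchemeMismatchSketch` and the helpers `isRelativizable` and `withSchemeOf` are hypothetical names used only for illustration. It shows why an `s3://` file cannot be relativized against an `s3a://` base, and one possible way to carry the base scheme over to the listed file.

```java
import java.net.URI;

public class SchemeMismatchSketch {

    // Hypothetical check, assumed to resemble the failing precondition:
    // relativize() returns the input unchanged when scheme or authority differ.
    static boolean isRelativizable(URI base, URI file) {
        URI relative = base.relativize(file);
        return !relative.equals(file) && !relative.getPath().isEmpty();
    }

    // Hypothetical fix: rebuild the listed file URI with the scheme of the base URI.
    static URI withSchemeOf(URI base, URI file) {
        return URI.create(base.getScheme() + "://" + file.getRawAuthority() + file.getRawPath());
    }

    public static void main(String[] args) {
        URI base = URI.create("s3a://testhadoop/pinot/batch/airlineStats/rawdata/");
        URI listed = URI.create(
            "s3://testhadoop/pinot/batch/airlineStats/rawdata/2014/01/28/airlineStats_data_2014-01-28.avro");

        // Schemes differ (s3a vs s3), so no relative path can be extracted.
        System.out.println(isRelativizable(base, listed)); // prints "false"

        // With the scheme aligned, the relative path comes out as expected.
        URI fixed = withSchemeOf(base, listed);
        System.out.println(base.relativize(fixed)); // prints "2014/01/28/airlineStats_data_2014-01-28.avro"
    }
}
```

Where the scheme should actually be preserved (in the file listing or in the path comparison) is a design choice for the real fix; the sketch only demonstrates the mismatch.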

chrajeshbabu, Oct 26 '24 03:10