[SUPPORT] Hudi 0.13.1 on EMR, MOR table writer hangs intermittently with S3 read timeout error for column stats index
Describe the problem you faced
Stack: Hudi 0.13.1, EMR 6.13.0, Spark 3.4.1
We are writing to an MOR table on S3 using a Spark Structured Streaming job on EMR. Once the job has run for a while (12+ hours), we notice that at random times there are long periods with no operations/commits on the timeline. Usually the offending executor is removed and the task retried, but eventually the job stalls and the application exits with fatal errors on the driver.
We are using deployment model B: single writer with async table services, plus OCC for the metadata table. We also tried the DynamoDB-based lock provider (in case OCC was the culprit), and the same thing happened.
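For reference, the lock-related options for the DynamoDB test looked roughly like this (a sketch; the table, key, and region values are placeholders, and the in-process-lock config we normally run with is in the full config dump further down):

```scala
// Sketch of the lock options used for the DynamoDB test (values are placeholders).
// Our regular runs use the in-process provider instead, i.e.
// hoodie.write.lock.provider=org.apache.hudi.client.transaction.lock.InProcessLockProvider.
val lockOpts = Map(
  "hoodie.write.concurrency.mode"            -> "OPTIMISTIC_CONCURRENCY_CONTROL",
  "hoodie.cleaner.policy.failed.writes"      -> "LAZY",
  "hoodie.write.lock.provider"               -> "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider",
  "hoodie.write.lock.dynamodb.table"         -> "hudi-locks", // placeholder
  "hoodie.write.lock.dynamodb.partition_key" -> "table_0",    // placeholder
  "hoodie.write.lock.dynamodb.region"        -> "us-east-1"   // placeholder
)
```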
Sometimes a deltacommit, clean, or compaction goes inflight but never completes, and the Spark driver stderr reports the same error over and over (see the stacktrace below). The timeout always occurs while trying to read the column stats index.
Meanwhile, the timeline shows no progress: the deltacommit that went inflight at 10:20:01 AM is stuck, even though all prior deltacommits finished within 45 seconds. The micro-batch interval is 5 minutes.
We can see no active tasks on the Spark Web UI during these periods. In some instances the tasks eventually succeed and the job moves ahead, until it gets stuck again with the same errors.
Here are all our Hudi configs:
hoodie.table.version -> 5
hoodie.datasource.write.hive_style_partitioning -> True
hoodie.datasource.hive_sync.enable -> True
hoodie.datasource.hive_sync.auto_create_database -> True
hoodie.datasource.hive_sync.skip_ro_suffix -> True
hoodie.parquet.small.file.limit -> 104857600
hoodie.parquet.max.file.size -> 125829120
hoodie.compact.inline.trigger.strategy -> NUM_OR_TIME
hoodie.compact.inline.max.delta.commits -> 3
hoodie.compact.inline.max.delta.seconds -> 600
hoodie.parquet.compression.codec -> snappy
hoodie.clean.automatic -> True
hoodie.index.type -> BLOOM
hoodie.bloom.index.use.metadata -> True
hoodie.metadata.enable -> True
hoodie.metadata.index.bloom.filter.enable -> True
hoodie.metadata.index.column.stats.enable -> True
hoodie.keep.max.commits -> 50
hoodie.archive.automatic -> True
hoodie.archive.beyond.savepoint -> True
hoodie.metrics.on -> True
hoodie.metadata.metrics.enable -> True
hoodie.metrics.executor.enable -> True
hoodie.metrics.reporter.type -> GRAPHITE
hoodie.metrics.graphite.host -> <Graphite-IP>
hoodie.metrics.graphite.port -> 2003
hoodie.metrics.graphite.report.period.seconds -> 30
hoodie.metrics.graphite.metric.prefix -> test_prefix_demo_mor
hoodie.cleaner.policy.failed.writes -> LAZY
hoodie.write.concurrency.mode -> OPTIMISTIC_CONCURRENCY_CONTROL
hoodie.write.lock.provider -> org.apache.hudi.client.transaction.lock.InProcessLockProvider
hoodie.metrics.lock.enable -> True
hoodie.clean.async -> True
hoodie.archive.async -> True
hoodie.metadata.index.async -> False
hoodie.metadata.clean.async -> False
hoodie.cleaner.policy -> KEEP_LATEST_BY_HOURS
hoodie.cleaner.hours.retained -> 1
hoodie.datasource.write.table.name -> table_0
hoodie.table.name -> table_0
hoodie.datasource.write.table.type -> MERGE_ON_READ
hoodie.datasource.write.recordkey.field -> id
hoodie.datasource.write.partitionpath.field -> _year
hoodie.datasource.write.precombine.field -> _kafka_offset
hoodie.datasource.hive_sync.table -> table_0
hoodie.datasource.hive_sync.database -> db-name
hoodie.base.path -> s3://bucket-name/db-name/table_0
hoodie.copyonwrite.record.size.estimate -> 218
hoodie.parquet.block.size -> 125829120
hoodie.parquet.page.size -> 104876
hoodie.datasource.write.keygenerator.class -> org.apache.hudi.keygen.SimpleKeyGenerator
hoodie.datasource.hive_sync.partition_extractor_class -> org.apache.hudi.hive.MultiPartKeysValueExtractor
hoodie.datasource.write.streaming.checkpoint.identifier -> table_0_0
path -> s3://bucket-name/db-name/table_0
checkpointLocation -> s3://bucket-name/db-name/table_0/checkpoints/
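For context, a simplified sketch of how these options are wired into the Structured Streaming sink (illustrative, not our exact job):

```scala
// Simplified sketch of the streaming writer (illustrative, not the exact production job).
// `hudiOptions` is the key -> value map listed above (it already carries `path` and
// `checkpointLocation`); `df` is the Kafka-sourced DataFrame being upserted.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.{StreamingQuery, Trigger}

def startWriter(df: DataFrame, hudiOptions: Map[String, String]): StreamingQuery =
  df.writeStream
    .format("hudi")
    .options(hudiOptions)
    .outputMode("append")
    .trigger(Trigger.ProcessingTime("5 minutes")) // matches the 5-minute micro-batch interval
    .start()
```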
To Reproduce
Steps to reproduce the behavior:
- Start incremental upserts to an MOR table on S3, using the above configs, on the EMR + Hudi + Spark versions listed below
Expected behavior
The writer continues to run as it usually does, without getting stuck.
Environment Description
- Hudi version : 0.13.1
- Spark version : 3.4.1
- Storage (HDFS/S3/GCS..) : S3
- Running on Docker? (yes/no) : no
Stacktrace
23/12/27 10:55:00 INFO NetworkClient: [AdminClient clientId=adminclient-1] Node 6 disconnected.
23/12/27 10:55:00 INFO MultipartUploadOutputStream: close closed:false s3://ecom-data-lake-raw/test_run_demo_7/pickup_manifests/.hoodie/.heartbeat/20231227102000654
23/12/27 10:55:06 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.51_1-6493-25610; Range: null
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:16 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.51_1-6493-25610; Range: [11825, 11836]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:16 INFO S3NativeFileSystem: Opening 's3://ecom-data-lake-raw/test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.1_1-4901-17248' for reading
23/12/27 10:55:26 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.1_1-4901-17248; Range: [7739, 11846]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:36 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.1_1-4901-17248; Range: [6233, 11846]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:46 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.1_1-4901-17248; Range: [5292, 5324]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:56 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.1_1-4901-17248; Range: [5325, 6265]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:55:56 INFO S3NativeFileSystem: Opening 's3://ecom-data-lake-raw/test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.2_1-4947-17505' for reading
23/12/27 10:56:00 INFO MultipartUploadOutputStream: close closed:false s3://ecom-data-lake-raw/test_run_demo_7/pickup_manifests/.hoodie/.heartbeat/20231227102000654
23/12/27 10:56:06 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.2_1-4947-17505; Range: [7738, 11845]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:56:16 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.2_1-4947-17505; Range: [6232, 11845]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:56:26 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.2_1-4947-17505; Range: [5292, 5324]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/12/27 10:56:36 INFO GlobalS3Executor: ReadTimeout File: test_run_demo_7/pickup_manifests/.hoodie/metadata/column_stats/.col-stats-0001_20231227075500652001.log.2_1-4947-17505; Range: [5325, 6264]
Use default timeout configuration to retry for read timeout com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
@ergophobiac Are you setting fs.s3a.connection.maximum to a higher value? Can you try increasing it and see if that helps?
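For example, something along these lines at session build time (illustrative; the exact value depends on your workload):

```scala
// Illustrative only: raising the S3 connection pool size when building the session.
// fs.s3a.connection.maximum applies to the s3a:// connector; since the logs above show
// EMRFS (s3:// paths), the analogous emrfs-site setting is fs.s3.maxConnections
// (assumption: EMRFS picks up the spark.hadoop.* override; otherwise set it via the
// emrfs-site classification).
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hudi-streaming-writer")
  .config("spark.hadoop.fs.s3a.connection.maximum", "1000")
  .config("spark.hadoop.fs.s3.maxConnections", "1000") // EMRFS equivalent
  .getOrCreate()
```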
@ergophobiac Did you get a chance to try this out?
Hey @ad1happy2go, we have a test run going; we'll observe it until we're sure it's stable and let you know how it turns out.
Hello @ad1happy2go, we ran a test with the same configurations plus one addition: spark.hadoop.fs.s3a.connection.maximum=2000 (we found a resource saying the default on EMR is 50).
We ran into the same error, and the application failed 3 days after starting. We have disabled multi-modal indexing for now.
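For anyone following along, the flags we flipped relative to the config dump above are roughly these (a sketch; we may tune them further):

```scala
// Sketch of the metadata-index flags turned off relative to the config dump above.
val indexOverrides = Map(
  "hoodie.metadata.enable"                    -> "true",  // keeping the base files index for now
  "hoodie.metadata.index.column.stats.enable" -> "false",
  "hoodie.metadata.index.bloom.filter.enable" -> "false",
  "hoodie.bloom.index.use.metadata"           -> "false"
)
```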
Thanks for trying, @ergophobiac. @CTTY, any insights here?
This looks similar to this issue: https://github.com/apache/hudi/issues/7487, where a user ran into S3 throttling due to too many S3 calls.
I was wondering if you could check whether your S3 bucket is returning a large number of 503 errors?
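For example, with S3 request metrics enabled on the bucket, something along these lines can pull the 5xx counts from CloudWatch (a sketch; the bucket name, filter id, and time window are placeholders):

```scala
// Sketch: query S3 request metrics (must be enabled on the bucket) for 5xx error counts.
// Bucket name, filter id, and time window below are placeholders.
import java.time.{Duration, Instant}
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient
import software.amazon.awssdk.services.cloudwatch.model.{Dimension, GetMetricStatisticsRequest, Statistic}

val cw = CloudWatchClient.create()
val resp = cw.getMetricStatistics(
  GetMetricStatisticsRequest.builder()
    .namespace("AWS/S3")
    .metricName("5xxErrors")
    .dimensions(
      Dimension.builder().name("BucketName").value("ecom-data-lake-raw").build(),
      Dimension.builder().name("FilterId").value("EntireBucket").build() // placeholder filter id
    )
    .startTime(Instant.now().minus(Duration.ofHours(6)))
    .endTime(Instant.now())
    .period(300)
    .statistics(Statistic.SUM)
    .build()
)
resp.datapoints().forEach(dp => println(s"${dp.timestamp()} -> ${dp.sum()}"))
```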