presto
presto copied to clipboard
fix catalog config resource lost on hudi mor read
PR https://github.com/prestodb/presto/pull/18115 involed hudi connector break, when setting 'hive.copy-on-first-write-configuration-enabled'=false will allow us to read hudi cow table, but still fail in mor read.
In my case the viewfs mount settings(config throught hive.config.resources
) were lost.
com.facebook.presto.spi.PrestoException: Error opening Hive split HudiSplit{baseFile=Optional[viewfs://ss-hadoop/project/ads/warehouse/test2.db/sh_hudi_11_mor_test1/dt=09-09/5fd6c84b-5192-4513-9d45-adeade831bb1-0_1-1-2_20220909175427537.parquet:0+435024], logFiles=[]} using org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat: ViewFs: Cannot initialize: Empty Mount table in config for viewfs://ss-hadoop/
at com.facebook.presto.hudi.HudiRecordCursors.createRecordReader(HudiRecordCursors.java:117)
at com.facebook.presto.hudi.HudiRecordCursors.lambda$createRealtimeRecordCursor$0(HudiRecordCursors.java:77)
at com.facebook.presto.hive.authentication.UserGroupInformationUtils.lambda$executeActionInDoAs$0(UserGroupInformationUtils.java:29)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
at com.facebook.presto.hive.authentication.UserGroupInformationUtils.executeActionInDoAs(UserGroupInformationUtils.java:27)
at com.facebook.presto.hive.authentication.ImpersonatingHdfsAuthentication.doAs(ImpersonatingHdfsAuthentication.java:39)
at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:81)
at com.facebook.presto.hudi.HudiRecordCursors.createRealtimeRecordCursor(HudiRecordCursors.java:76)
at com.facebook.presto.hudi.HudiPageSourceProvider.createPageSource(HudiPageSourceProvider.java:127)
On presto worker, PrestoFileSystemCache will initialize uri cache when first invoke FileSystem::get
, and the invoke param Configuration here will be bundled with the FileSystem instance. While hdfsEnvironment::getConfiguration
will return a CachingJobConf which wrapped by JobConf in HudiRecordCursors::createRecordReader
will lost custom catalog properties 'hive.config.resources'. We can fix it by calling hdfsEnvironment::getFilesystem in advance which will trigger CachingJobConf::createFileSystem
to use full configured resources.
Node: this bug can only be reproduced on standalone worker, bcz in debug mode(node-scheduler.include-coordinator=true), hdfsEnvironment::getFilesystem
will be called in HudiSplitManager before, which will initialize the filesystem cache.
== NO RELEASE NOTE ==
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: permanentstar / name: Suheng (4ceadd8e08440d3f612ccf3ec5aacc31e26a0af8)
Hi, @7c00 do you have time to review this PR, thanks :)
@codope Presto and Trino have a quite different way from Hudi to handle Hadoop FileSystem instances. I think the problem might also be existing in Trino. Does Hudi have any plan to make it work better to handle FileSystems?
@permanentstar Does this PR https://github.com/prestodb/presto/pull/18441 help fix your issue?
@pratyakshsharma PR #18441 solved problem on hive-connector which is a legacy read path on hudi datasource, while this PR is aimed at mor read on new hudi-connector way :)
Based on the discussions in #18542 the changes look good. One small request, please make the first letter of the commit message in uppercase as per the guidelines here - https://cbea.ms/git-commit/.
Thank you for your contribution @permanentstar .
@pratyakshsharma commit msg updated, but I notice one ci test failed by timeout, I think it has nothing business with this PR, can you check that, thanks.
@permanentstar Yeah it should be fine, you can probably retrigger the tests by pushing the commit again.
Hi @permanentstar, there are some test failures here, can you check? @pratyakshsharma Please give your explicit approval before we can send this to a committer for final review. Thanks!
Hi @yzhang1991 I have tried repush several times, but every time the ci show different failing, I think the PR should only relate to hudi connector, and have no idea about why the random failure happens. May be you can give some help, thanks.
After one more try, all ci test passed, thank you for your help @pratyakshsharma @yzhang1991
Looks good to me. Thank you for taking care of all the comments patiently.
@prestodb/committers please take a pass.
@prestodb/committers ping!