presto icon indicating copy to clipboard operation
presto copied to clipboard

fix catalog config resource lost on hudi mor read

Open permanentstar opened this issue 2 years ago • 2 comments

PR https://github.com/prestodb/presto/pull/18115 involed hudi connector break, when setting 'hive.copy-on-first-write-configuration-enabled'=false will allow us to read hudi cow table, but still fail in mor read. In my case the viewfs mount settings(config throught hive.config.resources) were lost.

com.facebook.presto.spi.PrestoException: Error opening Hive split HudiSplit{baseFile=Optional[viewfs://ss-hadoop/project/ads/warehouse/test2.db/sh_hudi_11_mor_test1/dt=09-09/5fd6c84b-5192-4513-9d45-adeade831bb1-0_1-1-2_20220909175427537.parquet:0+435024], logFiles=[]} using org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat: ViewFs: Cannot initialize: Empty Mount table in config for viewfs://ss-hadoop/
	at com.facebook.presto.hudi.HudiRecordCursors.createRecordReader(HudiRecordCursors.java:117)
	at com.facebook.presto.hudi.HudiRecordCursors.lambda$createRealtimeRecordCursor$0(HudiRecordCursors.java:77)
	at com.facebook.presto.hive.authentication.UserGroupInformationUtils.lambda$executeActionInDoAs$0(UserGroupInformationUtils.java:29)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1726)
	at com.facebook.presto.hive.authentication.UserGroupInformationUtils.executeActionInDoAs(UserGroupInformationUtils.java:27)
	at com.facebook.presto.hive.authentication.ImpersonatingHdfsAuthentication.doAs(ImpersonatingHdfsAuthentication.java:39)
	at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:81)
	at com.facebook.presto.hudi.HudiRecordCursors.createRealtimeRecordCursor(HudiRecordCursors.java:76)
	at com.facebook.presto.hudi.HudiPageSourceProvider.createPageSource(HudiPageSourceProvider.java:127)

On presto worker, PrestoFileSystemCache will initialize uri cache when first invoke FileSystem::get, and the invoke param Configuration here will be bundled with the FileSystem instance. While hdfsEnvironment::getConfiguration will return a CachingJobConf which wrapped by JobConf in HudiRecordCursors::createRecordReader will lost custom catalog properties 'hive.config.resources'. We can fix it by calling hdfsEnvironment::getFilesystem in advance which will trigger CachingJobConf::createFileSystem to use full configured resources.

Node: this bug can only be reproduced on standalone worker, bcz in debug mode(node-scheduler.include-coordinator=true), hdfsEnvironment::getFilesystem will be called in HudiSplitManager before, which will initialize the filesystem cache.

== NO RELEASE NOTE ==

permanentstar avatar Sep 16 '22 03:09 permanentstar

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: permanentstar / name: Suheng (4ceadd8e08440d3f612ccf3ec5aacc31e26a0af8)

Hi, @7c00 do you have time to review this PR, thanks :)

permanentstar avatar Sep 22 '22 08:09 permanentstar

@codope Presto and Trino have a quite different way from Hudi to handle Hadoop FileSystem instances. I think the problem might also be existing in Trino. Does Hudi have any plan to make it work better to handle FileSystems?

7c00 avatar Sep 27 '22 05:09 7c00

@permanentstar Does this PR https://github.com/prestodb/presto/pull/18441 help fix your issue?

pratyakshsharma avatar Oct 03 '22 17:10 pratyakshsharma

@pratyakshsharma PR #18441 solved problem on hive-connector which is a legacy read path on hudi datasource, while this PR is aimed at mor read on new hudi-connector way :)

permanentstar avatar Oct 10 '22 03:10 permanentstar

Based on the discussions in #18542 the changes look good. One small request, please make the first letter of the commit message in uppercase as per the guidelines here - https://cbea.ms/git-commit/.

Thank you for your contribution @permanentstar .

pratyakshsharma avatar Oct 27 '22 09:10 pratyakshsharma

@pratyakshsharma commit msg updated, but I notice one ci test failed by timeout, I think it has nothing business with this PR, can you check that, thanks.

permanentstar avatar Oct 28 '22 11:10 permanentstar

@permanentstar Yeah it should be fine, you can probably retrigger the tests by pushing the commit again.

pratyakshsharma avatar Oct 31 '22 08:10 pratyakshsharma

Hi @permanentstar, there are some test failures here, can you check? @pratyakshsharma Please give your explicit approval before we can send this to a committer for final review. Thanks!

ethanyzhang avatar Oct 31 '22 17:10 ethanyzhang

Hi @yzhang1991 I have tried repush several times, but every time the ci show different failing, I think the PR should only relate to hudi connector, and have no idea about why the random failure happens. May be you can give some help, thanks.

permanentstar avatar Oct 31 '22 18:10 permanentstar

After one more try, all ci test passed, thank you for your help @pratyakshsharma @yzhang1991

permanentstar avatar Nov 01 '22 03:11 permanentstar

Looks good to me. Thank you for taking care of all the comments patiently.

@prestodb/committers please take a pass.

pratyakshsharma avatar Nov 01 '22 12:11 pratyakshsharma

@prestodb/committers ping!

pratyakshsharma avatar Nov 15 '22 13:11 pratyakshsharma