[HUDI-4503] support for parsing identifier with catalog
What is the purpose of the pull request
This PR adds support for identifiers in the catalog.database.table format.
On Spark 3, which supports multiple catalogs, we cannot simply convert an UnresolvedRelation to a TableIdentifier directly.
Doing so fails when a query also involves tables from another catalog: resolution is blocked and an exception is thrown, as in https://github.com/apache/hudi/issues/6223#issuecomment-1198952333.
Brief change log
Use spark.sessionState.analyzer.CatalogAndIdentifier to parse the identifier, whether or not it includes a catalog.
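To illustrate the kind of parsing this change relies on, here is a minimal, Spark-free sketch of resolving a multi-part name with or without a leading catalog. It is not Spark's or Hudi's code: `ResolvedName`, `IdentifierSketch`, `resolve`, and the `hudi_catalog` name are hypothetical, the defaults `spark_catalog` and `default` are assumed for the current catalog and namespace, and special cases such as temporary views are left out.

```scala
// Simplified stand-in for catalog-aware name resolution.
// All names here (ResolvedName, IdentifierSketch, resolve) are hypothetical.
final case class ResolvedName(catalog: String, namespace: Seq[String], table: String)

object IdentifierSketch {
  def resolve(
      nameParts: Seq[String],
      registeredCatalogs: Set[String],
      currentCatalog: String = "spark_catalog",       // assumed default
      currentNamespace: Seq[String] = Seq("default")  // assumed default
  ): ResolvedName = {
    require(nameParts.nonEmpty, "identifier must have at least one part")
    nameParts match {
      // "tbl": use the current catalog and the current namespace.
      case Seq(table) =>
        ResolvedName(currentCatalog, currentNamespace, table)
      // "cat.db.tbl": the first part names a registered catalog.
      case head +: rest if registeredCatalogs.contains(head) =>
        ResolvedName(head, rest.init, rest.last)
      // "db.tbl": no catalog prefix, so every part belongs to the identifier.
      case parts =>
        ResolvedName(currentCatalog, parts.init, parts.last)
    }
  }
}
```

With `Set("hudi_catalog")` registered, `Seq("hudi_catalog", "db", "tbl")` resolves against `hudi_catalog`, while `Seq("db", "tbl")` falls back to the current catalog. The point of the sketch is that the first name part cannot be assumed to be a database once multiple catalogs are in play.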
Verify this pull request
Covered by the TestSpark3Catalog unit test.
Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
@alexeykudinkin please help to review this again.
@YannByron can you please also update the description to make sure it has relevant info?
CI report:
- 066e20303c737b6c0b441c5a92cb406ca45386ba Azure: SUCCESS
Bot commands
@hudi-bot supports the following commands:
- @hudi-bot run azure: re-run the last Azure build
@YannByron @xushiyan I have also started experimenting with this issue myself, to test my hunch that we can avoid pulling more of Spark's resolution logic into Hudi: https://github.com/apache/hudi/pull/6361/files
Let's hold off on merging this PR until we confirm whether we can avoid pulling in that resolution logic and make things simpler in the end.