Pip packages are not installed for Iceberg source plugin with Hive type
Describe the bug
Pip packages are not installed for the Iceberg source plugin with the hive catalog type.
To Reproduce
Steps to reproduce the behavior:
- Create an Iceberg ingestion source from the example below, but without type: hive
- Click Save & Run
- Get an error about the required type field
- Update the recipe with type: hive
- Click the Save & Run button
- See the errors (logs are below):
  - ModuleNotFoundError: No module named 'thrift'
  - pyiceberg.exceptions.NotInstalledError: Apache Hive support not installed: pip install 'pyiceberg[hive]'
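To confirm which of the modules named in these errors is actually missing from the environment that runs the recipe, a small check like the following may help. It is only a diagnostic sketch: the helper name `missing_modules` is hypothetical, and it assumes you can run Python inside the same environment the ingestion executor uses.

```python
import importlib.util

# Top-level modules named in the errors above.
REQUIRED_MODULES = ("pyiceberg", "thrift")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Note that this only checks whether the modules can be found. It does not verify that pyiceberg was installed with the hive extra specifically.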
Expected behavior
The required PyPI packages, 'pyiceberg[hive]' and thrift, should be installed properly.
Solution
Run pip install every time before the recipe is executed.
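As a rough illustration of that idea (not DataHub's actual executor code), the sketch below builds and runs a pip command against the interpreter that would run the recipe. The function names `pip_install_command` and `install_requirements` are hypothetical. The requirement strings come from the error message and the expected behavior above.

```python
import subprocess
import sys

# Requirement strings taken from the error message in this report.
HIVE_REQUIREMENTS = ["pyiceberg[hive]", "thrift"]


def pip_install_command(requirements):
    """Build a pip command that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *requirements]


def install_requirements(requirements=HIVE_REQUIREMENTS):
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(requirements), check=True)


if __name__ == "__main__":
    # Only print the command here; call install_requirements() to run it.
    print(" ".join(pip_install_command(HIVE_REQUIREMENTS)))
```

Using `sys.executable -m pip` rather than a bare `pip` keeps the install in the same environment as the running interpreter, which matters if the executor creates a separate virtual environment per plugin.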
Desktop (please complete the following information):
- OS: MacOS Sonoma arm64
- Browser: Chrome
- Version: 122.0.6261.112
Additional context
- Recipe:
  source:
    type: iceberg
    config:
      env: PROD
      catalog:
        name: iceberg-catalog
        type: hive
        config:
          uri: 'https://hostname1:9083'
          s3.endpoint: 'https://hostname2'
          s3.access-key-id: '${secret1}'
          s3.secret-access-key: '${secret2}'
      table_pattern:
        allow:
          - 'test.*'
      profiling:
        enabled: false
- Error logs: exec-urn_li_dataHubExecutionRequest_1d2b870e-81e9-477a-8869-39505a9f2b3d.log
- Even adding Extra Pip Libraries does not help
- DataHub version: 0.12.1
- As far as I can see, this was not fixed between 0.12.1 and 0.13.0: https://github.com/datahub-project/datahub/commits/v0.12.1/metadata-ingestion/src/datahub/ingestion/source/iceberg
@usmanovbf would you be open to sending a PR for this?
@hsheth2 sorry, I have no time for now. I hope you or a teammate will find some time to fix it.
Might be related to #10289
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
This issue was closed because it has been inactive for 30 days since being marked as stale.