datahub
S3 ingestion: Error when trying to read JSON file
Description: Reading a JSON file stored in an AWS S3 bucket fails with a validation error. Input JSON in the S3 bucket: { "id": 21, "name": "service", "displayName": "service", "implClass": "org.apache.ran.services.Service", "label": "Service", "description": "service", "resources": "Service" }
Error Log:
. Run with --debug to get full trace
[2022-11-24 14:03:02,843] INFO {datahub.entrypoints:191} - DataHub CLI version: 0.8.42 at /tmp/datahub/ingest/venv-s3-0.8.42/lib/python3.10/site-packages/datahub/__init__.py
[2022-11-24 14:03:05,082] INFO {datahub.ingestion.run.pipeline:103} - sink wrote workunit container-urn:li:container:8fd8e279d98b2ef71ef1e9dfa6e7d178-to-urn:li:dataset:(urn:li:dataPlatform:s3,s3-datahub-ingestion/servicedef.json,PROD)
[2022-11-24 14:03:05,182] ERROR {asyncio:1744} - Task exception was never retrieved
future: <Task finished name='Task-2' coro=<retrieve_version_stats() done, defined at /tmp/datahub/ingest/venv-s3-0.8.42/lib/python3.10/site-packages/datahub/upgrade/upgrade.py:159> exception=ValidationError(model='VersionStats', errors=[{'loc': ('version',), 'msg': 'none is not an allowed value', 'type': 'type_error.none.not_allowed'}])>
Traceback (most recent call last):
  File "/tmp/datahub/ingest/venv-s3-0.8.42/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 184, in retrieve_version_stats
    current=VersionStats(
  File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for VersionStats
version
  none is not an allowed value (type=type_error.none.not_allowed)
2022-11-24 14:03:14.873377 [exec_id=2c45b977-ee2e-4ffa-8466-b57c5186207b] INFO: Failed to execute 'datahub ingest'
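The traceback in the log points at `retrieve_version_stats` in `datahub/upgrade/upgrade.py`, not at the code that reads the file, so it may help to first rule out the input itself. The following is a minimal sketch using only the Python standard library. It checks that the JSON from the description is syntactically valid and is a single object. `check_json` is a hypothetical helper written for this illustration and is not part of DataHub; the check says nothing about how the S3 source infers a schema from the file.

```python
import json

# Input copied from the issue description.
raw = """
{ "id":21, "name": "service", "displayName": "service",
  "implClass": "org.apache.ran.services.Service", "label": "Service",
  "description": "service", "resources": "Service" }
"""


def check_json(text: str) -> dict:
    """Parse text as JSON and return the top-level object.

    Hypothetical helper for illustration only (not a DataHub API).
    Raises json.JSONDecodeError if the text is not valid JSON and
    TypeError if the top-level value is not a JSON object.
    """
    parsed = json.loads(text)
    if not isinstance(parsed, dict):
        raise TypeError(f"expected a JSON object, got {type(parsed).__name__}")
    return parsed


record = check_json(raw)
print(sorted(record))  # list the top-level field names
```

If this parses cleanly, the file content is unlikely to be the source of the `ValidationError` shown in the log.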
@shlatha1990, could you please upgrade your client? You are using quite an old version (0.8.42).
Hi @shlatha1990, this looks like a troubleshooting question rather than a bug. We're happy to provide community support on our Slack channel, but we currently reserve GitHub issues for bugs.
If you're still having trouble, please join us at [slack.datahubproject.io](https://slack.datahubproject.io/) and we can troubleshoot there. For now, I'm going to close this issue.