
NullPointerException when attempting to deploy Ingestion Source with latest release

Open agapebondservant opened this issue 1 year ago • 0 comments

Describe the bug
I get the following error when I attempt to create an S3 ingestion source (using the latest Helm chart as of today, 8/20). I was able to replicate this with Postgres as well.

2024-08-20 22:02:04,799 [ForkJoinPool.commonPool-worker-11] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:45 - Failed to execute
java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to perform update against ingestion source with urn { name: "CIFAR Data Source", type: "s3", schedule: { interval: "0 0 * * *", timezone: "America/New_York" }, config: { recipe: "{"source":{"type":"s3","config":{"path_specs":[{"include":"s3://my-resource/my.parquet"}],"env":"PROD","aws_config":{"aws_region":"us-east-2","aws_access_key_id":"HIDDEN","aws_secret_access_key":"HIDDEN"},"profiling":{"enabled":false}}}}", executorId: "default", debugMode: false, extraArgs: [] } }
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1770)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1760)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)
Caused by: java.lang.RuntimeException: Failed to perform update against ingestion source with urn { name: "CIFAR Data Source", type: "s3", schedule: { interval: "0 0 * * *", timezone: "America/New_York" }, config: { recipe: "{"source":{"type":"s3","config":{"path_specs":[{"include":"s3://my-resource/my.parquet"}],"env":"PROD","aws_config":{"aws_region":"us-east-2","aws_access_key_id":"HIDDEN","aws_secret_access_key":"HIDDEN"},"profiling":{"enabled":false}}}}", executorId: "default", debugMode: false, extraArgs: [] } }
	at com.linkedin.datahub.graphql.resolvers.ingest.source.UpsertIngestionSourceResolver.lambda$get$0(UpsertIngestionSourceResolver.java:93)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
	... 6 common frames omitted
Caused by: java.lang.NullPointerException: null algorithm name
	at java.base/java.util.Objects.requireNonNull(Objects.java:235)
	at java.base/java.security.MessageDigest.getInstance(MessageDigest.java:182)
	at com.linkedin.metadata.systemmetadata.ElasticSearchSystemMetadataService.toDocId(ElasticSearchSystemMetadataService.java:92)
	at com.linkedin.metadata.systemmetadata.ElasticSearchSystemMetadataService.insert(ElasticSearchSystemMetadataService.java:139)
	at com.linkedin.metadata.service.UpdateIndicesService.updateSystemMetadata(UpdateIndicesService.java:620)
	at com.linkedin.metadata.service.UpdateIndicesService.handleUpdateChangeEvent(UpdateIndicesService.java:186)
	at com.linkedin.metadata.service.UpdateIndicesService.handleChangeEvent(UpdateIndicesService.java:143)
	at com.linkedin.metadata.entity.EntityServiceImpl.preprocessEvent(EntityServiceImpl.java:1382)
	at com.linkedin.metadata.entity.EntityServiceImpl.alwaysProduceMCLAsync(EntityServiceImpl.java:1725)
	at com.linkedin.metadata.entity.EntityServiceImpl.conditionallyProduceMCLAsync(EntityServiceImpl.java:1785)
	at com.linkedin.metadata.entity.EntityServiceImpl.conditionallyProduceMCLAsync(EntityServiceImpl.java:1800)
	at com.linkedin.metadata.entity.EntityServiceImpl.lambda$emitMCL$42(EntityServiceImpl.java:1036)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
	at com.linkedin.metadata.entity.EntityServiceImpl.emitMCL(EntityServiceImpl.java:1037)
	at com.linkedin.metadata.entity.EntityServiceImpl.ingestAspects(EntityServiceImpl.java:777)
	at com.linkedin.metadata.entity.EntityServiceImpl.ingestProposalSync(EntityServiceImpl.java:1329)
	at com.linkedin.metadata.entity.EntityServiceImpl.ingestProposal(EntityServiceImpl.java:1163)
	at com.linkedin.metadata.client.JavaEntityClient.batchIngestProposals(JavaEntityClient.java:769)
	at com.linkedin.entity.client.EntityClient.ingestProposal(EntityClient.java:531)
	at com.linkedin.datahub.graphql.resolvers.ingest.source.UpsertIngestionSourceResolver.lambda$get$0(UpsertIngestionSourceResolver.java:90)
	... 7 common frames omitted
2024-08-20 22:02:04,800 [ForkJoinPool.commonPool-worker-10] ERROR c.datahub.graphql.GraphQLController:147 - Errors while executing query: mutation createIngestionSource($input: UpdateIngestionSourceInput!) {
  createIngestionSource(input: $input)
}
, result: {errors=[{message=An unknown error occurred., locations=[{line=2, column=3}], path=[createIngestionSource], extensions={code=500, type=SERVER_ERROR, classification=DataFetchingException}}], data={createIngestionSource=null}, extensions={tracing={version=1, startTime=2024-08-20T22:02:04.769098622Z, endTime=2024-08-20T22:02:04.800083738Z, duration=31045574, parsing={startOffset=962276, duration=781999}, validation={startOffset=1483114, duration=467405}, execution={resolvers=[{path=[createIngestionSource], parentType=Mutation, returnType=String, fieldName=createIngestionSource, startOffset=2148329, duration=28062247}]}}}}, errors: [DataHubGraphQLError{path=[createIngestionSource], code=SERVER_ERROR, locations=[SourceLocation{line=2, column=3}]}]
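For context on the failure: the innermost cause in the trace above is `java.lang.NullPointerException: null algorithm name`, thrown by `MessageDigest.getInstance` when `ElasticSearchSystemMetadataService.toDocId` passes it a null hash-algorithm name, suggesting the configured doc-id hashing algorithm resolved to null in this deployment. A minimal Java sketch reproducing the same NPE (the `toDocId` helper here is a hypothetical stand-in, not DataHub's actual implementation):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class NullAlgoDemo {
    // Hypothetical stand-in for toDocId: hashes a document id with an
    // algorithm name that, in DataHub, comes from configuration.
    static String toDocId(String raw, String algorithm) throws NoSuchAlgorithmException {
        // MessageDigest.getInstance(null) throws
        // NullPointerException("null algorithm name") -- the exact message
        // seen in the stack trace above.
        MessageDigest md = MessageDigest.getInstance(algorithm);
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(raw.getBytes(StandardCharsets.UTF_8))) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Works when an algorithm name is actually configured:
        System.out.println(toDocId("urn:li:dataHubIngestionSource:demo", "MD5"));

        // Fails the same way the server does when the name resolves to null:
        try {
            toDocId("urn:li:dataHubIngestionSource:demo", null);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

If this reading is right, the bug is in whatever config lookup feeds the algorithm name, not in the ingestion recipe itself, which would explain why it reproduces with Postgres as well as S3.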

To Reproduce
Steps to reproduce the behavior:

  1. Click on 'Ingestion -> Create new source'
  2. Click on 'Other'
  3. Enter the following configuration:

source:
    type: s3
    config:
        path_specs:
            - include: "s3://my-resource/my.parquet"
        env: PROD
        aws_config:
            aws_region: us-east-2
            aws_access_key_id: HIDDEN
            aws_secret_access_key: HIDDEN
        profiling:
            enabled: false

  4. See error

Expected behavior
I should be able to create the source without errors.

Screenshots
[screenshot of the error attached]

Desktop (please complete the following information):

  • OS: iOS
  • Browser: Chrome

Additional context
None

agapebondservant avatar Aug 20 '24 22:08 agapebondservant