
CUMULUS-2980 - Fix intermittent test failures from PDRs

Open botanical opened this issue 2 years ago • 1 comments

Summary of changes

Addresses CUMULUS-2980: Fix intermittent test failures from PDRs

Changes

  • Tested spec/serial/ingestPdrWithNodeNameFtpSpec.js
  • Tested spec/parallel/ingest/ingestPdrWithNodeNameSpec.js
  • Tested spec/parallel/ingest/ingestFromPdrSpec.js
  • Tested ingestFromPdrWithChildWorkflowMetaSpec.js
  • Tested ingestFromPdrWithExecutionNamePrefixSpec.js using the repeat test script.

The failure I was able to reproduce occurred when PDR-dependent records, such as granules, were not cleaned up, which caused problems in the setup. To remedy this, I updated ingestPdrWithNodeNameSpec to use deleteProvidersAndAllDependenciesByHost.
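To illustrate the idea only (this is not the Cumulus helper itself), here is a minimal sketch of host-based cleanup that removes dependent records before the provider they belong to. The in-memory `store`, the record shapes, and the `sketchDeleteByHost` name are all hypothetical stand-ins for the real API calls.

```javascript
// Hypothetical in-memory stand-in for the API; not the Cumulus client.
function createStore() {
  return {
    providers: [
      { id: 'p1', host: 'ftp.example.test' },
      { id: 'p2', host: 'other.example.test' },
    ],
    pdrs: [{ pdrName: 'a.PDR', provider: 'p1' }],
    granules: [{ granuleId: 'g1', pdrName: 'a.PDR', provider: 'p1' }],
  };
}

// Delete every provider registered for `host`. Dependent granules and PDRs
// are removed first so nothing stale is left to interfere with a later setup.
function sketchDeleteByHost(store, host) {
  const providerIds = store.providers
    .filter((provider) => provider.host === host)
    .map((provider) => provider.id);

  for (const id of providerIds) {
    store.granules = store.granules.filter((granule) => granule.provider !== id);
    store.pdrs = store.pdrs.filter((pdr) => pdr.provider !== id);
    store.providers = store.providers.filter((provider) => provider.id !== id);
  }
  return providerIds;
}
```

Calling `sketchDeleteByHost(createStore(), 'ftp.example.test')` in this toy model removes `p1` together with its granule and PDR, and leaves `p2` untouched.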

PR Checklist

  • [x] Update CHANGELOG
  • [ ] Unit tests
  • [ ] Ad-hoc testing - Deploy changes and test manually
  • [x] Integration tests

botanical, Aug 02 '22 15:08

I think this is right but it looks like there are a couple errors thrown in the cleanup for ingestPdrWithNodeNameSpec in your latest CI run:

API invoke error: /collections/MOD09GQ_test-jtran-int-tf-IngestFromPdrWithNodeName-1659460098103/006 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Record in collections with identifiers {\\\"name\\\":\\\"MOD09GQ_test-jtran-int-tf-IngestFromPdrWithNodeName-1659460098103\\\",\\\"version\\\":\\\"006\\\"} does not exist.\"}

and

API invoke error: /providers/s3_provider_test-jtran-int-tf-IngestFromPdrWithNodeName-1659460098103 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Provider s3_provider_test-jtran-int-tf-IngestFromPdrWithNodeName-1659460098103 not found.

Those may have been issues prior to this change but I'm curious if there's a problem with the cleanup.

https://ci.earthdata.nasa.gov/download/CUM-CBA3254-DIS/build_logs/CUM-CBA3254-DIS-2.log

I took a closer look at the logs, which look similar to this:

{"level":"error","message":"Attempt 1 failed. API invoke error: /collections/MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175/006 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Record in collections with identifiers {\\\"name\\\":\\\"MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175\\\",\\\"version\\\":\\\"006\\\"} does not exist.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:23.685Z"}
{"level":"error","message":"Attempt 1 failed. API invoke error: /providers/s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Provider s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 not found.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:23.718Z"}
{"level":"error","message":"Attempt 2 failed. API invoke error: /collections/MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175/006 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Record in collections with identifiers {\\\"name\\\":\\\"MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175\\\",\\\"version\\\":\\\"006\\\"} does not exist.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:25.152Z"}
{"level":"error","message":"Attempt 2 failed. API invoke error: /providers/s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Provider s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 not found.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:25.177Z"}
{"level":"error","message":"Attempt 3 failed. API invoke error: /collections/MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175/006 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Record in collections with identifiers {\\\"name\\\":\\\"MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175\\\",\\\"version\\\":\\\"006\\\"} does not exist.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:27.635Z"}
{"level":"error","message":"Attempt 3 failed. API invoke error: /providers/s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 returned 404: {\"statusCode\":404,\"error\":\"Not Found\",\"message\":\"Provider s3_provider_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175 not found.\"}.","sender":"@api-client/cumulusApiClient","timestamp":"2022-08-12T22:37:27.744Z"}
Error: Error: /collections/MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175/006 returned 404: {"statusCode":404,"error":"Not Found","message":"Record in collections with identifiers {\"name\":\"MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175\",\"version\":\"006\"} does not exist."}. Failed to get collection {"name":"MOD09GQ_test-jtran-tf-IngestFromPdrWithNodeName-1660343826175","version":"006","dataType":"MOD09GQ","process":"modis","duplicateHandling":"replace","granuleId":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}$","granuleIdExtraction":"(MOD09GQ\\..*)(\\.hdf|\\.cmr|_ndvi\\.jpg)","reportToEms":false,"url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{substring(file.fileName, 0, 3)}","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104.hdf","meta":{"s3MultipartChunksizeMb":16},"files":[{"bucket":"protected","regex":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}\\.hdf$","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104.hdf","url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{extractYear(cmrMetadata.Granule.Temporal.RangeDateTime.BeginningDateTime)}/{substring(file.fileName, 0, 3)}/jtran-tf-IngestFromPdrWithNodeName-1660343826175/"},{"bucket":"private","regex":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}\\.hdf\\.met$","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104.hdf.met","url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{substring(file.fileName, 0, 3)}/jtran-tf-IngestFromPdrWithNodeName-1660343826175/"},{"bucket":"protected-2","regex":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}\\.cmr\\.xml$","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104.cmr.xml","url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{substring(file.fileName, 0, 
3)}/jtran-tf-IngestFromPdrWithNodeName-1660343826175/"},{"bucket":"public","regex":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}_ndvi\\.jpg$","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104_ndvi.jpg","url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{substring(file.fileName, 0, 3)}/jtran-tf-IngestFromPdrWithNodeName-1660343826175/"},{"bucket":"private","regex":"^MOD09GQ\\.A[\\d]{7}\\.[\\S]{6}\\.006\\.[\\d]{13}\\.hdf\\.md5$","sampleFileName":"MOD09GQ.A2017025.h21v00.006.2017034065104.hdf.md5","url_path":"{cmrMetadata.Granule.Collection.ShortName}___{cmrMetadata.Granule.Collection.VersionId}/{substring(file.fileName, 0, 3)}/jtran-tf-IngestFromPdrWithNodeName-1660343826175/"}]}

This appears to be the expected behavior. When adding new collections and providers, the setup first checks whether they already exist (collectionExists, providerExists). Those functions call the API to get the collection or provider, and because of how invokeApi is configured, the request is tried 3 times before it stops. The 404 errors in the logs come from those existence checks.
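As a rough sketch of why a single existence check can log three errors, assume a retrying wrapper around a getter that throws on 404. The names `invokeWithRetries` and `recordExists`, the option names, and the error shape are hypothetical; they are not the actual `invokeApi`, `collectionExists`, or `providerExists` implementations.

```javascript
// Hypothetical retry wrapper: calls `fn` up to `attempts` times and logs
// each failure, then rethrows the last error.
async function invokeWithRetries(fn, { attempts = 3, delayMs = 0, log = console.error } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      log(`Attempt ${attempt} failed. ${error.message}`);
      if (attempt < attempts && delayMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Hypothetical existence check: a 404 after all attempts means "does not
// exist"; any other error is rethrown to the caller.
async function recordExists(getRecord, options) {
  try {
    await invokeWithRetries(getRecord, options);
    return true;
  } catch (error) {
    if (error.statusCode === 404) return false;
    throw error;
  }
}
```

Under these assumptions, checking for a record that is absent yields three "Attempt N failed" log lines followed by a `false` result, so the errors are noisy but do not by themselves indicate a failed cleanup.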

@npauzenga

botanical, Aug 13 '22 00:08