marklogic-data-hub
[BUG] Information about an error thrown in sourceQuery script is lost. An error in sourceQuery script does not trigger 'stop-on-error'
The issue
Please provide reproducible steps for the issue and a description of what behavior you expected instead.
The steps to reproduce the issue (use as many numbered steps as you need):
- Create any flow with at least 2 steps with "stop-on-error" option set to true.
- Throw any error in sourceQuery script in the first of two declared steps, i.e.
fn.error(xs.QName("mock-ERROR"), "something happened")
- Run the flow.
What behavior were you expecting?
Firstly, the flow execution should stop at step 1, which is not the case. The first step will have "status" : "failed step 1" instead of "canceled step 1".
Secondly, the error trace does not contain any information about the QName of the error or its description. It only says that the returned status code was 400 and that the program was "Unable to collect items to process", i.e.:
[ "java.lang.RuntimeException: Unable to collect items to process for flow TestFlow and step 1; cause: {\"errorResponse\":{\"statusCode\":400, \"status\":\"Unable to collect items to process;
Technical details
Please provide the following version information:
- Operating System = w10
- MarkLogic = 9.0-13.1
- Data Hub = 5.5.0
Thanks @KrystianPlackowski , I'm working on reproducing that now.
I verified that if the sourceQuery throws an error for step 1, then step 2 is still executed.
However, when I run a flow via Gradle, the stepOutput in the JSON response and the Job document in the jobs database both contain the following:
"stepOutput" : [ "java.lang.RuntimeException: Unable to collect items to process for flow customHookFlow and step 1; cause: {\"errorResponse\":{\"statusCode\":400, \"status\":\"Unable to collect items to process; sourceQuery script: cts.uris(null, null, fn.error(xs.QName(\\\"mock-ERROR\\\"), \\\"something happened\\\")); ...
That is consistent with how DHF handles errors when items fail in a step, so I believe that is correct.
How are you running a flow - via Gradle or the client JAR?
Your output is correct, but please notice that the error message DHF provides simply copies the text of the script. In my case the script looks like this:
"sourceQuery" : "const collectorModule = require(\"/entities/Audit/harmonize/CleanupAudit/collector.xqy\");\ncollectorModule.collect(options);",
So I'm calling an external module here. If an error happens in this external module, DHF gives me no information about what triggered it. If this is not a bug, then I request a feature to include the missing error message in that case.
I assume you're using sourceQueryIsScript here? I'm trying the same thing; I'm forcing the sourceQuery to throw an error by doing the following:
"sourceQueryIsScript": true,
"sourceQuery": "require('/data-hub/5/impl/hub-entities.xqy').uberModel('invalid')"
That's calling the uber-model function in DHF's hub-entities.xqy module, which results in an error being thrown since that function doesn't allow for a string parameter. In stepOutput, I get the following:
"stepOutput" : [ "java.lang.RuntimeException: Unable to collect items to process for flow customHookFlow and step 1; cause: {\"errorResponse\":{\"statusCode\":400, \"status\":\"Unable to collect items to process; sourceQuery script: require('/data-hub/5/impl/hub-entities.xqy').uberModel('invalid'); error: \\\"invalid\\\",object-node()\", ...
I tested that on ML 9.0-13 as well. And while that error isn't super-helpful, it's what ML is generating, so DHF is not dropping it anywhere.
So I haven't been able to reproduce the second half of your bug report yet. The first half - stopOnError not working - is being addressed via the internal DHFPROD-7570 ticket and will be fixed in 5.5.2 and 5.6.0.
I tested that on ML 9.0-13 as well. And while that error isn't super-helpful, it's what ML is generating, so DHF is not dropping it anywhere.
Well, in DHF 4, if you put fn.error(xs.QName("mock-ERROR"), "something happened") into the so-called collector plugin of the flow (which should be the equivalent of the source query script in DHF 5), then the error is properly rendered.
I'm afraid you are completely wrong if, in the case of DHF 5, this:
[ "java.lang.RuntimeException: Unable to collect items to process for flow TestFlow and step 1; cause: {\"errorResponse\":{\"statusCode\":400, \"status\":\"Unable to collect items to process;
is "what ML is generating" in your opinion.
Outcome of the investigation for internal support case 32575:
Second part of the problem report: JavaScript exception not being reported
When "sourceQueryIsScript" is set to true, the sourceQuery script is passed through by the collector without any modification (no cts.uris() applied): https://github.com/marklogic/marklogic-data-hub/blob/v5.5.0/marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/endpoints/collectorLib.sjs#L35
Then the script is evaluated through an xdmp.eval() call by the collector endpoint: https://github.com/marklogic/marklogic-data-hub/blob/v5.5.0/marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/endpoints/collector.sjs#L89
Finally, the harmonizer takes care of mapping the collector output items to their "uri" property: https://github.com/marklogic/marklogic-data-hub/blob/v5.5.0/marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/impl/flow.sjs#L96
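The two modes can be illustrated with a minimal sketch. The helper name `buildCollectorScript` and the sample query text are hypothetical and are not taken from the DHF source; the wrapping shape `cts.uris(null, null, ...)` is inferred from the stepOutput quoted earlier in this thread, so the real collectorLib.sjs may do more than this.

```javascript
// Hypothetical helper, NOT the actual DHF implementation: it only mirrors the
// two cases described in this thread.
function buildCollectorScript(sourceQuery, sourceQueryIsScript) {
  if (sourceQueryIsScript) {
    // Script mode: the text is handed to the evaluator exactly as written.
    return sourceQuery;
  }
  // Query mode: the text is treated as a cts query and wrapped in cts.uris().
  return `cts.uris(null, null, ${sourceQuery})`;
}

// Illustrative inputs only; the collection name "Audit" is an assumption.
const asScript = buildCollectorScript("collectorModule.collect(options);", true);
const asQuery = buildCollectorScript('cts.collectionQuery("Audit")', false);

console.log(asScript);
console.log(asQuery);
```

In script mode the error message can only echo the script text itself, which is why a script that merely calls into another module reveals nothing about what failed inside that module.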
The errorResponse simply reports the error from the point of view of the /v1/internal/hubcollector5 endpoint, which returned HTTP 400 with the server error RESTAPI-SRVEXERR. The endpoint response also includes the sourceQuery script and the "data" property of the JavaScript exception, but that property was empty (error:).
The sourceQuery script is evaluated in a try ... catch block by the collector: https://github.com/marklogic/marklogic-data-hub/blob/v5.5.0/marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/endpoints/collector.sjs#L80
If a JavaScript exception occurs during the xdmp.eval() of the "sourceQuery" script, the MarkLogic error RESTAPI-SRVEXERR is thrown through this statement:
httpUtils.throwBadRequest(`Unable to collect items to process; sourceQuery script: ${javascript}; error: ${err.data}`);
${err.data} would normally report the root exception, but in your case you used fn.error(xs.QName("mock-ERROR"), "something happened") without specifying any "data" sequence argument (see fn.error() at https://docs.marklogic.com/fn:error#data).
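A small sketch of why the message ends in a bare "error:" in that situation. The error objects below are plain mocks, not real MarkLogic exceptions, and the assumption that a missing "data" argument yields an empty array is mine; the function `collectorMessage` is a hypothetical stand-in that only reuses the template shape of the statement quoted above.

```javascript
// Mock error objects standing in for what xdmp.eval() might throw.
// Assumption: with no "data" argument, err.data is an empty array.
const withoutData = { name: "something happened", data: [] };
const withData = {
  name: "something happened",
  data: ["mock-ERROR", "something happened"],
};

// Hypothetical stand-in using the same template shape as the
// throwBadRequest() message quoted above.
function collectorMessage(javascript, err) {
  return `Unable to collect items to process; sourceQuery script: ${javascript}; error: ${err.data}`;
}

const script = "collectorModule.collect(options);";
console.log(collectorMessage(script, withoutData)); // ends with "error: "
console.log(collectorMessage(script, withData));    // ends with "error: mock-ERROR,something happened"
```

An empty array interpolates to an empty string in a template literal, so nothing follows "error:". The exact formatting of a populated "data" value in MarkLogic may differ from this mock.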
Workaround: I suggest adding the "data" argument to the fn.error() call to get more information about the error that occurred on the server side while executing the JavaScript script:
{
...
"sourceQueryIsScript": true,
"sourceQuery": "cts.uris(null, null, fn.error(xs.QName(\"mock-ERROR\"), \"something happened\",[\"mock-ERROR\",\"something happened\"]))"
}
But as explained above, only the "data" property would be reported in the stepOutput, while the exception object has many other properties: message, data, name, stack, stackFrames, etc. To get an idea of the exception object's properties, you could try the following in QConsole:
try{
xdmp.eval('fn.error(xs.QName(\"mock-ERROR\"), \"something happened\",[\"mock-ERROR\",\"omething happened\"])')
} catch (err) {
err
}
returns
{"stack":"something happened: fn.error(xs.QName(\"mock-ERROR\"), \"something happened\",[\"mock-ERROR\",\"omething happened\"]) -- [\"mock-ERROR\", \"omething happened\"]\nin [anonymous], at 1:3 [javascript]\nin [anonymous], at 3:5 [javascript]\nin /MarkLogic/appservices/qconsole/qconsole-js-amped.sjs, at 42:16 [javascript]\nin /MarkLogic/appservices/qconsole/qconsole-js-amped.sjs, at 30:20 [javascript]\nin /MarkLogic/appservices/qconsole/qconsole-js-amped.sjs, at 32:24 [javascript]\nin /MarkLogic/appservices/qconsole/qconsole-js-amped.sjs, at 44:12, in doEval() [javascript]\nin /qconsole/endpoints/evaljs.sjs, at 337:23 [javascript]\nin xdmp:eval(\"fn.error(xs.QName(\\\"mock-ERROR\\\"), \\\"something happened\\\",[\\\"moc...\") [javascript]\nin /qconsole/endpoints/evaljs.sjs [javascript]\nin /qconsole/endpoints/evaljs.sjs,\nin xdmp:invoke(function bound (), Element(\"<options xmlns='xdmp:eval'><database>18044477627012930808</database>...</options>\")) [javascript]\nin /qconsole/endpoints/evaljs.sjs [javascript]\nin /MarkLogic/appservices/qconsole/qconsole-js-amped.sjs [javascript]\nin /qconsole/endpoints/evaljs.sjs [javascript]", "mlerr":{}, "data":["[\"mock-ERROR\", \"omething happened\"]"], "message":"something happened: fn.error(xs.QName(\"mock-ERROR\"), \"something happened\",[\"mock-ERROR\",\"omething happened\"]) -- ", "name":"something happened", "retryable":false, "stackFrames":[{"uri":"[anonymous]", "line":1, "column":3, "operation":"fn.error(xs.QName(\"mock-ERROR\"), \"something happened\",[\"mock-ERROR\",\"omething happened\"])", "language":"javascript"}, {"uri":"[anonymous]", "line":3, "column":5, "language":"javascript", "isEval":true}, {"uri":"/MarkLogic/appservices/qconsole/qconsole-js-amped.sjs", "line":42, "column":16, "language":"javascript", "isEval":false}, {"uri":"/MarkLogic/appservices/qconsole/qconsole-js-amped.sjs", "line":30, "column":20, "language":"javascript", "isEval":false}, 
{"uri":"/MarkLogic/appservices/qconsole/qconsole-js-amped.sjs", "line":32, "column":24, "language":"javascript", "isEval":false}, {"uri":"/MarkLogic/appservices/qconsole/qconsole-js-amped.sjs", "line":44, "column":12, "operation":"doEval", "language":"javascript", "isEval":false}, {"uri":"/qconsole/endpoints/evaljs.sjs", "line":337, "column":23, "operation":"xdmp:eval(\"fn.error(xs.QName(\\\"mock-ERROR\\\"), \\\"something happened\\\",[\\\"moc...\")", "language":"javascript", "isEval":false}], "toString":"function toString()"}
I don't see the issue as a bug. However, as an enhancement, I think it would be useful to add some other fields of the JavaScript exception object (stack, stackFrames, etc.) to the error message thrown by the collector endpoint: https://github.com/marklogic/marklogic-data-hub/blob/v5.5.0/marklogic-data-hub/src/main/resources/ml-modules/root/data-hub/5/endpoints/collector.sjs#L92 For example, change the exception re-throw statement in the collector to:
httpUtils.throwBadRequest(`Unable to collect items to process; sourceQuery script: ${javascript}; error: ${err.data} , errorStack: ${err.stack}`);
This would give more useful details in my opinion.
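One possible variation on that enhancement, sketched in plain JavaScript: only include the fields that are actually populated, so an empty "data" does not leave a dangling label. The helper name `describeError` is hypothetical and does not exist in DHF; the field names (name, message, data, stack) are the ones visible in the exception object printed above.

```javascript
// Hypothetical helper: collects whichever fields of the caught error are populated.
function describeError(err) {
  const parts = [];
  // Treat an empty array (or any falsy value) as "no data".
  const hasData = Array.isArray(err.data) ? err.data.length > 0 : Boolean(err.data);
  if (err.name) parts.push(`name: ${err.name}`);
  if (err.message) parts.push(`message: ${err.message}`);
  if (hasData) parts.push(`data: ${err.data}`);
  if (err.stack) parts.push(`errorStack: ${err.stack}`);
  return parts.join("; ");
}

// Mock error with an empty "data" property, as in the original report.
const mockErr = {
  name: "something happened",
  message: "something happened: fn.error(...)",
  data: [],
  stack: "in [anonymous], at 1:3 [javascript]",
};
console.log(describeError(mockErr));
```

The result could then be interpolated into the existing message in place of `${err.data}`, at the cost of a longer stepOutput entry.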