Missing Redash dashboard slug causes ingestion process to fail with JSONDecodeError

Open Eric-Xu opened this issue 2 years ago • 1 comments

Describe the bug
When using the current Redash connector (DataHub v0.8.21), if a dashboard has an empty string as its slug value, the Redash API request returns an HTML page instead of JSON. Parsing that response raises a JSONDecodeError, which causes the ingestion process to fail.

To Reproduce
Steps to reproduce the behavior:

The dashboard slug is used to make the API request.
$ curl https://redash.example.com/api/dashboards/<dashboard_slug>?api_key=4JNYxxxxxxxxxxxxxxxxxxxxxxxxxxx

A correct slug value will return a JSON response.
$ curl https://redash.example.com/api/dashboards/a_correct_slug_value?api_key=4JNYxxxxxxxxxxxxxxxxxxxxxxxxxxx
{"scheduled_queue_name": "scheduled_queries", "name": "presto-prod", "pause_reason": null, "queue_name": "queries", "syntax": "sql", ...}

When the slug value is an empty string, the returned response is HTML.
$ curl https://redash.example.com/api/dashboards/?api_key=4JNYxxxxxxxxxxxxxxxxxxxxxxxxxxx

<!DOCTYPE html>
<html ng-app="app" ng-strict-di>
  <head lang="en">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta charset="UTF-8">
    <base href="/">
    <title>Redash</title>
    <script src="/static/unsupportedRedirect.js" async></script>

    <link rel="icon" type="image/png" sizes="32x32" href="/static/images/favicon-32x32.png">
    <link rel="icon" type="image/png" sizes="96x96" href="/static/images/favicon-96x96.png">
    <link rel="icon" type="image/png" sizes="16x16" href="/static/images/favicon-16x16.png">
  <link href="/static/vendors~app.22941359f2e6f98e80a1.css" rel="stylesheet"><link href="/static/app.88ac1b6c8e87b2093dc8.css" rel="stylesheet"></head>

  <body ng-class="bodyClass">
    <section>
      <app-view></app-view>
      <div class="loading-indicator">
        <div id="css-logo">
          <div id="circle">
            <div></div>
          </div>
          <div id="point">
            <div></div>
          </div>
          <div id="bars">
            <div class="bar"></div>
            <div class="bar"></div>
            <div class="bar"></div>
            <div class="bar"></div>
          </div>
        </div>
        <div id="shadow"></div>
      </div>
    </section>
  <script type="text/javascript" src="/static/vendors~app.22941359f2e6f98e80a1.js"></script><script type="text/javascript" src="/static/app.88ac1b6c8e87b2093dc8.js"></script></body>
</html>
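
To illustrate why the empty slug ends up at a different URL, here is a minimal sketch that mirrors the string formatting visible in the redash_toolbelt frame of the traceback (`"api/dashboards/{}".format(slug)`). The helper name `dashboard_path` is hypothetical and exists only for this example.

```python
def dashboard_path(slug: str) -> str:
    """Build the relative API path the same way the traceback shows.

    `dashboard_path` is a hypothetical helper for illustration; only the
    format string itself is taken from the traceback.
    """
    return "api/dashboards/{}".format(slug)


# A normal slug produces a path to one specific dashboard.
print(dashboard_path("a_correct_slug_value"))

# An empty slug collapses to the bare collection path, which is the URL
# that returned HTML in the curl example above.
print(dashboard_path(""))
```

Nothing in the formatting step rejects an empty string, so the request is sent as-is and the problem only surfaces later when the body is parsed.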

A JSONDecodeError is thrown when the HTML response is passed to the Redash client module.

---- (full traceback above) ----
File "python3.6/site-packages/datahub/entrypoints.py", line 95, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
File "python3.6/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
File "python3.6/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
File "python3.6/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
File "python3.6/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
File "python3.6/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
File "python3.6/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
File "python3.6/site-packages/datahub/cli/ingest_cli.py", line 74, in run
    pipeline.run()
File "python3.6/site-packages/datahub/ingestion/run/pipeline.py", line 149, in run
    self.source.get_workunits(), 10 if self.preview_mode else None
File "python3.6/site-packages/datahub/ingestion/source/redash.py", line 666, in get_workunits
    yield from self._emit_dashboard_mces()
File "python3.6/site-packages/datahub/ingestion/source/redash.py", line 531, in _emit_dashboard_mces
    dashboard_data = self.client.dashboard(dashboard_slug)
File "python3.6/site-packages/redash_toolbelt/client.py", line 85, in dashboard
    return self._get("api/dashboards/{}".format(slug)).json()
File "python3.6/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
File "python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
File "python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
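
The final error can be reproduced without Redash or DataHub. The sketch below feeds a shortened, made-up HTML body (standing in for the page shown above) to the standard library JSON parser, which is what ultimately runs at the bottom of the traceback.

```python
import json

# Shortened stand-in for the HTML page returned for an empty slug.
html_body = "<!DOCTYPE html>\n<html><head><title>Redash</title></head></html>"

try:
    json.loads(html_body)
except json.JSONDecodeError as exc:
    # The parser fails on the very first character, because "<" cannot
    # start any JSON value.
    message = str(exc)
    print(message)  # Expecting value: line 1 column 1 (char 0)
```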

Expected behavior
Check that the dashboard slug is non-empty before calling the /api/dashboards/<dashboard_slug> Redash API endpoint. https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/redash.py#L530

If the slug does not exist, report the dashboard as dropped and continue with the ingestion process.
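
A rough sketch of what such a guard could look like is below. It is not the connector's actual code: the function `iter_dashboard_details`, the `fetch_dashboard` callable, the `dropped` list, and the stub data are all hypothetical names chosen for this example, and the real source would need to use its own report object to record dropped dashboards.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


def iter_dashboard_details(
    dashboards: Iterable[Dict[str, Any]],
    fetch_dashboard: Callable[[str], Dict[str, Any]],
    dropped: List[str],
) -> Iterator[Dict[str, Any]]:
    """Yield dashboard details, skipping entries with a missing or empty slug.

    All names here are hypothetical; `fetch_dashboard` stands in for a
    call such as `client.dashboard(slug)` from the traceback.
    """
    for dashboard in dashboards:
        slug = dashboard.get("slug")
        if not slug:
            # Record the dashboard as dropped and keep going instead of
            # requesting "api/dashboards/" with an empty slug.
            dropped.append(str(dashboard.get("name", dashboard.get("id"))))
            continue
        yield fetch_dashboard(slug)


def fake_fetch(slug: str) -> Dict[str, Any]:
    # Stub used only so the example runs without a Redash server.
    return {"slug": slug, "widgets": []}


listing = [
    {"id": 1, "name": "Sales", "slug": "sales"},
    {"id": 2, "name": "Broken", "slug": ""},
]
dropped: List[str] = []
details = list(iter_dashboard_details(listing, fake_fetch, dropped))
print(details)  # only the "sales" dashboard is fetched
print(dropped)  # the dashboard with the empty slug is recorded here
```

Checking the slug before the request keeps the failure local to one dashboard, so a single bad record no longer aborts the whole ingestion run.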

Eric-Xu, Jan 06 '22 03:01

Hi @Eric-Xu! Would you be willing to commit this fix back to the community?

maggiehays, Jan 06 '22 21:01

Hi @Eric-Xu Is this still an issue? If not, we will close this issue due to inactivity.

siddiquebagwan, Sep 01 '22 05:09

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

github-actions[bot], Oct 02 '22 02:10

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions[bot], Nov 02 '22 02:11