
feat: jupyter playground

Open jochumb opened this issue 1 year ago • 2 comments

This PR is a follow-up on this slack message, referencing this article.

The original goal was to make the process graph shown in the article available to the community of DevLake users. However, this couldn't be achieved within the current capabilities of DevLake: it doesn't qualify as a plugin, it requires graphviz as a container-level dependency, and it needs a way to provide the data to Grafana and show it there. We concluded that this specific visualization would be just one example of a more flexible way to explore the data. This PR is our idea of how that could be achieved: a "data playground" that uses the strengths of Python with pandas in Jupyter.
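To illustrate the kind of exploration described above, here is a minimal sketch of counting status transitions with pandas and rendering them as Graphviz DOT text. It is not the code from this PR: the `changelog` data, its column names, and the helpers `count_transitions` and `to_dot` are assumptions made up for the example, and the DOT text is built by hand so the sketch does not depend on the graphviz package.

```python
import pandas as pd

# Hypothetical changelog: one row per status change of an issue.
# The column names are illustrative assumptions, not DevLake's schema.
changelog = pd.DataFrame(
    {
        "issue_key": ["A-1", "A-1", "A-1", "A-2", "A-2"],
        "from_status": ["To Do", "In Progress", "Review", "To Do", "In Progress"],
        "to_status": ["In Progress", "Review", "Done", "In Progress", "Done"],
    }
)


def count_transitions(df: pd.DataFrame) -> pd.DataFrame:
    """Count how often each (from_status, to_status) edge occurs."""
    return (
        df.groupby(["from_status", "to_status"])
        .size()
        .reset_index(name="n")
    )


def to_dot(edges: pd.DataFrame) -> str:
    """Render the edge counts as Graphviz DOT source text."""
    lines = ["digraph transitions {"]
    for row in edges.itertuples(index=False):
        lines.append(
            f'  "{row.from_status}" -> "{row.to_status}" [label="{row.n}"];'
        )
    lines.append("}")
    return "\n".join(lines)


edges = count_transitions(changelog)
dot_source = to_dot(edges)
print(dot_source)
```

The resulting `dot_source` string can be passed to any Graphviz renderer; in a notebook it would typically be displayed inline.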

Potential next step

Using this Jupyter playground for data exploration requires a clone of the repo and a local development setup. It is also possible to add a Jupyter server as another (optional) container to the docker-compose set-up. This way the notebooks could be used in a browser.

Not yet included

  • Running the tests in a GitHub action.
  • Dynamic query for issue data for the status transition graph. Currently, it queries all the issues, not a certain scope.
  • A data plotting library. For the status transition graph we use graphviz, which is a specific requirement for that data structure, and in the template example the data is just printed in a table. We have been using plotly for visualizing data in our own Jupyter notebooks, but since we haven't included an example yet, we haven't included the dependency either.
  • Postgres support (no dependency on a client or connector has been added.)

jochumb avatar Feb 17 '24 16:02 jochumb

Is it a plugin or a feature? How can we use it? I think some additional documentation is needed.

d4x1 avatar Feb 21 '24 10:02 d4x1

@d4x1 We updated the PR with a description and also documentation within the change. I hope this clarifies our goals!

jochumb avatar Feb 22 '24 16:02 jochumb

Adding: we are proposing a change that makes it easy to filter on issue type, dates, and project key.
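As a rough sketch of what such filtering could look like in a notebook (this is not the change proposed in the PR), the example below filters a pandas DataFrame of issues. The `issues` data, its column names, the `filter_issues` helper, and the convention that the project key is the prefix of the issue key are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical issue data; column names are assumptions, not DevLake's schema.
issues = pd.DataFrame(
    {
        "issue_key": ["WEB-1", "WEB-2", "API-1", "API-2"],
        "type": ["BUG", "REQUIREMENT", "BUG", "BUG"],
        "created_date": pd.to_datetime(
            ["2024-01-05", "2024-02-10", "2024-02-12", "2023-12-20"]
        ),
    }
)


def filter_issues(df, issue_type=None, start=None, end=None, project_key=None):
    """Return the rows matching every filter that is not None."""
    mask = pd.Series(True, index=df.index)
    if issue_type is not None:
        mask &= df["type"] == issue_type
    if start is not None:
        mask &= df["created_date"] >= pd.Timestamp(start)  # inclusive start
    if end is not None:
        mask &= df["created_date"] < pd.Timestamp(end)  # exclusive end
    if project_key is not None:
        # Assumes issue keys look like "<PROJECT>-<number>".
        mask &= df["issue_key"].str.startswith(f"{project_key}-")
    return df[mask]


selected = filter_issues(
    issues, issue_type="BUG", start="2024-01-01", project_key="API"
)
print(selected)
```

Leaving a parameter as `None` skips that filter, so the same helper covers any combination of the three criteria.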

lenntt avatar Feb 26 '24 11:02 lenntt

[image] Added the following (pairing with @jochumb): a user can choose whether the avg, mean, IQR or minmax is shown by default, and all of them are shown in the tooltip.
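For readers unfamiliar with these statistics, a minimal sketch of how the mean, interquartile range (IQR), and min/max could be computed with pandas follows. The `durations` values and the `summarize` helper are made up for the example and are not taken from the PR.

```python
import pandas as pd

# Hypothetical time spent in a status, in days (made-up values).
durations = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0], name="days_in_status")


def summarize(values: pd.Series) -> dict:
    """Compute the summary statistics a tooltip could show."""
    q1 = values.quantile(0.25)  # first quartile
    q3 = values.quantile(0.75)  # third quartile
    return {
        "mean": values.mean(),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "min": values.min(),
        "max": values.max(),
    }


summary = summarize(durations)
print(summary)
```

The outlier of 10 days pulls the mean above the middle value while the IQR stays narrow, which is one reason to let the user choose the default statistic.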

lenntt avatar Feb 26 '24 13:02 lenntt

Give me some time, and I'll review this PR.

d4x1 avatar Feb 29 '24 10:02 d4x1

  1. Here is a new repo, https://github.com/apache/incubator-devlake-playground, created for this playground. You can move the code to this repo. The DevLake repo is becoming more and more complicated, and the playground is an independent part. With a standalone repo, it can be updated conveniently. (For example, DevLake is still using Python 3.9, which is outdated.)
  2. I haven't run your code locally, but I have reviewed it; you can see my comments on the code.

Thanks for your contribution. I think adding the Jupyter playground deserves a new blog post on DevLake's official website.

d4x1 avatar Mar 01 '24 11:03 d4x1

Thanks! I think moving to a separate repository makes sense. We need to keep in mind that there might be some hard coupling through the data model, perhaps when it comes to releasing and versioning.

The repo is empty. I think it needs a first commit, because we can't fork it or open a PR: [image]

lenntt avatar Mar 01 '24 11:03 lenntt

@lenntt https://github.com/apache/incubator-devlake-playground is no longer empty. (I hadn't realized that empty repos cannot be forked.)

d4x1 avatar Mar 02 '24 02:03 d4x1

Thanks, the first PR is opened: https://github.com/apache/incubator-devlake-playground/pull/1 :)

lenntt avatar Mar 04 '24 06:03 lenntt