
GitHub issues burndown chart & survival analysis

Open stas-sl opened this issue 2 years ago • 5 comments

Hello, thanks for providing this dataset!

Not sure if this is the right place to post, but I used it in my Observable notebook to analyze how long issues "live" and to create some "burndown" charts.

After playing a bit with Observable + ClickHouse, I found it to be a great combo for performing and sharing such explorations.

Here are some charts for the ClickHouse repo:

[two chart images]

stas-sl avatar Apr 17 '22 00:04 stas-sl

Thank you! The charts are amazing and the showcase of using Observable is what we need! PS. It's sad that I often miss notifications from GitHub, so I've noticed this issue only today.

alexey-milovidov avatar May 08 '22 20:05 alexey-milovidov

FYI, I created an issue after reading the previous Observable notebook: https://github.com/ClickHouse/ClickHouse/issues/37024

alexey-milovidov avatar May 08 '22 22:05 alexey-milovidov

Great, thanks! I was wondering why it wasn't working.

stas-sl avatar May 08 '22 22:05 stas-sl

@stas-sl - This is super amazing.

If you are interested, I'd love to hear a bit more about your experience. Would also be super excited if you are interested in writing a blog post about the integration on clickhouse.com!

tylerhannan avatar May 09 '22 10:05 tylerhannan

Thanks, @tylerhannan!

I would say my experience was pretty enjoyable. I liked working with Observable and ClickHouse; they are both great and promising products. I came to Observable having previously worked with Jupyter Notebooks, so it took some time to adapt to its reactivity model, where cells are not executed from top to bottom. Instead, there is a dependency graph of cells, and whenever a parent cell changes, all its descendants are re-evaluated.

Of course, there are obvious pros and cons compared to Jupyter Notebooks. The cons are that you don't have access to the large number of Python data-science packages like pandas/numpy/scikit-learn. The pros are that you don't need a running server, you can start exploring/prototyping in your browser with just a few clicks, and it is much easier to add interactivity to your notebooks. I also like that it is easy to share your work and explore the work of others, or to reuse what others have done by forking/importing their notebooks.

Regarding integration with ClickHouse, I created a basic client wrapper over the HTTP interface, which allowed me to create SQL cells inside Observable notebooks. I tried to use existing Node.js libraries like this one, but I was unable to make them work in the browser. If you are aware of other JS client libraries that could work there, I would be interested to know.
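For illustration only, a wrapper of this kind might look roughly like the sketch below. It is not the code from the notebook: the names `ClickHouseOptions`, `buildUrl`, and `queryClickHouse` are hypothetical, and the endpoint URL and user are placeholders. It assumes the server accepts the SQL text as a POST body and honours the `default_format`, `user`, and `password` URL parameters of the ClickHouse HTTP interface. The fetch function is passed in so the same code can run in a browser or be exercised with a stub.

```typescript
// Hypothetical sketch of a browser-friendly ClickHouse client over HTTP.
// None of these names come from the original notebook.

interface ClickHouseOptions {
  url: string;       // base URL of the HTTP endpoint (placeholder value in usage)
  user?: string;     // optional user name, sent as a URL parameter
  password?: string; // optional password, sent as a URL parameter
}

// Minimal shape of the fetch function and response that the wrapper relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; body: string }
) => Promise<ResponseLike>;

// Build the request URL; default_format=JSON asks for a JSON result set.
function buildUrl(opts: ClickHouseOptions): string {
  const params: string[] = ["default_format=JSON"];
  if (opts.user !== undefined) {
    params.push("user=" + encodeURIComponent(opts.user));
  }
  if (opts.password !== undefined) {
    params.push("password=" + encodeURIComponent(opts.password));
  }
  const separator = opts.url.indexOf("?") === -1 ? "?" : "&";
  return opts.url + separator + params.join("&");
}

// Send the SQL as the POST body and return the rows from the "data" field
// of the JSON response.
async function queryClickHouse(
  sql: string,
  opts: ClickHouseOptions,
  fetchImpl: FetchLike
): Promise<Array<Record<string, unknown>>> {
  const response = await fetchImpl(buildUrl(opts), { method: "POST", body: sql });
  if (!response.ok) {
    const message = await response.text();
    throw new Error("ClickHouse HTTP error " + response.status + ": " + message);
  }
  const result = (await response.json()) as {
    data: Array<Record<string, unknown>>;
  };
  return result.data;
}
```

In a notebook cell this could be called as `queryClickHouse("SELECT 1 AS x", { url: "https://clickhouse.example.invalid/", user: "demo" }, fetch)`, where the URL and user are placeholders for a real endpoint.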

Observable also has out-of-the-box support for some common databases like Postgres, MySQL, SQLite, BigQuery, and Snowflake. I believe that if there were a good JS client and enough demand/motivation, they could easily add ClickHouse to this list. That would be great!

One issue I faced that should be mentioned, as it is quite important, is caching results in public/shared notebooks. The immediate downside of this kind of reactivity model is that whenever you open a notebook, all its cells are re-evaluated, including those that query the database. If you have many queries, this can cause a mini DDoS attack on your server, so you have to be cautious. I used an ad-hoc workaround: caching query results in file attachments, with a checkbox to switch to live queries. While it solved the issue, it required more manual work than I'd like. These are known issues (https://github.com/observablehq/feedback/issues/381, https://github.com/observablehq/feedback/issues/58, https://github.com/observablehq/feedback/issues/175) and I hope they will be addressed in the future.
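The shape of that workaround can be sketched as follows. This is an assumption about how such a switch might be structured, not the notebook's actual code: `chooseSource`, `Loader`, and `QuerySource` are hypothetical names. The point is that both loaders are passed as functions, so the database is only contacted when the live option is selected and the result is actually loaded.

```typescript
// Hypothetical sketch: pick a cached snapshot or a live query based on a flag.

type Row = Record<string, unknown>;
type Loader = () => Promise<Row[]>;

interface QuerySource {
  kind: "cached" | "live";
  load: Loader;
}

// Neither loader is invoked here; the caller decides when to call load().
// With live === false, the live query function is never called at all.
function chooseSource(live: boolean, loadCached: Loader, runLive: Loader): QuerySource {
  if (live) {
    return { kind: "live", load: runLive };
  }
  return { kind: "cached", load: loadCached };
}
```

In Observable, the flag could come from a checkbox-style input such as `Inputs.toggle`, and the cached loader could read a snapshot with something like `FileAttachment("results.json").json()`, where the file name is a placeholder. The snapshot still has to be regenerated and re-attached by hand, which is the manual work mentioned above.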

stas-sl avatar May 09 '22 13:05 stas-sl