jgit-spark-connector [feature-request] create an engine playground

The best way to get people to try your technology is to reduce the time to the first "whoa" moment. To that end, @eiso created a Dockerfile that lets you run the engine in a very straightforward way. The only problem I see is that it requires people to install Docker and download a pretty large image.

I'd like to create an Engine Playground (à la play.golang.org) that will provide shells into an engine instance running on one of our projects.

advantages:

reduced time and friction for users to try our technology
hides complexity of setup until it's necessary
more inclusive to non-devops
can easily track impact (visitors, time, etc.)

concerns:

abuse: can it be used for bitcoin mining?
security: can it be used to access resources that should be secret?
privacy: could people somehow leave PII we don't want?
others: could people upload illegal content to these servers?
money: we need to pay for this, obviously

Nov 21 '17 13:11 campoy

We're halfway there. Right now @rporres is able to provision these instances, and we use them (they are based on the current Dockerfile at the root of this repo). However, we manually provision them for people who request them, see here

Since they are provisioned on demand (we use a new GCP instance per user), we would need a small frontend with automated provisioning, like Docker.com has for their trial (should be relatively straightforward to do):

[image: screenshot-2017-11-21-16 45 25]

> I'd like to create an Engine Playground (à la play.golang.org) that will provide shells into an engine instance running on one of our projects.

Do you want to make something fancier, or use the Jupyter notebooks?

/cc @marnovo @mcuadros

Nov 21 '17 15:11 eiso

Another alternative is to keep cached data for a (limited) set of queries. This way, we could 'mock' an engine environment.
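A minimal sketch of what such a mock could look like, assuming queries are treated as opaque strings and results were recorded ahead of time from a real engine run. The `MockEngine` class, the `normalize` helper, and the sample query and rows are all hypothetical and are not part of the engine's API:

```python
from typing import Dict, List, Optional

# A result row; the shape is an assumption made for illustration.
Row = Dict[str, object]


def normalize(query: str) -> str:
    """Collapse runs of whitespace so trivially different spellings share a key."""
    return " ".join(query.split())


class MockEngine:
    """Serves pre-recorded results for a fixed, limited set of queries."""

    def __init__(self, cached: Dict[str, List[Row]]) -> None:
        self._cached = {normalize(q): rows for q, rows in cached.items()}

    def supported_queries(self) -> List[str]:
        """List the queries a visitor can try, e.g. to show them in the UI."""
        return sorted(self._cached)

    def run(self, query: str) -> Optional[List[Row]]:
        """Return the cached rows, or None when the query was never recorded."""
        return self._cached.get(normalize(query))


# Made-up cache content; real entries would come from an actual engine run.
engine = MockEngine({"count   commits": [{"count": 3}]})
print(engine.run("count commits"))
print(engine.run("something else"))
```

The trade-off is that visitors can only run the recorded queries, so the playground would need to list them up front rather than offer a free-form shell.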

Nov 21 '17 15:11 eiso

To learn more about how @src-d/infrastructure does it today: https://github.com/src-d/infrastructure/tree/master/engine-jupyter-demo

Nov 21 '17 17:11 eiso

I doubt the current approach can scale properly... Most likely we would need to use something like JupyterHub to manage a multiuser setup...

Nov 21 '17 17:11 rporres

From what I understand about Spark, we'd also need to move away from Derby (https://github.com/src-d/engine/issues/192) to allow multiple Spark sessions, and we'd need heavy caching across all sessions.

Nov 21 '17 17:11 eiso

A hosted playground is a nice idea, especially as a simple Jupyter UI with Python or Scala can be exposed.

BTW if we are using a container-per-user model, why would multiple Spark sessions be needed?

Nov 22 '17 15:11 bzz

@bzz because of @rporres's comment about how feasible it is to scale with individual instances (not containers) per user.

@rporres since they are temporary containers, could we launch a container per user (not an instance) and have it time out after x hours of idleness? Just a simple bash script that checks the logs of the container and kills the container if a certain command was not used for x hours?
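A rough sketch of that idle check, written in Python rather than bash so the selection logic can be tested on its own. It assumes the last-use time of each container has already been extracted somehow (for example by parsing its logs, which is not shown). The names `idle_containers`, `reap`, and `docker_kill` are hypothetical; the only external command used is `docker kill`:

```python
import subprocess
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def idle_containers(
    last_used: Dict[str, datetime], now: datetime, max_idle: timedelta
) -> List[str]:
    """Return the names of containers idle for longer than max_idle."""
    return sorted(name for name, ts in last_used.items() if now - ts > max_idle)


def docker_kill(name: str) -> None:
    """Kill one container through the Docker CLI."""
    subprocess.run(["docker", "kill", name], check=True)


def reap(
    last_used: Dict[str, datetime],
    now: datetime,
    max_idle: timedelta,
    kill: Callable[[str], None] = docker_kill,
) -> List[str]:
    """Kill every idle container and return the names that were killed."""
    victims = idle_containers(last_used, now, max_idle)
    for name in victims:
        kill(name)
    return victims


# Dry run with made-up container names and a fake kill function.
now = datetime(2017, 11, 22, 17, 0)
last_used = {
    "user-a": now - timedelta(hours=5),
    "user-b": now - timedelta(minutes=10),
}
killed: List[str] = []
print(reap(last_used, now, timedelta(hours=2), kill=killed.append))
```

Run periodically (e.g. from cron) with the default `kill`, this would stop containers that passed the idle limit; the x-hour threshold is left as a parameter.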

Nov 22 '17 17:11 eiso

Is this still being worked on or can we close the issue?

Mar 20 '18 15:03 erizocosmico

This will open up again once gitbase is ready. Right now it's on pause until performance is where it needs to be.

Mar 21 '18 12:03 eiso
