datastudio
Data science, machine learning tools on the cloud
Open data studio
Open data studio is an open initiative to put open-source machine learning and large-scale data processing software a click away for everyone.
Documentation
Please visit open-datastudio.io
Projects
| Component | Project | Description | Integration Status |
|---|---|---|---|
| Notebook | jupyter | Jupyter Lab | Integrated |
| | zeppelin | Apache Zeppelin, integrated with Apache Spark in Kubernetes mode | Integrated |
| Data Lake | hive-metastore | Hive Metastore server with a PostgreSQL database | Integrated |
| | spark-thriftserver | Spark cluster on Kubernetes for ODBC/JDBC connections | Integrated |
| Computing | ray-cluster | Ray cluster | Integrated |
| | spark-serverless | On-demand Spark cluster from anywhere | Integrated |
| Machine learning | mlflow-server | MLflow remote tracking server and UI | Integrated |
| | mlflow-model-serving | Deploy models from mlflow-server and get an endpoint | Integrated |
| Business Intelligence | metabase | Metabase Business Intelligence | Integrated |
| | superset | Apache Superset Business Intelligence | Integrated |
| Misc | spark | Not integrated with Staroid; publishes a Docker image used by other projects | - |
How to contribute?
To contribute, create issues or pull requests in the individual repositories under open-datastudio.
If you'd like to create a new integration project here, please create an issue in this repository.
We need your help!
Community
- Open data studio slack channel - Join
License
Open data studio is an open source project. A LICENSE file is included in each repository.