paraflow
A real-time analytical system for ID-associated data
ParaFlow
ParaFlow is an interactive analysis system for OLAP developed at DBIIR Lab @ RUC.
Install & Deploy
Hadoop
The Hadoop file system (HDFS) is required.
Zookeeper-3.4.13
Required by Kafka. Deployment only involves configuring the cluster IP and port.
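As a rough illustration of what configuring the cluster IP and port can look like, the sketch below writes a minimal zoo.cfg for a three-node ensemble. The host names (zk1, zk2, zk3), the data directory, and the output directory are placeholders, not values taken from ParaFlow; client port 2181 and peer ports 2888/3888 are simply ZooKeeper's conventional defaults.

```shell
#!/bin/sh
# Sketch: generate a minimal zoo.cfg for a three-node ZooKeeper ensemble.
# ZK_CONF_DIR, ZK_DATA_DIR and the zk1..zk3 host names are placeholders.
ZK_CONF_DIR="${ZK_CONF_DIR:-./zk-conf}"
ZK_DATA_DIR="${ZK_DATA_DIR:-/var/lib/zookeeper}"
ZK_HOSTS="${ZK_HOSTS:-zk1 zk2 zk3}"

mkdir -p "$ZK_CONF_DIR"
{
  echo "tickTime=2000"
  echo "initLimit=10"
  echo "syncLimit=5"
  echo "dataDir=$ZK_DATA_DIR"
  echo "clientPort=2181"
  # One server.N line per node: host:peerPort:leaderElectionPort
  id=1
  for host in $ZK_HOSTS; do
    echo "server.$id=$host:2888:3888"
    id=$((id + 1))
  done
} > "$ZK_CONF_DIR/zoo.cfg"

echo "wrote $ZK_CONF_DIR/zoo.cfg"
```

Each node additionally needs a myid file in its dataDir containing its own N from the matching server.N line. Kafka would then point at the ensemble through zookeeper.connect in its server.properties, for example zk1:2181,zk2:2181,zk3:2181 with the placeholder hosts above.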
Kafka-2.11_1.11
PostgreSQL-9.5
Presto-0.192
Paraflow
- MetaServer (one node)
- Loader [cn.edu.ruc.iir.paraflow.example.loader.BasicLoader]
Configure ./paraflow-loader.sh, then run:
./sbin/paraflow-loader.sh deploy
- Collector [cn.edu.ruc.iir.paraflow.example.loader.BasicCollector]
Configure ./paraflow-collector.sh, then run:
./sbin/paraflow-collector.sh deploy
- Presto connector
Configuration
Initialization
- Create a user and a database in PostgreSQL for metadata.
CREATE USER paraflow WITH PASSWORD 'paraflow';
CREATE DATABASE paraflowmeta;
GRANT ALL ON DATABASE paraflowmeta TO paraflow;
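The initialization statements can also be scripted. The following is a minimal sketch, assuming a local PostgreSQL whose superuser is named postgres and that psql is on the PATH. The variables PG_SUPERUSER and APPLY are hypothetical knobs for this example, not ParaFlow settings. By default the script only prints the SQL; it is applied only when APPLY=1.

```shell
#!/bin/sh
# Sketch: emit the metadata-initialization SQL and optionally apply it.
# PG_SUPERUSER and APPLY are hypothetical knobs for this example only.
PG_SUPERUSER="${PG_SUPERUSER:-postgres}"
META_USER="${META_USER:-paraflow}"
META_PASSWORD="${META_PASSWORD:-paraflow}"
META_DB="${META_DB:-paraflowmeta}"

# Print the three statements from the Initialization step.
init_sql() {
  cat <<EOF
CREATE USER $META_USER WITH PASSWORD '$META_PASSWORD';
CREATE DATABASE $META_DB;
GRANT ALL ON DATABASE $META_DB TO $META_USER;
EOF
}

if [ "${APPLY:-0}" = "1" ]; then
  # ON_ERROR_STOP makes psql exit non-zero on the first failed statement.
  init_sql | psql -U "$PG_SUPERUSER" -d postgres -v ON_ERROR_STOP=1
else
  init_sql
fi
```

Running it with APPLY=1 a second time fails at CREATE USER because the role already exists, so it is meant as a one-time setup step.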
Startup
- Start Zookeeper cluster
- Start Kafka
- Start PostgreSQL
- Start Paraflow MetaServer
./bin/paraflow-metaserver-start.sh [-daemon]
- Start Paraflow Loader
./sbin/paraflow-loader.sh start
- Start Paraflow Collector
./sbin/paraflow-collector.sh start
- Start a Presto cluster, or a single Presto node, to execute queries.
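The startup order above can be captured in a small wrapper. This is only a sketch for a single-machine setup: every installation path (ZK_HOME, KAFKA_HOME, PG_DATA, PARAFLOW_HOME, PRESTO_HOME) is an assumed placeholder, and the ZooKeeper, Kafka, PostgreSQL and Presto commands are the stock launch commands of those projects rather than anything ParaFlow ships. DRY_RUN defaults to 1, so the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: run the startup steps in dependency order.
# All *_HOME paths and PG_DATA are placeholders; DRY_RUN=1 (default) only prints.
ZK_HOME="${ZK_HOME:-/opt/zookeeper}"
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
PG_DATA="${PG_DATA:-/var/lib/postgresql/9.5/main}"
PARAFLOW_HOME="${PARAFLOW_HOME:-/opt/paraflow}"
PRESTO_HOME="${PRESTO_HOME:-/opt/presto}"
DRY_RUN="${DRY_RUN:-1}"

# One command per line, in the order given in the Startup list.
startup_steps() {
  cat <<EOF
$ZK_HOME/bin/zkServer.sh start
$KAFKA_HOME/bin/kafka-server-start.sh -daemon $KAFKA_HOME/config/server.properties
pg_ctl -D $PG_DATA start
$PARAFLOW_HOME/bin/paraflow-metaserver-start.sh -daemon
$PARAFLOW_HOME/sbin/paraflow-loader.sh start
$PARAFLOW_HOME/sbin/paraflow-collector.sh start
$PRESTO_HOME/bin/launcher start
EOF
}

startup_steps | while IFS= read -r cmd; do
  echo "+ $cmd"
  if [ "$DRY_RUN" != "1" ]; then
    # Leave the loop at the first failure so later services are not started
    # without their dependencies; stdin is detached so the command cannot
    # consume the remaining steps from the pipe.
    sh -c "$cmd" </dev/null || { echo "failed: $cmd" >&2; exit 1; }
  fi
done
```

Set DRY_RUN=0 to actually launch the services. The sketch does not wait for a service to become ready before starting the next one, so a real deployment may need a readiness check between steps, and on a multi-node cluster each command would run on the node that hosts that component.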