GTAS
Neo4j, ETL Configuration, and Job Scheduler installation instructions
How to install the Neo4j Component for GTAS Link Analysis
https://youtu.be/cDXOYdAVTHc
Configuring the ETL job
sudo mkdir -p /gtas-neo4j-etl/{config,job,log,job/temp}
sudo chown -R gtas-admin /gtas-neo4j-etl
sudo chmod -R 755 /gtas-neo4j-etl
cp /opt/GTAS/gtas-neo4j-etl/job/*.ktr /gtas-neo4j-etl/job
cp /opt/GTAS/gtas-neo4j-etl/job/*.kjb /gtas-neo4j-etl/job
cp -r /opt/GTAS/gtas-neo4j-etl/config/. /gtas-neo4j-etl/config
sudo chown -R gtas-admin /gtas-neo4j-etl/
sudo chmod -R 755 /gtas-neo4j-etl/
Edit gtas-neo4j-config.properties
vim /gtas-neo4j-etl/config/gtas-neo4j-config.properties
Update the config to reflect the values below
...
EXT_VAR_GTAS_DB_USER_NAME=root
EXT_VAR_GTAS_DB_PASSWORD=admin
...
EXT_VAR_NEO4J_DB_USER_NAME=neo4j
EXT_VAR_NEO4J_DB_PASSWORD=admin
...
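A quick way to confirm the edited file actually carries the four credentials shown above is to check each key for a non-empty value. The sketch below is only an illustration: check_props is a hypothetical helper name, not part of GTAS, and it assumes the file uses plain KEY=value lines.

```shell
# check_props FILE KEY... : report every KEY that has no "KEY=value" line in FILE.
# (check_props is a hypothetical helper, not something shipped with GTAS.)
check_props() {
  file=$1
  shift
  missing=0
  for key in "$@"; do
    # Require the key at the start of a line, followed by "=" and at least one character.
    if ! grep -q "^${key}=." "$file"; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_props /gtas-neo4j-etl/config/gtas-neo4j-config.properties EXT_VAR_GTAS_DB_USER_NAME EXT_VAR_GTAS_DB_PASSWORD EXT_VAR_NEO4J_DB_USER_NAME EXT_VAR_NEO4J_DB_PASSWORD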
Install and configure Neo4j
sudo chmod -R u+rw /opt
cd /opt
sudo wget http://dist.neo4j.org/neo4j-community-3.5.3-unix.tar.gz
sudo tar -xzf neo4j-community-3.5.3-unix.tar.gz
sudo chown -R gtas-admin /opt/neo4j-community-3.5.3
sudo chmod -R 755 /opt/neo4j-community-3.5.3
Edit the config file
vim /opt/neo4j-community-3.5.3/conf/neo4j.conf
Update the config to reflect the values below
line 9 dbms.active_database=gtas.db
line 26 dbms.security.auth_enabled=true
line 62 dbms.connectors.default_advertised_address=localhost
line 71 dbms.connector.bolt.listen_address=:7687
line 75 dbms.connector.http.listen_address=:7474
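The line numbers above may not match every copy of neo4j.conf, so it can be easier to locate each setting by its key. The following is a minimal sketch; show_setting is a hypothetical helper name, and it assumes a setting is either active ("key=value") or commented out with a single leading "#".

```shell
# show_setting FILE KEY : print "LINE_NUMBER: text" for every line that sets KEY,
# whether the line is active or commented out with a leading "#".
# (show_setting is a hypothetical helper, not part of Neo4j.)
show_setting() {
  awk -v key="$2" '
    index($0, key "=") == 1 || index($0, "#" key "=") == 1 { print NR ": " $0 }
  ' "$1"
}
```

For example: show_setting /opt/neo4j-community-3.5.3/conf/neo4j.conf dbms.active_database. A line that still starts with "#" after editing has not taken effect.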
Install Pentaho ETL tool
sudo mkdir -p /opt/pentaho
sudo chown -R gtas-admin /opt/pentaho
sudo chmod -R 755 /opt/pentaho
cd /opt/pentaho
wget https://s3.amazonaws.com/kettle-neo4j/kettle-neo4j-remix-8.2.0.3-519-REMIX.zip
unzip kettle-neo4j-remix-8.2.0.3-519-REMIX.zip -d /opt/pentaho
cp /opt/GTAS/gtas-neo4j-etl/drivers/mariadb-java-client-2.2.1.jar /opt/pentaho/data-integration/lib
sudo chown -R gtas-admin /opt/pentaho
sudo chmod -R 755 /opt/pentaho
cp -r /opt/GTAS/gtas-neo4j-etl/pdi-conf/. ~/
chown -R gtas-admin ~/.pentaho
chmod -R 755 ~/.pentaho
Install the ETL Job Scheduler
cd /opt/GTAS/gtas-neo4j-scheduler
sudo chown -R gtas-admin /opt/GTAS
sudo chmod -R 755 /opt/GTAS
mvn clean install -Dskip.unit.tests=true
cp ./target/gtas-neo4j-job-scheduler-1.jar /gtas-neo4j-etl
chmod 755 /gtas-neo4j-etl/gtas-neo4j-job-scheduler-1.jar
Add required views to the database
Log into MariaDB
use gtas;
source /opt/GTAS/gtas-neo4j-etl/sql/neo4j_hit_vw.sql
source /opt/GTAS/gtas-neo4j-etl/sql/neo4j_vw.sql
quit
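After sourcing the scripts, it may be worth checking that the views exist and return rows, since an empty view leaves the ETL job with nothing to load. The sketch below only generates the SQL; row_count_sql is a hypothetical helper name, and the view names you pass in are assumptions until you confirm them against the SQL scripts.

```shell
# row_count_sql VIEW... : print one row-count query per view name given.
# (row_count_sql is a hypothetical helper; pass the view names defined by the scripts above.)
row_count_sql() {
  for view in "$@"; do
    printf "SELECT '%s' AS view_name, COUNT(*) AS row_count FROM %s;\n" "$view" "$view"
  done
}
```

One way to run it, assuming the mysql command-line client and the root account from the config above: row_count_sql neo4j_vw | mysql -u root -p gtas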
Start Neo4j and ETL Job Scheduler
/opt/neo4j-community-3.5.3/bin/neo4j start
Open the Neo4j UI in a web browser (http://localhost:7474/) and update the password to 'admin'
cd /gtas-neo4j-etl
java -jar gtas-neo4j-job-scheduler-1.jar
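The scheduler's own output reports only the exit value of each ETL run; the Pentaho job output itself appears to be appended to a date-stamped file such as /gtas-neo4j-etl/log/gtas-neo4j_20200810.log (the path the scheduler prints in its COMMAND LINE entry). The sketch below finds the newest such file; latest_etl_log is a hypothetical helper name, and it assumes the gtas-neo4j_YYYYMMDD.log naming holds on your install.

```shell
# latest_etl_log DIR : print the newest gtas-neo4j_YYYYMMDD.log file in DIR.
# Glob results are sorted by name, so with date-stamped names the last match is the newest.
# (latest_etl_log is a hypothetical helper, not part of GTAS.)
latest_etl_log() {
  latest=
  for f in "$1"/gtas-neo4j_*.log; do
    # Skip the unexpanded pattern when no file matches.
    if [ -e "$f" ]; then latest=$f; fi
  done
  if [ -z "$latest" ]; then return 1; fi
  printf '%s\n' "$latest"
}
```

For example: tail -n 50 "$(latest_etl_log /gtas-neo4j-etl/log)"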
I installed everything successfully, but I'm getting "There are no labels in Database" under Node Labels and "There are no properties in Database" under Property Keys. My database is running well. Please help me.
Hi @pradyuman98. Which branch / release are you using? If you can post a server log for the ETL job, that would be helpful as well. Please double-check that the neo4j_vw table was imported properly into your MariaDB. Thank you.
The neo4j_vw table was created successfully, but no data is populating in it. It's showing "Empty set".
Logs:
[root@localhost ~]# /opt/neo4j-community-3.5.3/bin/neo4j start
Active database: gtas.db
Directories in use:
  home:         /opt/neo4j-community-3.5.3
  config:       /opt/neo4j-community-3.5.3/conf
  logs:         /opt/neo4j-community-3.5.3/logs
  plugins:      /opt/neo4j-community-3.5.3/plugins
  import:       /opt/neo4j-community-3.5.3/import
  data:         /opt/neo4j-community-3.5.3/data
  certificates: /opt/neo4j-community-3.5.3/certificates
  run:          /opt/neo4j-community-3.5.3/run
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Started neo4j (pid 3523). It is available at http://localhost:7474/
There may be a short delay until the server is ready.
See /opt/neo4j-community-3.5.3/logs/neo4j.log for current status.
[root@localhost ~]#
[root@localhost ~]#
[root@localhost ~]# cd /gtas-neo4j-etl
[root@localhost gtas-neo4j-etl]# java -jar gtas-neo4j-job-scheduler-1.jar
:: Spring Boot :: (v2.0.5.RELEASE)
2020-08-10 23:06:36.368 INFO 3615 --- [ main] g.g.s.GtasNeo4jJobSchedulerApplication : Starting GtasNeo4jJobSchedulerApplication v1 on localhost.localdomain with PID 3615 (/gtas-neo4j-etl/gtas-neo4j-job-scheduler-1.jar started by root in /gtas-neo4j-etl)
2020-08-10 23:06:36.477 INFO 3615 --- [ main] g.g.s.GtasNeo4jJobSchedulerApplication : No active profile set, falling back to default profiles: default
2020-08-10 23:06:38.474 INFO 3615 --- [ main] s.c.a.AnnotationConfigApplicationContext : Refreshing org.springframework.context.annotation.AnnotationConfigApplicationContext@1996cd68: startup date [Mon Aug 10 23:06:38 EDT 2020]; root of context hierarchy
2020-08-10 23:06:43.826 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : --------SCHEDULER PROPERTIES FROM PROPERTIES FILE -----
2020-08-10 23:06:43.827 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - execInterval: 60
2020-08-10 23:06:43.827 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - opSystem: linux
2020-08-10 23:06:43.827 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - pdiDir: /opt/pentaho/data-integration/./kitchen.sh
2020-08-10 23:06:43.828 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - jobDir: /gtas-neo4j-etl/job/gtas-to-neo-job.kjb
2020-08-10 23:06:43.828 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - logLevel: Minimal
2020-08-10 23:06:43.828 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - logDir: /gtas-neo4j-etl/log/gtas-neo4j
2020-08-10 23:06:43.828 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - configFilePropertyName: EXT_ETL_CONFIG_FILE
2020-08-10 23:06:43.829 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : - configFile: /gtas-neo4j-etl/config/gtas-neo4j-config.properties
2020-08-10 23:06:43.829 INFO 3615 --- [ main] gov.gtas.scheduler.Neo4jScheduledTasks : ----------------------------------
2020-08-10 23:06:45.301 INFO 3615 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup
2020-08-10 23:06:45.498 INFO 3615 --- [ main] s.a.ScheduledAnnotationBeanPostProcessor : No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
2020-08-10 23:06:45.818 INFO 3615 --- [ main] g.g.s.GtasNeo4jJobSchedulerApplication : Started GtasNeo4jJobSchedulerApplication in 14.517 seconds (JVM running for 19.498)
2020-08-10 23:06:45.823 INFO 3615 --- [ main] g.g.s.GtasNeo4jJobSchedulerApplication : THE GTAS-NEO4J JOB SCHEDULER IS STARTING......
2020-08-10 23:06:45.657 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.Neo4jScheduledTasks : Starting the thread to execute the PDI job ....
2020-08-10 23:06:45.831 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : COMMAND LINE: /opt/pentaho/data-integration/./kitchen.sh -file='/gtas-neo4j-etl/job/gtas-to-neo-job.kjb' -param:EXT_ETL_CONFIG_FILE='/gtas-neo4j-etl/config/gtas-neo4j-config.properties' -level=Minimal >> /gtas-neo4j-etl/log/gtas-neo4j_20200810.log
2020-08-10 23:06:45.831 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : *** LAUNCHING PDI ETL JOB FROM SCHEDULER ****
2020-08-10 23:08:19.606 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : *** END OF ETL JOB FROM SCHEDULER .....EXIT VALUE = 0
2020-08-10 23:09:19.610 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.Neo4jScheduledTasks : Starting the thread to execute the PDI job ....
2020-08-10 23:09:19.611 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : COMMAND LINE: /opt/pentaho/data-integration/./kitchen.sh -file='/gtas-neo4j-etl/job/gtas-to-neo-job.kjb' -param:EXT_ETL_CONFIG_FILE='/gtas-neo4j-etl/config/gtas-neo4j-config.properties' -level=Minimal >> /gtas-neo4j-etl/log/gtas-neo4j_20200810.log
2020-08-10 23:09:19.612 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : *** LAUNCHING PDI ETL JOB FROM SCHEDULER ****
2020-08-10 23:10:20.348 INFO 3615 --- [pool-2-thread-1] gov.gtas.scheduler.thread.RunnableTask : *** END OF ETL JOB FROM SCHEDULER .....EXIT VALUE = 0
^Z
[1]+ Stopped java -jar gtas-neo4j-job-scheduler-1.jar
Please Help!!!!!!!