Pathling on Spark cluster hangs
Hi! I'm trying to use a Spark cluster, but I could not make it work. This is my docker-compose.yaml:
version: "3.8"
services:
  pathling:
    image: aehrc/pathling
    container_name: pathling_server
    ports:
      - "9090:9090"
    environment:
      spark.master: spark://spark-master:7077
      server.port: 9090
      pathling.terminology.enabled: true
      spark.executor.memory: 1g
      pathling.storage.databaseName: test
      # pathling.terminology.verboseLogging: true
      pathling.terminology.cache.defaultExpiry: 3600
      # pathling.terminology.serverUrl: http://localhost:9191/fhir
      pathling.terminology.acceptLanguage: es
      JAVA_TOOL_OPTIONS: >
        -Xmx8g -XX:MaxMetaspaceSize=400m -XX:ReservedCodeCacheSize=240m -Xss1m
        -Duser.timezone=UTC --add-exports=java.base/sun.nio.ch=ALL-UNNAMED
        --add-opens=java.base/java.net=ALL-UNNAMED
    volumes:
      - ./.data:/usr/share/staging
      - ./.warehouse:/usr/share/warehouse
  spark-master:
    image: bitnami/spark:3.4.1
    container_name: spark-master
    hostname: spark-master
    ports:
      - "8280:8080" # Spark Master UI
      - "7077:7077" # Connection port for workers
    environment:
      - SPARK_MODE=master
  spark-worker-1:
    image: bitnami/spark:3.4.1
    container_name: spark-worker-1
    depends_on:
      - spark-master
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
      - SPARK_WORKER_MEMORY=8g
    ports:
      - "8081:8081" # Worker 1 UI
  spark-worker-2:
    image: bitnami/spark:3.4.1
    container_name: spark-worker-2
    depends_on:
      - spark-master
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
      - SPARK_WORKER_MEMORY=8g
    ports:
      - "8082:8081" # Worker 2 UI
Pathling logs
15:37:57.148 [main] [] INFO au.csiro.pathling.PathlingServer - Starting PathlingServer using Java 17.0.11 with PID 1 (/app/classes started by root in /)
15:37:57.151 [main] [] INFO au.csiro.pathling.PathlingServer - The following 2 profiles are active: "core", "server"
15:38:02.514 [main] [] WARN o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:38:05.682 [main] [] WARN o.a.s.s.c.a.SimpleFunctionRegistry - The function date_diff replaced a previously registered function.
15:38:05.892 [main] [] INFO a.c.p.i.CacheableFileSystemPersistence - Querying latest snapshot from database: file:///usr/share/warehouse/test
15:38:06.085 [main] [] INFO au.csiro.pathling.PathlingVersion - Pathling build version: 7.0.1+39bc091
15:38:11.911 [main] [] INFO au.csiro.pathling.fhir.FhirServer - FHIR server initialized
15:38:12.516 [main] [] INFO au.csiro.pathling.PathlingServer - Started PathlingServer in 16.109 seconds (process running for 17.101)
15:38:31.229 [qtp1558397083-95] [rssyWRJu1ppJAYmx] INFO a.c.pathling.update.ImportExecutor - Received $import request
15:38:35.475 [qtp1558397083-95] [rssyWRJu1ppJAYmx] INFO a.c.pathling.update.ImportExecutor - Importing Claim resources (mode: overwrite)
15:38:39.752 [dag-scheduler-event-loop] [] WARN o.a.spark.scheduler.DAGScheduler - Broadcasting large task binary with size 1009.9 KiB
Spark logs
25/02/03 15:37:33 WARN Master: Got status update for unknown executor app-20250203153727-0001/0
25/02/03 15:37:33 WARN Master: Got status update for unknown executor app-20250203153727-0001/1
25/02/03 15:38:03 INFO Master: Registering app pathling
25/02/03 15:38:03 INFO Master: Registered app pathling with ID app-20250203153803-0002
25/02/03 15:38:03 INFO Master: Start scheduling for app app-20250203153803-0002 with rpId: 0
25/02/03 15:38:03 INFO Master: Launching executor app-20250203153803-0002/0 on worker worker-20250203150828-172.22.0.4-41013
25/02/03 15:38:03 INFO Master: Launching executor app-20250203153803-0002/1 on worker worker-20250203153632-172.22.0.5-37863
25/02/03 15:38:03 INFO Master: Start scheduling for app app-20250203153803-0002 with rpId: 0
25/02/03 15:38:03 INFO Master: Start scheduling for app app-20250203153803-0002 with rpId: 0
25/02/03 15:38:23 WARN Master: Got status update for unknown executor app-20250203153231-0001/0
Nothing happens after I make an $import request. How can I get more logs? Or is there any Pathling/Spark configuration that I'm missing?
Hi @liquid36,
So I think what you are trying to achieve is a Pathling server API that farms out the work to a cluster behind the scenes?
The server image can be used as the server, master and worker - it contains Spark plus the Pathling dependencies. So you probably want a server container (using the aehrc/pathling image), and also a number of workers using the same image but configured to launch Spark in worker mode and point to the server/master container.
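Roughly, a worker in your Compose file could reuse the server image along these lines. This is an untested sketch: the path to spark-class inside the aehrc/pathling image is an assumption and may differ between versions.

spark-worker-1:
  image: aehrc/pathling
  # Hypothetical override: start the bundled Spark in standalone worker mode
  # instead of the Pathling server, so executors share the server's classpath.
  # /opt/spark is a guess at the Spark install location within the image.
  command: >
    /opt/spark/bin/spark-class
    org.apache.spark.deploy.worker.Worker
    spark://spark-master:7077
  environment:
    SPARK_WORKER_MEMORY: 8g

The key point is that the executors then run with exactly the same Spark version and Pathling dependencies as the driver, which avoids the classpath mismatches you can get when mixing the aehrc/pathling image with stock bitnami/spark workers.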
It looks like you are trying to do this in Docker Compose. This is totally possible, but if you are open to using Kubernetes (which can be just as easy to set up locally using Minikube or Docker Desktop), there is a pre-made solution for you in the Pathling Helm chart.
The way this works is that you spin up a Pathling server container, give it permission, and tell it how to create its own workers to help it. You can configure how many you would like, how many resources to give them, and so on. The Pathling server will manage the resources within the Kubernetes cluster to make this happen.
Here is an example clustering configuration for use with the Pathling Helm chart.
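Under the hood this uses Spark's Kubernetes scheduler, so the relevant knobs are standard Spark properties. Purely as an illustration (the values are placeholders for your cluster, and the Helm chart may set several of these for you):

# Standard Spark-on-Kubernetes properties; values are placeholders.
spark.master: k8s://https://kubernetes.default.svc
spark.kubernetes.container.image: aehrc/pathling
spark.kubernetes.namespace: pathling   # placeholder namespace
spark.executor.instances: 2
spark.executor.memory: 4g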
Thanks. I made some attempts with Kubernetes, but I could not properly configure a custom S3 bucket.
fs.s3a.endpoint: https://us-mia-1.linodeobjects.com
fs.s3a.access.key: 123456789
fs.s3a.secret.key: 123456789
pathling.storage.warehouseUrl: "s3a://"
pathling.storage.databaseName: pathling
I'm getting this error: Caused by: java.lang.IllegalArgumentException: bucket is null/empty
What am I missing?
Do you need the bucket name in the pathling.storage.warehouseUrl variable?
For example, pathling.storage.warehouseUrl: "s3a://mybucket"?
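Putting that together with your settings above, the storage configuration might look like this (the bucket name and credentials are placeholders):

fs.s3a.endpoint: https://us-mia-1.linodeobjects.com
fs.s3a.access.key: <access key>
fs.s3a.secret.key: <secret key>
pathling.storage.warehouseUrl: "s3a://mybucket"   # bucket name included in the URL
pathling.storage.databaseName: pathling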
Let us know if you have any more problems, feel free to re-open the ticket.