SynapseML
setting useSingleDatasetMode to True gives : java: malloc.c:4033: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.
If I keep everything else the same and set that flag to False, the code runs just fine. On the surface the error looks like a memory overflow, but single dataset mode is supposed to reduce the memory burden, among other things.
Could you please help me understand what is going wrong?
I am using SynapseML 0.9.4. My Spark version is 3.1.2.
A few more details that might help you:
cluster: 1 master, 2 workers - 64 GB, 16 cores each
training data on disk: 1.37 GB
spark-submit config:
spark.executor.memory=10g
spark.executor.instances=10
spark.executor.cores=3
spark.driver.memory=10g
spark.default.parallelism=54
spark.driver.cores=3
spark.driver.memoryOverhead=1024m
spark.executor.memoryOverhead=1024m
spark.dynamicAllocation.enabled=false
model parameters:
{
"numIterations":2500,
"learningRate":0.01,
"maxDepth":30,
"earlyStoppingRound":50,
"chunkSize":800000,
"parallelism":"voting_parallel",
"useSingleDatasetMode":true,
"numThreads":14
}
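For context, here is roughly how that parameter block would be passed to the SynapseML LightGBM estimator in PySpark. This is a minimal sketch: the `LightGBMClassifier` import and the label/feature column names are my assumptions (the issue does not say which estimator or columns are used), and the import is guarded so the snippet is illustrative even where synapseml is not installed.

```python
import json

# The exact parameter block from the issue, parsed into a dict.
params = json.loads("""
{
    "numIterations": 2500,
    "learningRate": 0.01,
    "maxDepth": 30,
    "earlyStoppingRound": 50,
    "chunkSize": 800000,
    "parallelism": "voting_parallel",
    "useSingleDatasetMode": true,
    "numThreads": 14
}
""")

try:
    # Requires the synapseml package on the cluster; Spark-ML style
    # params can be supplied as keyword arguments on construction.
    from synapse.ml.lightgbm import LightGBMClassifier  # assumed estimator

    model = LightGBMClassifier(
        labelCol="label",          # assumed column name
        featuresCol="features",    # assumed column name
        **params,
    )
except ImportError:
    model = None  # synapseml not available locally; dict above is the point
```

Flipping `params["useSingleDatasetMode"]` to `False` here is the only change between the failing and the working run described above.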
@imatiach-msft for visibility
@Nitinsiwach I wonder if you should instead try setting:
spark.executor.instances=32 (2 × 16 cores?) spark.executor.cores=1 spark.executor.memory=4g (64 GB per machine / 16 executors per machine?)
You can also try the opposite extreme:
spark.executor.instances=2 spark.executor.cores=16 spark.executor.memory=64g
I'm not sure about the second case, but the first one looks more similar to Databricks, which is the environment I've tested on the most.
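The arithmetic behind the two suggested layouts can be sketched as follows, assuming the cluster described in the issue (2 worker machines with 64 GB and 16 cores each); each machine is simply sliced into executors of a given core count:

```python
# Executor-sizing arithmetic for the two suggested spark-submit layouts.
MACHINES = 2
CORES_PER_MACHINE = 16
MEM_PER_MACHINE_GB = 64

def layout(cores_per_executor):
    """Return (total executors, memory per executor in GB) when each
    machine is divided into executors of the given core count."""
    executors_per_machine = CORES_PER_MACHINE // cores_per_executor
    mem_per_executor_gb = MEM_PER_MACHINE_GB // executors_per_machine
    return MACHINES * executors_per_machine, mem_per_executor_gb

# Many thin executors: 1 core each -> 32 executors of 4 GB.
print(layout(1))   # -> (32, 4)
# One fat executor per machine: 16 cores each -> 2 executors of 64 GB.
print(layout(16))  # -> (2, 64)
```

Note that neither extreme leaves headroom for memoryOverhead or the OS; in practice you would shave a little off `spark.executor.memory` in the fat-executor case.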