
Training on large corpora is extremely slow; any way to parallelize the pattern detector?

Open jhashemi opened this issue 10 years ago • 5 comments

jhashemi avatar Nov 01 '14 21:11 jhashemi

+1, training is extremely slow.

nabilblk avatar Nov 02 '14 14:11 nabilblk

Can you provide your memory configuration? Please copy and paste the properties from neo4j.properties in the Neo4j conf directory.

Recommended memory settings are below:

neostore.nodestore.db.mapped_memory=512M
neostore.relationshipstore.db.mapped_memory=2048M
neostore.propertystore.db.mapped_memory=1024M
neostore.propertystore.db.strings.mapped_memory=500M
neostore.propertystore.db.arrays.mapped_memory=500M

This configuration assumes you have at least 8GB of available system memory.
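
As a rough sanity check on the 8GB figure, the mapped-memory values above can be summed. This is a minimal sketch, not part of Graphify or Neo4j: the helper names `to_bytes` and `total_mapped_mb` are hypothetical, and it assumes the `M` suffix means mebibytes.

```python
import re

# Values copied from the recommended settings above.
RECOMMENDED = {
    "neostore.nodestore.db.mapped_memory": "512M",
    "neostore.relationshipstore.db.mapped_memory": "2048M",
    "neostore.propertystore.db.mapped_memory": "1024M",
    "neostore.propertystore.db.strings.mapped_memory": "500M",
    "neostore.propertystore.db.arrays.mapped_memory": "500M",
}

# Assumption: K/M/G are binary multiples (1024-based).
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value: str) -> int:
    """Convert a size such as '512M' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([KMG])", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def total_mapped_mb(settings: dict) -> int:
    """Sum all mapped_memory settings and return the total in MB."""
    return sum(to_bytes(v) for v in settings.values()) // 1024 ** 2


print(total_mapped_mb(RECOMMENDED))  # prints 4584
```

The total comes to 4584 MB, so on an 8GB machine a little under half of system memory is left over for the JVM heap and the operating system.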

kbastani avatar Nov 03 '14 03:11 kbastani

That definitely helped training, but classification now takes upwards of 3 minutes per entity. This is on an HA cluster.

jhashemi avatar Nov 17 '14 06:11 jhashemi

Glad to hear it helped training. I'm going to need more information about your dataset in order to get you fixed up. You can reach me on Skype at kenny.bastani or e-mail [email protected].

kbastani avatar Nov 18 '14 07:11 kbastani

Using the recommended memory settings above certainly improves training speed (most requests are sub-second). Classification requests, however, take anywhere from 15 to 30 seconds. Is there any way to speed them up? Also, if multiple classify requests are sent in parallel, the server returns a 500.

letronje avatar Jan 25 '15 11:01 letronje