
Inference for Large Scale Data

Open rac021 opened this issue 8 years ago • 2 comments

Hi,

I'm using the Corese reasoner to infer data. My problem is that I have a large Turtle file, so I'm expecting memory errors. To deal with this, I split the large file into several files of reasonable size.

I intend to launch the reasoner on each of these files.

My question is: after the inference, would I get the same final result as if I had one large Turtle file?

If not, what is the best strategy in this case for running inference on large data?

Thanks,

R

rac021 avatar Jan 06 '17 18:01 rac021

Hi,

On 01/06/2017 07:23 PM, Yahiaoui Rachid wrote:

I'm using the Corese reasoner to infer data. My problem is that I have a large Turtle file, so I'm expecting memory errors. To deal with this, I split the large file into several files of reasonable size.

I intend to launch the reasoner on each of these files.

My question is: after the inference, would I get the same final result as if I had one large Turtle file?

Yes if you load all the files in the same session, no otherwise.
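
To illustrate why the session matters, here is a minimal Python sketch. It is a toy forward-chaining loop, not Corese's rule engine, and the `ex:` names, the two chunks, and the choice of rules (RDFS `rdfs9` and OWL RL `prp-trp`) are assumptions made for the example. It compares one closure over all the data with separate closures per chunk, with the ontology loaded alongside each chunk.

```python
# Toy forward-chaining closure; illustrative only, not Corese's rule engine.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
TRANSITIVE = "owl:TransitiveProperty"

def closure(triples):
    """Apply two rules repeatedly until no new triple is produced."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                # rdfs9: (x type C), (C subClassOf D) => (x type D)
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # prp-trp: (x p y), (y p z), p transitive => (x p z)
                if p == p2 and o == s2 and (p, TYPE, TRANSITIVE) in inferred:
                    new.add((s, p, o2))
        if new <= inferred:
            return inferred
        inferred |= new

# Hypothetical ontology and two data chunks (stand-ins for the split files).
ontology = {("ex:Cat", SUBCLASS, "ex:Animal"),
            ("ex:partOf", TYPE, TRANSITIVE)}
chunk_a = {("ex:tom", TYPE, "ex:Cat"),
           ("ex:wheel", "ex:partOf", "ex:car")}
chunk_b = {("ex:car", "ex:partOf", "ex:fleet")}

# One session: everything is loaded before the rules run.
same_session = closure(ontology | chunk_a | chunk_b)

# Separate sessions: each chunk is reasoned over with the ontology only.
per_file = closure(ontology | chunk_a) | closure(ontology | chunk_b)

# Triples that only the single-session run can derive.
missing = same_session - per_file
print(missing)
```

In this sketch the subclass inference survives the split, because it needs only one data triple plus the ontology. The transitive inference is lost, because its two premises sit in different chunks. More generally, a rule whose premises include two or more data triples can miss results when those triples end up in different files.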

If not, what is the best strategy in this case for running inference on large data?

What is the size of the file?

Best regards,

Olivier

ocorby avatar Jan 09 '17 08:01 ocorby

In my tests I'm working with files of about 30 GB, but I have to expect much more than that, as I work with large volumes of data.

Given that I have no owl:TransitiveProperty in the ontology, should I still load all the files in the same session? (I load the ontology together with each file when the reasoner runs.)

Thanks. R

rac021 avatar Jan 09 '17 08:01 rac021