Load a dataset with more than 32767 labels
OrientDB Version: 3.0.x
Java Version: any
OS: linux
<dependencies>
  <!-- https://mvnrepository.com/artifact/com.orientechnologies/orientdb-gremlin -->
  <dependency>
    <groupId>com.orientechnologies</groupId>
    <artifactId>orientdb-gremlin</artifactId>
    <version>3.0.30</version>
  </dependency>
  <dependency>
    <groupId>com.orientechnologies</groupId>
    <artifactId>orientdb-client</artifactId>
    <version>3.0.30</version>
  </dependency>
</dependencies>
Task
Load a dataset with more than 32767 different labels (e.g. DBpedia)
Issue
I need a way to load such a dataset.
Possibly related
- From issue https://github.com/orientechnologies/orientdb/issues/6577: set MINIMUMCLUSTERS with ALTER DATABASE MINIMUMCLUSTERS 1
- In OrientGraphFactory.java there is a method with a promising name, setLabelAsClassName(boolean is), but its documentation and implementation only talk about prefixes, not about any actual difference in how classes are handled (from 3.0.30):
/**
 * Enable or disable the prefixing of class names with V_<label> for vertices or E_<label> for edges.
 *
 * @param is if true classname equals label, if false classname is prefixed with V_ or E_ (default)
 */
public OrientGraphBaseFactory setLabelAsClassName(boolean is) {
  this.labelAsClassName = is;
  return this;
}
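Read literally, the Javadoc above describes only a naming rule. A minimal sketch of that rule in plain Java follows; it is not OrientDB's actual implementation, and the names LabelNaming and classNameFor are hypothetical, chosen only for illustration:

```java
public class LabelNaming {
    /**
     * Maps a graph label to a class name as the Javadoc describes it:
     * unchanged when labelAsClassName is true, otherwise prefixed
     * with V_ (vertex) or E_ (edge). Hypothetical helper, not library code.
     */
    public static String classNameFor(String label, boolean isVertex, boolean labelAsClassName) {
        if (labelAsClassName) {
            return label;
        }
        return (isVertex ? "V_" : "E_") + label;
    }

    public static void main(String[] args) {
        System.out.println(classNameFor("Person", true, false)); // default: prefixed vertex class
        System.out.println(classNameFor("knows", false, false)); // default: prefixed edge class
        System.out.println(classNameFor("Person", true, true));  // label used as class name
    }
}
```

Under this reading every label still maps to exactly one class either way, so the flag alone would not be expected to change how many labels can be loaded.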
Could someone tell me whether this is possible at all and, if so, point me in the right direction? Thank you in advance, Martin
Update:
Using ALTER DATABASE MINIMUMCLUSTERS 1 does indeed make it possible to load datasets with more than 32767 labels.
This conflicts with the statement from @smolinari in #6577 on Aug 19, 2016:
"You can change this behavior by running ALTER DATABASE MINIMUMCLUSTERS 1, before you start loading data, which will then give you the full 32676 available classes. This could be at the cost of future performance, however."
That discussion, however, was about v2.x.
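The arithmetic behind the quoted statement can be sketched as follows. It assumes the limit comes from a cap of 32767 clusters per database (cluster ids fitting in a signed 16-bit value) and that every class reserves MINIMUMCLUSTERS clusters. Both are assumptions made for illustration, and the names ClusterBudget and maxClasses are hypothetical; the result reported in this update suggests 3.0.x does not behave this way.

```java
public class ClusterBudget {
    // Assumed cap on clusters per database: the largest signed 16-bit value (32767).
    static final int MAX_CLUSTERS = Short.MAX_VALUE;

    /**
     * Number of classes that fit if each class reserves clustersPerClass
     * clusters. This models the v2.x-era statement only; it is an assumption.
     */
    public static int maxClasses(int clustersPerClass) {
        if (clustersPerClass < 1) {
            throw new IllegalArgumentException("clustersPerClass must be >= 1");
        }
        return MAX_CLUSTERS / clustersPerClass;
    }

    public static void main(String[] args) {
        // Compare one cluster per class with larger per-class reservations.
        for (int n : new int[] {1, 4, 8}) {
            System.out.println("MINIMUMCLUSTERS " + n + " -> at most " + maxClasses(n) + " classes");
        }
    }
}
```

Under this model, one cluster per class is the only setting that leaves the full budget available for classes, which matches the quoted advice but would still cap the count at 32767.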
The only references to MINIMUMCLUSTERS in the v3.x docs say nothing about scalability in terms of labels/classes; they only talk about speeding up inserts in a multi-threaded environment.
So you may now mark this issue as a documentation request.
I believe, if I recall correctly, that they wanted to increase that limit in 3.0. Not sure though. It's been a while since I've had interest in ODB. 😄
Scott