incubator-hugegraph
incubator-hugegraph copied to clipboard
[Bug] vertex label schema append property key concurrent issue
Bug Type (问题类型)
rest-api (结果不合预期)
Before submit
- [X] 我已经确认现有的 Issues 与 FAQ 中没有相同 / 重复问题 (I have confirmed and searched that there are no similar problems in the historical issue and documents)
Environment (环境信息)
hugegraph with release-1.0 branch, but I doubt this is not fixed in master branch as well (need to double-check). Mysql is used as backend storage.
Expected & Actual behavior (期望与实际表现)
A very strange situation happens when concurrently appending property key to one vertex label. Some property key ids of 0 appears in the column of the vertex label table. After one whole day's investigation and code review, I found that this is due to weak concurrency control in VertexLabel Update Api.
problematic code
@Override
public VertexLabel append() {
---> // need to add lock here!
---> // we will get the same vertex label object in the memory.
VertexLabel vertexLabel = this.vertexLabelOrNull(this.name);
if (vertexLabel == null) {
throw new NotFoundException("Can't update vertex label '%s' " +
"since it doesn't exist", this.name);
}
this.checkStableVars();
this.checkProperties(Action.APPEND);
this.checkNullableKeys(Action.APPEND);
Userdata.check(this.userdata, Action.APPEND);
for (String key : this.properties) {
PropertyKey propertyKey = this.graph().propertyKey(key);
----> // this will add new property key id to concurrent-unsafe HashSet
vertexLabel.property(propertyKey.id());
}
for (String key : this.nullableKeys) {
PropertyKey propertyKey = this.graph().propertyKey(key);
vertexLabel.nullableKey(propertyKey.id());
}
vertexLabel.userdata(this.userdata);
this.graph().updateVertexLabel(vertexLabel);
return vertexLabel;
}
and then
private void saveSchema(SchemaElement schema, boolean update,
Consumer<SchemaElement> updateCallback) {
// Lock for schema update
LockUtil.Locks locks = new LockUtil.Locks(this.params().name());
try {
locks.lockWrites(LockUtil.hugeType2Group(schema.type()), schema.id());
if (updateCallback != null) {
// NOTE: Do schema update in the lock block
updateCallback.accept(schema);
}
// System schema just put into SystemSchemaStore in memory
if (schema.longId() < 0L) {
this.systemSchemaStore.add(schema);
return;
}
----> // this will then read the cached object with the concurrent-unsafe hash set inside it.
BackendEntry entry = this.serialize(schema);
this.beforeWrite();
if (update) {
this.doUpdateIfPresent(entry);
// TODO: also support updateIfPresent for index-update
this.indexTx.updateNameIndex(schema, false);
} else {
// TODO: support updateIfAbsentProperty (property: label name)
this.doUpdateIfAbsent(entry);
this.indexTx.updateNameIndex(schema, false);
}
this.afterWrite();
} finally {
locks.unlock();
}
}
and then look at MysqlSerializer
@Override
public BackendEntry writeVertexLabel(VertexLabel vertexLabel) {
TableBackendEntry entry = newBackendEntry(vertexLabel);
entry.column(HugeKeys.ID, vertexLabel.id().asLong());
entry.column(HugeKeys.NAME, vertexLabel.name());
entry.column(HugeKeys.ID_STRATEGY, vertexLabel.idStrategy().code());
entry.column(HugeKeys.PROPERTIES,
---> here is issue this.toLongSet(vertexLabel.properties()));
entry.column(HugeKeys.PRIMARY_KEYS,
---> here is issue this.toLongList(vertexLabel.primaryKeys()));
entry.column(HugeKeys.NULLABLE_KEYS,
---> here is issue this.toLongSet(vertexLabel.nullableKeys()));
entry.column(HugeKeys.INDEX_LABELS,
---> here is issue this.toLongSet(vertexLabel.indexLabels()));
this.writeEnableLabelIndex(vertexLabel, entry);
this.writeUserdata(vertexLabel, entry);
entry.column(HugeKeys.STATUS, vertexLabel.status().code());
entry.column(HugeKeys.TTL, vertexLabel.ttl());
entry.column(HugeKeys.TTL_START_TIME,
vertexLabel.ttlStartTime().asLong());
return entry;
}
and then let's look at the toLongSet/to.LongList method. Collection<Id> ids
is actually the cached vertex label's property and may be modified by other threads concurrently, so we cannot use ids.size() and for-loop to copy the items from collection to array! This will lead to the tailed unwanted 0s in the serialized result!
@Override
protected Object toLongList(Collection<Id> ids) {
---> // this is the bug!
long[] values = new long[ids.size()];
int i = 0;
---> // this is the bug!
for (Id id : ids) {
values[i++] = id.asLong();
}
return JsonUtil.toJson(values);
}
My advice is to add a lock in the vertex label api to protect this code zone.
Vertex/Edge example (问题点 / 边数据举例)
No response
Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)
No response