pyRDF2Vec

Improve efficiency of sampling strategies

Open GillesVandewiele opened this issue 4 years ago • 0 comments

🚀 Feature

Currently, the sampling techniques are rather slow. The depth-first-search (DFS) algorithm can potentially be improved by making use of smarter data structures and techniques such as caching.

Moreover, only a very naive mechanism is currently in place to avoid duplicate walks, and this should be improved as well: a tree should be built of everything that is already included in the walks. If all children (neighbors) of a node x are already included in the walks, then x should no longer be visited by the DFS.
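One possible shape for such a tree, as a rough sketch rather than a proposed implementation: each tree node stands for a walk prefix, and a prefix is marked exhausted once every neighbor below it is exhausted, so the sampler never descends into it again. The names `PrefixNode`, `sample_walk`, and `extract_walks`, and the plain adjacency-dict graph format, are hypothetical and not part of pyRDF2Vec. Tracking exhaustion per prefix instead of per vertex is my own assumption, made so that cycles in the graph are handled correctly.

```python
import random
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[str]]  # hypothetical adjacency-list format


class PrefixNode:
    """One node in the tree of walk prefixes that were already emitted."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixNode"] = {}
        self.exhausted = False


def sample_walk(graph: Graph, root: str, depth: int, tree: PrefixNode,
                rng: random.Random) -> Optional[Tuple[str, ...]]:
    """Sample one walk that was not returned before, or None if none is left."""
    if tree.exhausted:
        return None
    walk, path, node = [root], [tree], tree
    for _ in range(depth):
        # Only consider neighbors whose subtree still holds unseen walks.
        candidates = [
            n for n in dict.fromkeys(graph.get(walk[-1], []))
            if not (n in node.children and node.children[n].exhausted)
        ]
        if not candidates:  # dead end: the walk stops early
            break
        nxt = rng.choice(candidates)
        node = node.children.setdefault(nxt, PrefixNode())
        walk.append(nxt)
        path.append(node)
    node.exhausted = True  # this exact walk can never be drawn again
    # Propagate upwards: a prefix is exhausted once all its neighbors are.
    for parent, vertex in zip(reversed(path[:-1]), reversed(walk[:-1])):
        if all(n in parent.children and parent.children[n].exhausted
               for n in graph.get(vertex, [])):
            parent.exhausted = True
        else:
            break
    return tuple(walk)


def extract_walks(graph: Graph, root: str, depth: int, max_walks: int,
                  seed: int = 0) -> List[Tuple[str, ...]]:
    """Draw up to `max_walks` distinct walks starting at `root`."""
    rng, tree, walks = random.Random(seed), PrefixNode(), []
    while len(walks) < max_walks:
        walk = sample_walk(graph, root, depth, tree, rng)
        if walk is None:
            break
        walks.append(walk)
    return walks


toy = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
print(sorted(extract_walks(toy, "a", depth=2, max_walks=10)))
# [('a', 'b', 'd'), ('a', 'c', 'd'), ('a', 'c', 'e')]
```

With this layout no walk is sampled twice, and sampling stops by itself once the root prefix is exhausted, instead of retrying until a non-duplicate happens to be drawn.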

GillesVandewiele • Nov 02 '20 13:11