N3.js
N3.js copied to clipboard
Is there a need/desire to Extract/Encapsulate the Details of the Three-Layered Index used in N3Store?
So, N3Store
uses a three-layered index to be "lightning fast". I was curious what was involved in how a three-layered index was implemented, and I was thinking it would be useful to be able to use the index without using an N3Store
. (In some of the code I maintain, I simply use an N3Store
instance basically as an index and I don't necessarily need the extra methods not related to index functionality.)
So, I studied the code and was able to isolate a JavaScript class representing the internals of the index. I refactored the N3Store
to use this new index class (I called it CounterBasedGraphsIndex
but maybe a different name like ThreeLayeredRdfJsTermsIndex
would be better/more descriptive). The change moved about 500 lines of index specific code out of the N3Store
. The resulting N3Store
implementation is a little more than 300 lines (excluding the lines for the DatasetCoreAndReadableStream
class embedded in the N3Store.js
module) from over 800.
This refactoring opens up the possibility to easily pick different index implementations that an N3Store
uses. This could be a configuration option to the N3Store
's constructor.
Because the new refactored implementation now delegates most of its work down to the index instance, I was concerned what the performance impact would be.
I ran the performance tests three times before making the changes. Then after refactoring, I ran the tests three more times. There doesn't appear to be any noticeable difference.
Here are the results:
// =============================================================================
// Before:
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 22.659s
* Memory usage for triples: 802MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:38.829 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 13.066s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 13.127s
- Adding 16777216 quads: 32.770s
* Memory usage for quads: 775MB
- Finding all 16777216 quads 1048576 times: 47.811s
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 22.315s
* Memory usage for triples: 802MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:38.796 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 12.979s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 13.213s
- Adding 16777216 quads: 30.257s
* Memory usage for quads: 705MB
- Finding all 16777216 quads 1048576 times: 45.865s
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 21.787s
* Memory usage for triples: 801MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:36.272 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 12.694s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 12.579s
- Adding 16777216 quads: 31.576s
* Memory usage for quads: 705MB
- Finding all 16777216 quads 1048576 times: 46.105s
// =============================================================================
// After:
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 22.664s
* Memory usage for triples: 802MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:40.410 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 13.164s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 13.013s
- Adding 16777216 quads: 30.485s
* Memory usage for quads: 705MB
- Finding all 16777216 quads 1048576 times: 46.852s
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 20.979s
* Memory usage for triples: 801MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:33.853 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 12.510s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 12.535s
- Adding 16777216 quads: 28.797s
* Memory usage for quads: 698MB
- Finding all 16777216 quads 1048576 times: 45.223s
// =============================================================================
N3Store performance test
- Adding 16777216 triples to the default graph: 21.640s
* Memory usage for triples: 802MB
- Finding all 16777216 triples in the default graph 65536 times (0 variables): 1:35.893 (m:ss.mmm)
- Finding all 16777216 triples in the default graph 131072 times (1 variable): 12.650s
- Finding all 16777216 triples in the default graph 196608 times (2 variables): 12.656s
- Adding 16777216 quads: 30.705s
* Memory usage for quads: 809MB
- Finding all 16777216 quads 1048576 times: 45.085s
The tests still pass with 100% coverage.
I'll submit a PR if anyone is interested in this.
Ace, @dhurlburtusa! Please make a PR indeed. Shall we name it N3Index
?