rdflib.js

Term memoization

Open rescribet opened this issue 6 years ago • 4 comments

As per request I've integrated some of the external changes to rdflib into the library.

This ~~is still a WIP~~ still contains some minor bugs, but the tests succeed and it works properly when integrated into our rdflib app with minimal changes, ~~but some additional code isn't ported yet, which might come this weekend~~. But the overall mechanics and changes are already visible.

The largest change is having to use factory methods rather than constructor calls. IIRC this could be overcome by using ES5 functional types rather than ES6 classes, but I can see value in using factories and leaving the constructor semantics unchanged, to prevent unexpected behaviour (e.g. new Literal('a') === new Literal('a') // => true).
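To make the factory-versus-constructor distinction concrete, here is a minimal sketch of a memoizing factory. The names `Literal`, `literalCache` and `literalFor` are illustrative assumptions, not the actual API of this PR, and the sketch ignores datatype and language tags, which a real cache key would have to include.

```javascript
// Hypothetical sketch: these names are illustrative, not rdflib.js's actual API.
class Literal {
  constructor(value) {
    this.termType = 'Literal';
    this.value = value;
  }
}

// One cache entry per distinct lexical value (datatype/language ignored here).
const literalCache = new Map();

// Factory: returns the cached instance when one exists, otherwise creates it.
function literalFor(value) {
  let term = literalCache.get(value);
  if (term === undefined) {
    term = new Literal(value);
    literalCache.set(value, term);
  }
  return term;
}

console.log(literalFor('a') === literalFor('a'));   // true: same cached instance
console.log(new Literal('a') === new Literal('a')); // false: constructor semantics unchanged
```

With this split, identity comparison only becomes meaningful for terms obtained through the factory, while direct constructor calls keep behaving as they always did.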

TODO:

  • [x] Check if remaining proxy code is still necessary, and port it if true
  • [x] Memoize the literals
  • [x] Performance benchmarks to back up the claims
  • [ ] (after approval) Update general documentation
  • [ ] (after approval) Look into releasing items from memory for extremely long-running processes

rescribet avatar Nov 09 '18 17:11 rescribet

So, I've created a Turtle parsing benchmark, and the results show about a 40% decrease in memory usage (~55% after garbage collection), and an 8% reduction in processing time.

Looking at the difference before and after the gc call, there seems to be some room left for reducing memory usage further.

rescribet avatar Nov 11 '18 22:11 rescribet

Collection is still missing from memoization, since its mutable nature might require a different strategy. Giving it a store index (or should that be a system index, since the class spans multiple stores?) would allow more performant IndexedFormula indexing, due to the way number indexes are handled compared with string indexes.
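As a rough illustration of the number-versus-string index idea, the sketch below gives every memoized term a numeric id when it is first created and keys a subject index on that id. Everything here (`termFor`, the `id` property, `subjectIndex`, `addStatement`) is a hypothetical stand-in, not how IndexedFormula is actually implemented.

```javascript
// Hypothetical sketch: a numeric id is assigned once, when a term is first created.
let nextTermId = 0;
const termsByValue = new Map();

function termFor(value) {
  let term = termsByValue.get(value);
  if (term === undefined) {
    term = { termType: 'NamedNode', value, id: nextTermId++ };
    termsByValue.set(value, term);
  }
  return term;
}

// Index keyed by the numeric id instead of by the term's string form.
const subjectIndex = [];

function addStatement(subject, predicate, object) {
  const statement = { subject, predicate, object };
  if (subjectIndex[subject.id] === undefined) {
    subjectIndex[subject.id] = [];
  }
  subjectIndex[subject.id].push(statement);
  return statement;
}

const alice = termFor('http://example.com/alice');
const knows = termFor('http://example.com/knows');
const bob = termFor('http://example.com/bob');
addStatement(alice, knows, bob);

console.log(subjectIndex[alice.id].length); // 1
```

A mutable Collection has no stable string form to memoize on, which is why an id assigned at creation time might be the more natural key for it.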

rescribet avatar Nov 13 '18 10:11 rescribet

Awesome work! Trying to write down what you, @vinnl and I just discussed f2f:

I do find it a bit scary that when you do:

var a = new A();
var b = new A();
a.value = 'x';
b.value = 'y';
console.log(a.value);

You would get 'y' instead of 'x'.
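A minimal sketch of how that could happen, assuming the constructor of A hands back a cached instance (the class body here is hypothetical, written only to reproduce the effect):

```javascript
// Hypothetical class A with a memoized constructor: every call yields one shared instance.
let sharedInstance;

class A {
  constructor() {
    if (sharedInstance !== undefined) {
      // Returning an object from a constructor replaces the freshly created `this`.
      return sharedInstance;
    }
    sharedInstance = this;
  }
}

const a = new A();
const b = new A();
a.value = 'x';
b.value = 'y';

console.log(a === b);  // true: both names point at the same object
console.log(a.value);  // 'y'
```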

So I think that we might be better off going for nodes that are just objects. So instead of:

var node = new NamedNode('http://example.com/friend')

we could maybe just do:

var node = {
  termType: 'NamedNode',
  value: 'http://example.com/friend'
}

That would also be a lot cheaper for passing these to webworkers and back.
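One way to read the web worker point, sketched under assumptions: data sent through postMessage is copied, and the copy does not keep a class prototype, so a class instance needs rebuilding on the other side while a plain object is usable as-is. The sketch uses a JSON round trip as a stand-in for crossing the worker boundary, and the `NamedNode` class here is a hypothetical minimal version, not the library's.

```javascript
// Hypothetical class-based term with a method on its prototype.
class NamedNode {
  constructor(value) {
    this.termType = 'NamedNode';
    this.value = value;
  }

  equals(other) {
    return !!other && other.termType === this.termType && other.value === this.value;
  }
}

// Stand-in for crossing a worker boundary: serialize, then deserialize.
const roundTrip = (term) => JSON.parse(JSON.stringify(term));

// A plain-object node arrives with everything it had.
const plain = { termType: 'NamedNode', value: 'http://example.com/friend' };
const plainCopy = roundTrip(plain);
console.log(plainCopy.termType, plainCopy.value);

// A class instance arrives as a bare object: data intact, prototype gone.
const classCopy = roundTrip(new NamedNode('http://example.com/friend'));
console.log(classCopy instanceof NamedNode); // false
console.log(typeof classCopy.equals);        // 'undefined'
```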

michielbdejong avatar Jul 24 '19 08:07 michielbdejong

I'm not sure how this PR affects conformance with the RDF/JS Data Model. As a note, one of the latest changes to that draft provides DataFactory#fromTerm and DataFactory#fromQuad, which can also take a JSON-serializable object and upgrade it to an RDF/JS-conformant object.

To go the other way and 'downgrade' an RDF/JS-conformant object to a JSON-serializable object, I recall an issue opened by @RubenVerborgh, https://github.com/rdfjs/data-model-spec/issues/94, which might need revisiting.

> That would also be a lot cheaper for passing these to webworkers and back.

I'm currently experimenting with a prototype based on https://github.com/PolymerLabs/actor-helpers, which aims to move as much as possible into web workers. An easy round trip between RDF/JS-conformant objects and JSON-serializable objects will be a requirement there.
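A rough sketch of what such a round trip might look like, handling only named nodes. The helpers `toPlainTerm` and `fromPlainTerm` and the class `NamedNodeTerm` are hypothetical names for illustration; `fromPlainTerm` is only loosely in the spirit of DataFactory#fromTerm and is not the spec's definition of it.

```javascript
// Hypothetical term class; only NamedNode is handled to keep the sketch short.
class NamedNodeTerm {
  constructor(value) {
    this.termType = 'NamedNode';
    this.value = value;
  }

  equals(other) {
    return !!other && other.termType === this.termType && other.value === this.value;
  }
}

// "Downgrade": keep only the JSON-serializable fields.
function toPlainTerm(term) {
  return { termType: term.termType, value: term.value };
}

// "Upgrade": rebuild a class instance from a plain object.
function fromPlainTerm(plain) {
  if (plain.termType !== 'NamedNode') {
    throw new Error(`Unsupported termType in this sketch: ${plain.termType}`);
  }
  return new NamedNodeTerm(plain.value);
}

const original = new NamedNodeTerm('http://example.com/friend');
const wire = JSON.stringify(toPlainTerm(original)); // what would cross the worker boundary
const restored = fromPlainTerm(JSON.parse(wire));

console.log(restored.equals(original)); // true
```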

elf-pavlik avatar Jul 24 '19 14:07 elf-pavlik