rdf-n3
Serializing large lists is slow
Dumping this list as N-Triples finishes almost instantly: `RDF::List(*(0...100)).dump(:ntriples)`, while `RDF::List(*(0...100)).dump(:n3)` is very slow.
My previous PR #21 improved the performance a bit, but later commits reduced the performance again.
Running `Benchmark.measure { RDF::List(*(0...100)).dump(:n3) }.to_s` on different commits (columns are user, system, total, and real time in seconds):
281f707 => 0.477319 0.024164 0.501483 (0.611887)
532485c => 2.693448 0.048810 2.742258 (3.432530)
f2938bc => 5.003711 0.026643 5.030354 (6.697221)
The more items in a list, the slower the serialization.
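To make the growth with list size visible, a small timing loop along these lines could be used. This is only a sketch: `time_by_size` is a hypothetical helper, not part of rdf-n3, and the workload shown is a stand-in so the snippet runs with the standard library alone.

```ruby
require 'benchmark'

# Hypothetical helper (not part of rdf-n3): runs the given block once per
# size and returns a Hash mapping size => wall-clock seconds.
def time_by_size(sizes)
  sizes.each_with_object({}) do |n, timings|
    timings[n] = Benchmark.realtime { yield n }
  end
end

# Stand-in workload so the sketch runs without extra gems. With the rdf and
# rdf-n3 gems loaded, the block body would instead be:
#   RDF::List(*(0...n)).dump(:n3)
timings = time_by_size([10, 100, 1000]) { |n| (0...n).map(&:to_s).join(' ') }

timings.each { |n, secs| puts format('%5d items: %.6f s', n, secs) }
```

Comparing the timings for sizes that differ by a factor of ten should give a rough idea of whether serialization time grows linearly or faster with the number of items.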
Do you have any ideas on how to improve the performance of large lists?
Well, the writer got a lot of work so that it can write out full N3 formulae rather than just Turtle, but most of that shouldn't have come into play here. My suspicion is this block:
https://github.com/ruby-rdf/rdf-n3/blob/234f7b2304d348cbeebcda36f22feb2407c126e2/lib/rdf/n3/writer.rb#L370-L376
I'll look into it later this week.
Note that `RDF::List(*(0...100)).dump(:ttl)` is fairly slow too, but not quite as slow as `:n3`.