rdf-n3

Serializing large lists is slow

Open ArthurWD opened this issue 5 years ago • 2 comments

Dumping this list as N-Triples finishes almost instantly: RDF::List(*(0...100)).dump(:ntriples).

By contrast, RDF::List(*(0...100)).dump(:n3) is very slow.

My previous PR #21 improved performance a bit, but later commits made it worse again.

Running Benchmark.measure { RDF::List(*(0...100)).dump(:n3) }.to_s on different commits:

281f707 => 0.477319 0.024164 0.501483 (0.611887)
532485c => 2.693448 0.048810 2.742258 (3.432530)
f2938bc => 5.003711 0.026643 5.030354 (6.697221)

The more items in a list, the slower the serialization.

Do you have any ideas on how to improve the performance of large lists?

ArthurWD • May 02 '19 14:05
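The report that serialization gets slower as the list grows could be quantified by timing several list sizes and fitting a slope on a log-log scale. The following is only a sketch: growth_exponent and time_sizes are hypothetical helper names, not part of rdf-n3, and the sizes are arbitrary. A slope near 1 would suggest roughly linear growth, while a slope near 2 would point to something quadratic in the writer.

```ruby
require 'benchmark'

# Least-squares slope of log(time) against log(size).
# Hypothetical helper: a slope near 1 suggests linear growth,
# a slope near 2 suggests quadratic growth.
def growth_exponent(sizes, times)
  xs = sizes.map { |n| Math.log(n) }
  ys = times.map { |t| Math.log(t) }
  mean_x = xs.sum / xs.size
  mean_y = ys.sum / ys.size
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  variance = xs.sum { |x| (x - mean_x)**2 }
  covariance / variance
end

# Hypothetical helper: run the block once per size and
# return the wall-clock seconds for each run.
def time_sizes(sizes)
  sizes.map { |n| Benchmark.realtime { yield n } }
end

begin
  require 'rdf'
  require 'rdf/n3'

  sizes = [25, 50, 100, 200] # arbitrary sizes chosen for illustration
  times = time_sizes(sizes) { |n| RDF::List(*(0...n)).dump(:n3) }
  puts format('estimated growth exponent: %.2f', growth_exponent(sizes, times))
rescue LoadError
  warn 'rdf / rdf-n3 gems not installed; skipping the measurement'
end
```

Single runs are noisy, so repeating each size a few times and taking the minimum would likely give a steadier estimate.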

Well, the writer did get a lot of work to be able to write out full N3 Formulae, vs. just Turtle, but most of that shouldn't have come into play. My suspicion is in this block:

https://github.com/ruby-rdf/rdf-n3/blob/234f7b2304d348cbeebcda36f22feb2407c126e2/lib/rdf/n3/writer.rb#L370-L376

I'll look into it later this week.

gkellogg • May 02 '19 15:05

Note that RDF::List(*(0...100)).dump(:ttl) is fairly slow too, but not quite as slow as :n3.

gkellogg • May 03 '19 17:05
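To compare the three serializations mentioned in this thread side by side, a small script along these lines might help. It is a sketch under assumptions: compare_timings is a hypothetical helper, and it assumes the rdf, rdf-turtle, and rdf-n3 gems are installed so that the :ntriples, :ttl, and :n3 formats are all available.

```ruby
require 'benchmark'

# Hypothetical helper: run the block once per label and return
# { label => wall-clock seconds }, ordered from fastest to slowest.
def compare_timings(labels)
  labels.to_h { |label| [label, Benchmark.realtime { yield label }] }
        .sort_by { |_, seconds| seconds }
        .to_h
end

begin
  require 'rdf'
  require 'rdf/ntriples'
  require 'rdf/turtle'
  require 'rdf/n3'

  # Build a fresh 100-item list for each format, as in the original report.
  results = compare_timings(%i[ntriples ttl n3]) do |fmt|
    RDF::List(*(0...100)).dump(fmt)
  end
  results.each { |fmt, seconds| puts format('%-9s %.3fs', fmt, seconds) }
rescue LoadError
  warn 'rdf, rdf-turtle, or rdf-n3 gem not installed; skipping the comparison'
end
```

Running the same script against different rdf-n3 commits would show whether the gap between :ttl and :n3 widens or narrows over time.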