foundation icon indicating copy to clipboard operation
foundation copied to clipboard

Improve the performance of filter and lines

Open ndmitchell opened this issue 8 years ago • 6 comments

I ported weeder to foundation. It goes 30% slower compared to [Char], and I was hoping for a significant speedup (that was the entire point of the port). The two most egregious things in the profile are:

39.2   Foundation.String.UTF8 lines (0)
30.8   Foundation.String.UTF8 filter (210)

Namely I spend 39% in lines, and because of #331 I need to add a filter that takes 30% of the time.

ndmitchell avatar Jun 11 '17 07:06 ndmitchell

at the moment there's no way to go faster because it's using the Prelude implementation of lines and filter. relatively easy fix though.

vincenthz avatar Jun 11 '17 08:06 vincenthz

could you try running it through conduit just to check for speed difference ?

Something like this should workaround the slow perf of String.lines:

import Foundation.Conduit.Textual (lines)

runConduit (sinkSource s .| lines .| sinkList)

vincenthz avatar Jun 11 '17 08:06 vincenthz

Conduit goes way faster - it's now beating the [Char] version by 10% or so (which is still pretty crap - but does suggest the final result will be worth doing).

  • The revised benchmark shows lines taking only 5% (almost all of that in Textual.lines - suggesting that the conduit version might be reasonable as the final version...). Either way, copying the approach in Textual.lines is the right thing to do.
  • Filter now takes 48% of the runtime. I wonder if replace "\r" "" might work better? After #314 of course.
  • UTF8.fromBytes is now at 17%, so that's going to get to the bottleneck quite quickly.

ndmitchell avatar Jun 11 '17 09:06 ndmitchell

For info, moving to basement means giving up on the conduit lines function I was using, making this function more important.

ndmitchell avatar Jan 11 '18 10:01 ndmitchell

I think String.lines is now already much faster than it was and not using the compat Prelude.lines. it should really be similar in term of performance to conduit's lines since they are both using the same String.breakLine under the hood.

vincenthz avatar Jan 11 '18 10:01 vincenthz

Then you can probably close this issue (although adding a benchmark might be interesting to see the overhead of conduit).

ndmitchell avatar Jan 11 '18 10:01 ndmitchell