sedlex icon indicating copy to clipboard operation
sedlex copied to clipboard

Buffer channel input, return None at the end of partial reads.

Open toots opened this issue 5 years ago • 7 comments

Finally came up with the proper fix for #45 !

This PR adds proper buffering on channel reads while also inserting a None entry at the end of each partial read. This allow the fill_buf_from_gen function to return without reaching blocking read on the buffer.

toots avatar Apr 30 '19 08:04 toots

It is probably only necessary to be careful about blocking on interactive uses; for non-interactive uses it probably slows things down a bit (and that can be significant when compiling a large code base; often lexing is one of the most time-consuming steps). I wonder if we should have two separate functions for this. Several other lexical analyzer generators have specific options for interactive vs. non-interactive use. Perhaps @Drup has an opinion.

pmetzger avatar Apr 30 '19 13:04 pmetzger

A separate function could work too. I'm already doing that in the app itself.

toots avatar Apr 30 '19 13:04 toots

@pmetzger I'm not so convinced that having two modes is important. We would need benchmarks to really make a decision.

Drup avatar May 01 '19 13:05 Drup

@Drup A benchmark is probably a good idea.

pmetzger avatar May 01 '19 23:05 pmetzger

@toots I noticed this one was still open. What should we do about it?

pmetzger avatar Jul 23 '19 15:07 pmetzger

Thanks for bringing this one up!

We have used this function in our application for a while and it's working pretty well. I say we could either merge it as it is or, if we can to be conservative with features, add those changes as a from_interactive_channel.

toots avatar Jul 23 '19 16:07 toots

Do you think you could do a quick benchmark to see if we should bother with a distinct mode?

pmetzger avatar Jul 26 '19 21:07 pmetzger

Running a quick benchmark, it seems that this PR is actually faster than the implementation on master, probably due to the fact that we perform less c calls.

That said, I cannot convince myself that the implementation is correct. In particular, I don't understand how we can be sure that we don't return None at the very start of the refill call (which would look like an EOF)

hhugo avatar Feb 20 '23 10:02 hhugo

Also, I think we would have invalid results if a give-back-control None was returned in the middle of Utf8.Helper.from_gen.

hhugo avatar Feb 20 '23 11:02 hhugo

I have an alternative fix for #45, see #124

hhugo avatar Feb 20 '23 12:02 hhugo

This can be closed @toots

hhugo avatar Feb 22 '23 17:02 hhugo

Indeed! @hhugo do you have other pending changes? I'd say we should do a release soon now.

toots avatar Feb 23 '23 15:02 toots

I don't have other changes planned. I wanted to run some more tests with jsoo first maybe.

hhugo avatar Feb 23 '23 15:02 hhugo

@toots, I run tests with jsoo, all good. I've just open one last fix #126

hhugo avatar Feb 24 '23 00:02 hhugo