
Async parsing/serializing

Open damooo opened this issue 2 years ago • 2 comments

In web contexts, where most web frameworks are async, it would help if the parsers were async like their JS counterparts. Since the parsers already don't require the full content in memory, it should be straightforward to make them async. It would mostly take converting Read => AsyncRead, Write => AsyncWrite, and iterators to streams, and making functions async, without changing any of the parsing logic.

It would be a significant breaking change, but it would make the parsers non-blocking and far more efficient in concurrent contexts. There could also be sync adapters for any synchronous usage.

damooo avatar Aug 10 '22 15:08 damooo

Hi! Thank you for this proposal.

I am not at ease with the idea of making everything async. It makes the code heavier (reading each byte might be an async break point) and requires choosing between Tokio's AsyncRead and the futures crate's AsyncRead.

I am currently investigating a low-level parsing API that would allow partial writes in a nice way: the caller would provide a reference to a buffer containing a chunk of the data, and the parser would consume it, yield some triples/quads, and update its state to be ready to receive a new chunk of data (or an EOF signal). This way the parser would not depend on any I/O API, and writing adapters for any Read or AsyncRead trait should be quite easy.
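To make the idea concrete, here is a minimal sketch of such a chunk-fed parser, using only the standard library. It is an illustration, not the actual Rio or Oxigraph API: `ChunkParser`, `Triple`, `parse_from_reader`, and the toy one-statement-per-line grammar (`subject predicate object .`) are all hypothetical names and simplifications chosen for this example.

```rust
use std::io::{self, Read};

/// Hypothetical statement type for the toy grammar: three terms per line.
#[derive(Debug, PartialEq, Eq)]
pub struct Triple {
    pub subject: String,
    pub predicate: String,
    pub object: String,
}

/// Hypothetical push parser: it owns a byte buffer and performs no I/O itself.
#[derive(Default)]
pub struct ChunkParser {
    buffer: Vec<u8>,
    is_end: bool,
}

impl ChunkParser {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed the next chunk of input; chunks may split a statement anywhere.
    pub fn extend_from_slice(&mut self, chunk: &[u8]) {
        self.buffer.extend_from_slice(chunk);
    }

    /// Signal that no more input will arrive (the EOF signal).
    pub fn end(&mut self) {
        self.is_end = true;
    }

    /// Returns the next complete statement, or `None` when the buffered
    /// data holds no complete statement (more input is needed, or all
    /// input has been consumed after `end()`).
    pub fn read_next(&mut self) -> Option<Result<Triple, String>> {
        loop {
            let newline = self.buffer.iter().position(|&b| b == b'\n');
            let line: Vec<u8> = match newline {
                // A full line is buffered: remove it, newline included.
                Some(i) => self.buffer.drain(..=i).collect(),
                // After EOF, a trailing line without a newline is complete.
                None if self.is_end && !self.buffer.is_empty() => {
                    std::mem::take(&mut self.buffer)
                }
                None => return None,
            };
            let text = String::from_utf8_lossy(&line);
            let terms: Vec<&str> = text.split_whitespace().collect();
            match terms.as_slice() {
                [] => continue, // skip blank lines
                [s, p, o, "."] => {
                    return Some(Ok(Triple {
                        subject: s.to_string(),
                        predicate: p.to_string(),
                        object: o.to_string(),
                    }))
                }
                _ => return Some(Err(format!("malformed line: {}", text.trim()))),
            }
        }
    }
}

/// Blocking adapter over `std::io::Read`. An adapter for an async reader
/// would have the same shape, awaiting each read instead of blocking.
pub fn parse_from_reader<R: Read>(mut reader: R) -> io::Result<Vec<Result<Triple, String>>> {
    let mut parser = ChunkParser::new();
    let mut out = Vec::new();
    // Deliberately tiny chunk size so statements get split across reads.
    let mut chunk = [0u8; 8];
    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            parser.end();
        } else {
            parser.extend_from_slice(&chunk[..n]);
        }
        while let Some(item) = parser.read_next() {
            out.push(item);
        }
        if n == 0 {
            return Ok(out);
        }
    }
}
```

All the parsing state lives in `ChunkParser`, so the only I/O-specific code is the small loop in `parse_from_reader`. Under these assumptions, supporting Tokio or futures would mean rewriting that loop, not the parser.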

What do you think about it?

Tpt avatar Aug 10 '22 16:08 Tpt

Rust async is efficient. As for the number of async break points, async I/O would naturally be used with a buffered AsyncBufRead, just like its sync counterpart BufRead. Apart from the advantage of being non-blocking, everything else would stay much the same. On ecosystem fragmentation, futures::io is a good choice, and there are async-compat wrapper libraries that let us use it in Tokio contexts without much trouble.

But it would just work if parsing is decoupled from I/O as you described. Then one can choose whichever I/O mode fits.

damooo avatar Aug 11 '22 11:08 damooo

I rewrote the Turtle parser from scratch in the main Oxigraph repo. The new parser is now compatible with Tokio I/O traits: https://docs.rs/oxttl/0.1.0-alpha.1/oxttl/turtle/struct.TurtleParser.html#method.parse_tokio_async_read

Tpt avatar Jan 03 '24 17:01 Tpt