
Reverse operation for strings_to_bytes

Open xguerin opened this issue 6 years ago • 8 comments

Hello there,

I am trying to use Markup with Cohttp_lwt to do some HTML rewriting. I followed the example to get it to parse, but I am running into issues writing a Markup stream into a Cohttp body because there is no inverse of the function strings_to_bytes.

Here is what I am trying to do:

  let body = body
             |> Body.to_stream
             |> Markup_lwt.lwt_stream
             |> Markup.strings_to_bytes
             |> Markup.parse_html
             |> Markup.signals
             |> Markup.write_html
             |> Markup_lwt.to_lwt_stream
             |> Body.of_stream

Unfortunately, the last operation does not type-check because to_lwt_stream yields a char Lwt_stream.t while of_stream expects a string Lwt_stream.t. I could not find an inverse of strings_to_bytes that would do the job. Am I looking in the right place?
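
For illustration, the missing step could be approximated outside of Markup by regrouping the characters after to_lwt_stream. This is only a sketch under assumptions: chunk_chars and its size parameter are hypothetical names, not part of Markup.ml or Cohttp, and the code relies only on Lwt_stream.from and Lwt_stream.nget from Lwt.

```ocaml
open Lwt.Infix

(* Hypothetical helper, not part of Markup.ml: regroup a stream of chars
   into a stream of strings holding at most [size] characters each. *)
let chunk_chars ?(size = 4096) (chars : char Lwt_stream.t) : string Lwt_stream.t =
  if size <= 0 then invalid_arg "chunk_chars: size must be positive";
  Lwt_stream.from (fun () ->
    (* nget returns up to [size] elements, or [] once the input is exhausted. *)
    Lwt_stream.nget size chars >|= function
    | [] -> None
    | cs -> Some (String.of_seq (List.to_seq cs)))
```

Placed between Markup_lwt.to_lwt_stream and Body.of_stream, a helper like this would make the types line up. Note that each chunk is only emitted once size characters are available or the input ends, so it buffers rather than forwarding bytes as soon as they arrive.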

Thanks,

xguerin avatar Jan 04 '18 15:01 xguerin

Actually, reading the code, it looks like Html_writer and Xml_writer already return string streams, and write_html and write_xml transform that stream into a char stream.

EDIT

I locally added write_* functions that return a (string, 'a) stream and ended up with interesting EPIPE errors. As a workaround, I am passing through a string instead of a stream and it works.
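
A rough sketch of the string-based workaround, assuming the markup, lwt, and cohttp-lwt packages. The name rewrite_body is hypothetical, and this is not necessarily the exact code used here; it reads the whole body into memory, runs it through the synchronous Markup pipeline, and wraps the result in a new body.

```ocaml
open Lwt.Infix
module Body = Cohttp_lwt.Body

(* Hypothetical helper: parse and re-serialize an HTML body via a plain
   string instead of a stream. The whole document is held in memory. *)
let rewrite_body (body : Body.t) : Body.t Lwt.t =
  Body.to_string body >|= fun html ->
  html
  |> Markup.string        (* string -> (char, sync) stream *)
  |> Markup.parse_html
  |> Markup.signals
  |> Markup.write_html    (* signals -> (char, sync) stream *)
  |> Markup.to_string
  |> Body.of_string
```

This trades streaming for simplicity: nothing is sent until the full document has been parsed and written back out.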

xguerin avatar Jan 04 '18 16:01 xguerin

Thanks, I'll take a look. This seems like it would be helped by addressing #10.

aantron avatar Jan 05 '18 02:01 aantron

@xguerin, do you know what was causing the EPIPE signals?

aantron avatar Jan 05 '18 11:01 aantron

Yes, #10 would definitely be the answer (this is basically what I did locally). Regarding EPIPE, I am still investigating, but I think there may be an underlying problem. If I use plain strings instead of streams, at some point I get truncated/incomplete data (the most obvious consequence is that the web browser hangs while loading the page). However, if I skip Markup's parsing/writing phase, all is well (so it's not a Lwt/Cohttp issue).

xguerin avatar Jan 05 '18 14:01 xguerin

Would you want to submit your patch as a PR? Otherwise let me know, I'll do it :)

I'm still not sure whether you mean the second part is a bug in Markup.ml, and I'm not sure myself whether it would be. My guess would be no, because Markup.ml doesn't do its own I/O, but it could be triggering some kind of I/O pattern that it shouldn't trigger.

aantron avatar Jan 05 '18 14:01 aantron

Yes definitely. Re: the other issue, I'll look into it and let you know.

xguerin avatar Jan 05 '18 14:01 xguerin

More info on the choke/EPIPE: I am not able to reproduce it when reading from a file and writing to a file, or when reading from a socket (Cohttp_lwt_unix.Client) and writing to a file. So it looks like something is amiss with the server socket on the return path. More to come.

xguerin avatar Jan 06 '18 23:01 xguerin

Thanks for continuing to look into it.

aantron avatar Jan 07 '18 01:01 aantron