ocaml-cohttp
simple parsing/printing of requests/responses
I would like to parse simple test data. For example, here's a file with a request:
POST / http/1.1
Content-Type:application/x-www-form-urlencoded; charset=utf8
Date:Mon, 09 Sep 2011 23:36:00 GMT
Host:host.foo.com

foo=bar
I'd like a function like this:
val of_file : filename -> (Request.t * Body.t)
where Body.t = string. The closest I've managed to figure out is Cohttp.Request.Make(Cohttp.String_io.M), but AFAICT I can't get the body from any of the resulting functions.
I'll also want a similar function to parse responses.
All this is provided for Lwt and Async, but I don't see solutions for the simpler case of data in plain strings.
I think that function only deals with the header, but I'm still having trouble getting the body. After instantiating the functor I mentioned, has_body just returns `Unknown.
Hi, @agarwal
I wrote a short piece of code and it works well.
(** ocamlbuild -use-ocamlfind -pkg cohttp simple_parsing.native *)
open Cohttp

module type S = sig
  val parse : string -> (Request.t * Body.t)
end

module StringParse : S = struct
  module Req = Cohttp.Request.Make(Cohttp.String_io.M)

  let parse str =
    let open String_io.M in
    let ic = String_io.open_in str in
    Req.read ic >>= fun result ->
    match result with
    | `Ok req -> begin
        let reader = Req.make_body_reader req ic in
        (* Chunks are consed onto [acc], so they accumulate in reverse order. *)
        let rec loop acc =
          Req.read_body_chunk reader >>= (fun result ->
            match result with
            | Transfer.Chunk str -> loop (str :: acc)
            | Transfer.Final_chunk str -> str :: acc
            | Transfer.Done -> acc) in
        (* Restore the original chunk order before building the body. *)
        let body = loop [] |> List.rev |> Body.of_string_list in
        req, body end
    | `Invalid _ -> assert false
    | `Eof -> assert false
end

let str = "GET / HTTP/1.1\r\nhost: example.com\r\ncontent-length:3\r\n\r\n123"
let () = ignore (StringParse.parse str)
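The original request asked for an of_file function rather than a string parser. A minimal sketch of that wrapper, using only the standard library, might look like the following. Here read_file and of_file are illustrative names, and the parser is passed in as an argument so the sketch does not depend on cohttp; in this thread's setting it would be something like StringParse.parse.

```ocaml
(* Read a whole file into a string, using only the standard library.
   [read_file] is a hypothetical helper, not part of cohttp. *)
let read_file filename =
  let ic = open_in_bin filename in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* [of_file parse filename] applies a string parser to the file contents.
   [parse] is a parameter (an assumption of this sketch) so that any
   string -> 'a parser can be plugged in. *)
let of_file parse filename = parse (read_file filename)
```

With the module above in scope, of_file StringParse.parse "request.txt" would then have the requested shape, where "request.txt" is a placeholder file name.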
A pull req for this would be great!
@rgrinberg What do you think about me contributing some blocking I/O code alongside /async, /lwt, and /lwt-core? Should I create another directory called /block_io?
@marklrh Thanks. I was doing at least two things wrong. Still unclear why has_body returns `Unknown, but I don't actually need that right now.
It would probably help beginners to have a blocking version of cohttp. I've recently re-organized future and biocaml to support this kind of thing. Each sub-directory under my lib/ directories provides a separate library with different dependencies.
> Should I create another directory called /block_io?
FYI, here's some internal documentation I follow for choosing names in this context. (I'm not confident that I've got this right. I welcome comments).
We provide a separate library for each of several architectures, which are defined by two parameters:
- Purity: Whether or not the library makes Unix calls or has bindings to C code. If the library does either of these, we label it unix. If it does neither, we label it pure.
- Concurrency: If the library uses lwt or async, we label it as such. If it makes blocking calls, we don't assign it any label, i.e. the label is the empty string (because this case is already covered by the unix label of the purity criteria).
Thus, there are 6 combinations possible: pure, lwt-pure, async-pure, unix, async-unix, and lwt-unix.
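As a rough illustration of how the two parameters combine into the six labels, here is a small sketch. The type and function names are hypothetical and exist only for this example; they are not taken from future, biocaml, or cohttp.

```ocaml
(* Hypothetical types modelling the two parameters described above. *)
type purity = Pure | Unix
type concurrency = Blocking | Lwt | Async

(* Build the library label: blocking code contributes the empty string,
   so only lwt and async add a prefix. *)
let label purity concurrency =
  let p = match purity with Pure -> "pure" | Unix -> "unix" in
  match concurrency with
  | Blocking -> p
  | Lwt -> "lwt-" ^ p
  | Async -> "async-" ^ p

(* Enumerate every combination of the two parameters. *)
let all_labels =
  List.concat
    (List.map
       (fun p -> List.map (label p) [ Blocking; Lwt; Async ])
       [ Pure; Unix ])
```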
A content-length header appears to be required. Is that correct? Without that, as in my original example, I always get an End_of_file error.
@marklrh Would this blocking IO depend on unix?
IIRC @seliopou worked on this before.
Let me recall what this would require.
I don't think it will depend on Unix. We will use the String_io module to read and write.
/cc @kayceesrk who was also interested in building an experimental effects-based implementation.
@agarwal I believe so. See section 14 of the HTTP RFC (RFC 2616).
@avsm @kayceesrk I would love to help with an effects-based implementation. It looks interesting.
@marklrh Section 4.4 seems to be the most relevant, and AFAICT it is not absolutely required. Too bad all these specs are informally written.
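For simple test files like the one at the top of this thread, one lenient option is to honour Content-Length when it is present and otherwise treat everything after the blank line as the body. The sketch below uses only the standard library; find_sub, split_head, content_length, and body_of_raw are hypothetical helpers, not cohttp functions, and the fallback is a convenience for test data rather than what the RFC prescribes for requests.

```ocaml
(* Index of the first occurrence of [sub] in [s] at or after [from]. *)
let find_sub s sub from =
  let n = String.length s and m = String.length sub in
  let rec go i =
    if i + m > n then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go from

(* Split a raw message into (head, rest) at the first blank line.
   If there is no blank line, the whole input is treated as the head. *)
let split_head raw =
  match find_sub raw "\r\n\r\n" 0 with
  | Some i ->
    let start = i + 4 in
    (String.sub raw 0 i, String.sub raw start (String.length raw - start))
  | None -> (raw, "")

(* Look up Content-Length in the head, matching the name case-insensitively. *)
let content_length head =
  String.split_on_char '\n' head
  |> List.find_map (fun line ->
         match String.index_opt line ':' with
         | None -> None
         | Some i ->
           let name = String.lowercase_ascii (String.trim (String.sub line 0 i)) in
           let value =
             String.trim (String.sub line (i + 1) (String.length line - i - 1))
           in
           if name = "content-length" then int_of_string_opt value else None)

(* Use Content-Length when it is present and fits; otherwise (an assumption
   of this sketch) fall back to everything after the blank line. *)
let body_of_raw raw =
  let head, rest = split_head raw in
  match content_length head with
  | Some n when 0 <= n && n <= String.length rest -> String.sub rest 0 n
  | _ -> rest
```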
I guess the question is whether Cohttp exposes lower-level parsing that does fewer checks. In Biocaml, I often structure parsers with types t0, t1, ..., t, where t0 is the least parsed, maybe just a string, t1 parses a bit more, and so on, until t, a type that enforces absolutely every requirement. This firstly provides an escape hatch in case of non-compliant data (very common in bioinformatics), and also aids efficiency, since you can selectively do less parsing when you don't need the richer types.
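To make the layering concrete, here is a small sketch applied to an HTTP request line. Every type and function name is hypothetical and chosen only for illustration; none of this is Biocaml's or Cohttp's actual API. The loosely parsed stage accepts the "POST / http/1.1" line from the original example, while the strict stage in this sketch rejects its lowercase version token.

```ocaml
(* t0: the raw request line, nothing checked. *)
type t0 = string

(* t1: split into three tokens, contents still unchecked. *)
type t1 = { raw_meth : string; raw_target : string; raw_version : string }

(* t: the strictest layer of this sketch. *)
type meth = Get | Post | Other of string
type version = Http_1_0 | Http_1_1
type t = { meth : meth; target : string; version : version }

(* Stage 1: only require three space-separated tokens. *)
let t1_of_t0 (line : t0) : (t1, string) result =
  match String.split_on_char ' ' (String.trim line) with
  | [ raw_meth; raw_target; raw_version ] ->
    Ok { raw_meth; raw_target; raw_version }
  | _ -> Error "expected: METHOD TARGET VERSION"

(* Stage 2: classify the method and insist on an exact version token. *)
let t_of_t1 (r : t1) : (t, string) result =
  let meth =
    match r.raw_meth with "GET" -> Get | "POST" -> Post | m -> Other m
  in
  match r.raw_version with
  | "HTTP/1.0" -> Ok { meth; target = r.raw_target; version = Http_1_0 }
  | "HTTP/1.1" -> Ok { meth; target = r.raw_target; version = Http_1_1 }
  | v -> Error ("unsupported version: " ^ v)
```

A caller that only needs the target can stop at t1, and non-compliant input still yields a usable value at that level.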
@marklrh Then I think block_io is a bit of a misleading name. If it isn't using unix, how is it blocking? Something like string_io or string would fit better.
In any case, let's see a PR and we can kvetch about the name there.
@marklrh Would be great if you could help! Let's take the discussion off-thread.