js_of_ocaml
js_of_ocaml copied to clipboard
Use javascript string by default
In that mode, an OCaml string will be represented as a JavaScript string containing a sequence of bytes. Converting the string to utf16 is still required for interaction with the DOM, or other javascript API.
This current alternative representation of strings (implemented in #923) is not set in stone and could be changed based on benchmarks and usability.
Here are a bunch of things to consider:
- Wrap JavaScript string inside an object (
MlString
) to ease debugging. - Update
string.ml
to make more optimization possible (String.sub
,(^)
, ..) - Add more optimization in jsoo itself https://github.com/ocsigen/js_of_ocaml/pull/977
The PR is useful to check that existing program don't make any assumption about the internal representation of strings.
- [x] Test JaneStreet libraries
- [x] Test jscoq
- [x] Test the jsoo toplevel
- [x] Merge #984
- [ ] Benchmark the two representations
I know this is WIP, but I'm curious. What would be the consequence for users, notably in term of FFI (and performances) ?
This is not really a WIP anymore. It's being tested now.
There should be no implication for FFI. One still need to use existing primitive/stubs to convert between js (utf16) and ocaml (utf8) strings.
I would expect some performances improvement but I didn't run any bench yet.
Hmm, maybe I misunderstood then ! If it's neither for FFI nor perf, what's the goal ? :)
It would be for perf ! We still need to run some bench.
(#655 contains all the logic and I believe does some nice cleanup)
@ejgallego, would you be able to test jscoq against this GPR ?
@ejgallego, would you be able to test jscoq against this GPR ?
Hi @hhugo , sorry I missed this. Yes, we can test soon.
@hhugo I get "caml_jsstring_of_string not implemented", I was able to fix that by adding stuff in mlbytes
and bigarray.js
manually; I got a bit far but still seeing some problems in the de-serialization of objects, likely due to the hacks we have there.
I don't understand why you get "caml_jsstring_of_string not implemented". Have you pin both js_of_ocaml and js_of_ocaml-compiler to this PR ?
I did pin a bunch of libs, let me re-pin everything again, indeed maybe I forgot to pin -compiler
Ok, now I got:
Uncaught TypeError: Cannot read property 'charCodeAt' of undefined
in Coq itself. Will switch to debug mode and research more.
I can help you if you need. Note that marshal on bytes will not work properly.
https://github.com/jscoq/jscoq/blob/v8.11/coq-js/js_stub/str.js#L63 (.c
) is possibly incorrect
Very well spotted indeed (cc: @corwin-of-amber )
Uncaught TypeError: Cannot read property 'charCodeAt' of undefined
at re_match_impl (:8080/coq-js/jscoq_worker.bc.js:7000)
at re_string_match (:8080/coq-js/jscoq_worker.bc.js:7783)
at string_match$0 (:8080/coq-js/jscoq_worker.bc.js:125117)
at bounded_split$0 (:8080/coq-js/jscoq_worker.bc.js:125343)
at split$5 (:8080/coq-js/jscoq_worker.bc.js:125393)
at split_flags (:8080/coq-js/jscoq_worker.bc.js:151618)
at flags_of_string (:8080/coq-js/jscoq_worker.bc.js:151709)
at parse_flags (:8080/coq-js/jscoq_worker.bc.js:151765)
at set_flags (:8080/coq-js/jscoq_worker.bc.js:151783)
at create$31 (:8080/coq-js/jscoq_worker.bc.js:151859)
that should be easy to patch by myself.
With that patched, jsCoq demo file seems to work fine!
Rebased on top of the new Str support.
I share @Drup's curiosity with thrill. Do you have a design doc somewhere that describes the move and its implications ? The JavaScript vs OCaml strings has been an annoying technical point of jsoo when designing OCaml APIs that interact with the browser APIs.
Does it mean that if I have a value of type string
it is represented by a JavaScript string and I can give it to the DOM apis as is ? And how are string literals treated ?
Does it mean that if I have a value of type string it is represented by a JavaScript string and I can give it to the DOM apis as is ? And how are string literals treated ?
In that mode, an ocaml string will be represented as a javascript string containing a sequence of bytes. It would only be safe to pass it to the dom as is for ascii strings. In the general case, you'll still need to convert the string to utf16.
I will update the feature description to explain the design and the plan moving forward.
In short:
- A previous PR https://github.com/ocsigen/js_of_ocaml/pull/923 made changing strings representation easier by cleaning up the runtime
- The alternative string representation is currently a plain js string containing bytes
- The plan is to run multiple bench with different implementations, eventually updating the implementation of
string.ml
and take a decision based on the numbers.
Okay so the dichotomy remains. But I guess given the nature of OCaml strings -- namely a sequence of bytes -- there's not much choice.
one more step, num will be compatible with that change https://github.com/ocaml/num/issues/21
At least #984 have been merged.
Is there any bench somewhere for this?
https://ocsigen.org/js_of_ocaml/latest/manual/performances was updated with some numbers.
Great! Thanks for the pointer!