js_of_ocaml icon indicating copy to clipboard operation
js_of_ocaml copied to clipboard

Use javascript string by default

Open hhugo opened this issue 4 years ago • 21 comments

In that mode, an OCaml string will be represented as a JavaScript string containing a sequence of bytes. Converting the string to utf16 is still required for interaction with the DOM, or other javascript API.

This current alternative representation of strings (implemented in #923) is not set in stone and could be changed based on benchmarks and usability.

Here are a bunch of things to consider:

  • Wrap JavaScript string inside an object (MlString) to ease debugging.
  • Update string.ml to make more optimization possible (String.sub, (^), ..)
  • Add more optimization in jsoo itself https://github.com/ocsigen/js_of_ocaml/pull/977

The PR is useful to check that existing program don't make any assumption about the internal representation of strings.

  • [x] Test JaneStreet libraries
  • [x] Test jscoq
  • [x] Test the jsoo toplevel
  • [x] Merge #984
  • [ ] Benchmark the two representations

hhugo avatar Apr 03 '20 20:04 hhugo

I know this is WIP, but I'm curious. What would be the consequence for users, notably in term of FFI (and performances) ?

Drup avatar Apr 10 '20 16:04 Drup

This is not really a WIP anymore. It's being tested now.

There should be no implication for FFI. One still need to use existing primitive/stubs to convert between js (utf16) and ocaml (utf8) strings.

I would expect some performances improvement but I didn't run any bench yet.

hhugo avatar Apr 10 '20 16:04 hhugo

Hmm, maybe I misunderstood then ! If it's neither for FFI nor perf, what's the goal ? :)

Drup avatar Apr 10 '20 17:04 Drup

It would be for perf ! We still need to run some bench.

hhugo avatar Apr 10 '20 17:04 hhugo

(#655 contains all the logic and I believe does some nice cleanup)

hhugo avatar Apr 10 '20 17:04 hhugo

@ejgallego, would you be able to test jscoq against this GPR ?

hhugo avatar Apr 18 '20 17:04 hhugo

@ejgallego, would you be able to test jscoq against this GPR ?

Hi @hhugo , sorry I missed this. Yes, we can test soon.

ejgallego avatar Apr 22 '20 23:04 ejgallego

@hhugo I get "caml_jsstring_of_string not implemented", I was able to fix that by adding stuff in mlbytes and bigarray.js manually; I got a bit far but still seeing some problems in the de-serialization of objects, likely due to the hacks we have there.

ejgallego avatar Apr 23 '20 15:04 ejgallego

I don't understand why you get "caml_jsstring_of_string not implemented". Have you pin both js_of_ocaml and js_of_ocaml-compiler to this PR ?

hhugo avatar Apr 23 '20 15:04 hhugo

I did pin a bunch of libs, let me re-pin everything again, indeed maybe I forgot to pin -compiler

ejgallego avatar Apr 23 '20 16:04 ejgallego

Ok, now I got:

Uncaught TypeError: Cannot read property 'charCodeAt' of undefined

in Coq itself. Will switch to debug mode and research more.

ejgallego avatar Apr 23 '20 17:04 ejgallego

I can help you if you need. Note that marshal on bytes will not work properly.

hhugo avatar Apr 23 '20 17:04 hhugo

https://github.com/jscoq/jscoq/blob/v8.11/coq-js/js_stub/str.js#L63 (.c) is possibly incorrect

hhugo avatar Apr 23 '20 17:04 hhugo

Very well spotted indeed (cc: @corwin-of-amber )

Uncaught TypeError: Cannot read property 'charCodeAt' of undefined
    at re_match_impl (:8080/coq-js/jscoq_worker.bc.js:7000)
    at re_string_match (:8080/coq-js/jscoq_worker.bc.js:7783)
    at string_match$0 (:8080/coq-js/jscoq_worker.bc.js:125117)
    at bounded_split$0 (:8080/coq-js/jscoq_worker.bc.js:125343)
    at split$5 (:8080/coq-js/jscoq_worker.bc.js:125393)
    at split_flags (:8080/coq-js/jscoq_worker.bc.js:151618)
    at flags_of_string (:8080/coq-js/jscoq_worker.bc.js:151709)
    at parse_flags (:8080/coq-js/jscoq_worker.bc.js:151765)
    at set_flags (:8080/coq-js/jscoq_worker.bc.js:151783)
    at create$31 (:8080/coq-js/jscoq_worker.bc.js:151859)

that should be easy to patch by myself.

ejgallego avatar Apr 23 '20 18:04 ejgallego

With that patched, jsCoq demo file seems to work fine!

ejgallego avatar Apr 23 '20 18:04 ejgallego

Rebased on top of the new Str support.

hhugo avatar Apr 25 '20 11:04 hhugo

I share @Drup's curiosity with thrill. Do you have a design doc somewhere that describes the move and its implications ? The JavaScript vs OCaml strings has been an annoying technical point of jsoo when designing OCaml APIs that interact with the browser APIs.

Does it mean that if I have a value of type string it is represented by a JavaScript string and I can give it to the DOM apis as is ? And how are string literals treated ?

dbuenzli avatar Apr 25 '20 12:04 dbuenzli

Does it mean that if I have a value of type string it is represented by a JavaScript string and I can give it to the DOM apis as is ? And how are string literals treated ?

In that mode, an ocaml string will be represented as a javascript string containing a sequence of bytes. It would only be safe to pass it to the dom as is for ascii strings. In the general case, you'll still need to convert the string to utf16.

I will update the feature description to explain the design and the plan moving forward.

In short:

  • A previous PR https://github.com/ocsigen/js_of_ocaml/pull/923 made changing strings representation easier by cleaning up the runtime
  • The alternative string representation is currently a plain js string containing bytes
  • The plan is to run multiple bench with different implementations, eventually updating the implementation of string.ml and take a decision based on the numbers.

hhugo avatar Apr 25 '20 12:04 hhugo

Okay so the dichotomy remains. But I guess given the nature of OCaml strings -- namely a sequence of bytes -- there's not much choice.

dbuenzli avatar Apr 25 '20 13:04 dbuenzli

one more step, num will be compatible with that change https://github.com/ocaml/num/issues/21

hhugo avatar Nov 09 '20 17:11 hhugo

At least #984 have been merged.

smorimoto avatar Jan 03 '22 14:01 smorimoto

Is there any bench somewhere for this?

smorimoto avatar Feb 03 '23 15:02 smorimoto

https://ocsigen.org/js_of_ocaml/latest/manual/performances was updated with some numbers.

hhugo avatar Feb 03 '23 15:02 hhugo

Great! Thanks for the pointer!

smorimoto avatar Feb 03 '23 15:02 smorimoto