ocaml icon indicating copy to clipboard operation
ocaml copied to clipboard

Use an opaque abstract type for Dynarray.Dummy.with_dummy

Open bclement-ocp opened this issue 6 months ago • 12 comments

The internal ('a, 'stamp) with_dummy type used by the Dynarray module since #12885 is currently defined as a type alias to the internal 'a type. While convenient, this is a lie, as the ('a, 'stamp) with_dummy type can also contain dummies.

In particular, this is telling (through types) the compiler that an (int, 'stamp) with_dummy array can only contain immediates. This could in theory allow the compiler to perform optimisations such as:

  • Remove the is_dummy checks for these arrays (since we also know through types that dummies are never immediates) ;

  • Remove calls to caml_modify when writing to these arrays

While I don't think these optimisations can currently happen, this patch uses an abstract type for with_dummy so as to prevent issues if they ever start happening in the future.

It also fixes an issue in unsafe_nocopy_to_array for floats where we call unsafe_get before checking for the dummy.

bclement-ocp avatar Jun 13 '25 10:06 bclement-ocp

It also fixes an issue in the (unused) unsafe_nocopy_to_array for floats where we call unsafe_get before checking for the dummy.

I said it's not used, but that's because I removed unsafe_to_iarray (the compiler I had handy did not have iarray). The following bit of code should raise an "unsynchronized concurrent length change" error (because there is a dummy in the array), but doesn't on current trunk, because the dummy is stored in a flat float array before checking it.

type dummy = < >

type 'a t = {
  mutable length : int;
  mutable arr : 'a array;
  dummy : dummy
}

external unsafe_cast : 'a Dynarray.t -> 'a t = "%opaque"

let _ =
  Dynarray.unsafe_to_iarray ~capacity:2 (fun arr ->
    Dynarray.add_last arr 7.0;
    Dynarray.add_last arr (Obj.magic (unsafe_cast arr).dummy)
  )

(Happy to extract this change in a separate PR if requested)

bclement-ocp avatar Jun 13 '25 11:06 bclement-ocp

I don't understand anymore why Dummy.Array.blit does not check for dummies in the source array. This does not seem correct? (Memory-safety is lost if two different dummies end up in the same array; I would assume that blit not checking its input could result in that when the dummies of both arrays are different.) Can you either reassure me that this is correct, or fix the code by adding checks in the different-dummies blit case?

gasche avatar Jun 13 '25 14:06 gasche

I don't understand anymore why Dummy.Array.blit does not check for dummies in the source array. This does not seem correct? (Memory-safety is lost if two different dummies end up in the same array; I would assume that blit not checking its input could result in that when the dummies of both arrays are different.) Can you either reassure me that this is correct, or fix the code by adding checks in the different-dummies blit case?

I wondered the same thing. Dummy.Array.blit is only called by blit_assume_room, and there is this comment that convinced me it was OK earlier:

  (* The caller of [blit_assume_room] typically calls
     [ensure_capacity] right before. This could run asynchronous
     code. We want to fail reliably on any asynchronous length change,
     as it may invalidate the source and target ranges provided by the
     user. So we double-check that the lengths have not changed.  *)

but thinking about it again, I am not entirely sure this actually guarantees that there are no dummies in the source region (if the source array got corrupted earlier). I think adding checks in the different-dummies blit case is the safe option, especially considering that it can only occur with marshalling.

bclement-ocp avatar Jun 13 '25 14:06 bclement-ocp

(My current review state is that I'm waiting to see if @bclement-ocp wants to fix Dummy.Array.blit as part of this PR.)

gasche avatar Jun 18 '25 19:06 gasche

Thanks for the clarification — I was waiting for you to confirm the comment on blit_assume_room was not enough to ensure memory safety.

bclement-ocp avatar Jun 19 '25 07:06 bclement-ocp

(My current review state is that I'm waiting to see if @bclement-ocp wants to fix Dummy.Array.blit as part of this PR.)

Done.

bclement-ocp avatar Jun 19 '25 08:06 bclement-ocp

The following tests are failing:

List of failed tests:
    tests/lib-bigarray-2/bigarrcml.ml
    tests/statmemprof/bigarray.ml

but the errors don't seem related to this PR:

> Action 3/5 (check-ocamlc.byte-output) => failed (The file D:\a\ocaml\ocaml\testsuite\_ocamltest\tests/statmemprof\bigarray\ocamlc.byte\ocamlc.byte.output was expected to be empty because there is no reference file D:\a\ocaml\ocaml\testsuite\tests\statmemprof\bigarray.compilers.reference but it is not:
> ========================================
> bigarray_stubs.c(8): warning C5287: operands are different enum types 'caml_ba_kind' and 'caml_ba_layout'; use an explicit cast to silence this warning

> bigarray_stubs.c(15): warning C5287: operands are different enum types 'caml_ba_kind' and 'caml_ba_layout'; use an explicit cast to silence this warning

> bigarray_stubs.c(22): warning C5287: operands are different enum types 'caml_ba_kind' and 'caml_ba_layout'; use an explicit cast to silence this warning

bclement-ocp avatar Jun 19 '25 09:06 bclement-ocp

Indeed, we are blessed with a new MSVC warning, see my investigation at https://github.com/ocaml/ocaml/pull/14059#issuecomment-2985738799 and a PR in progress at https://github.com/ocaml/ocaml/pull/14097 .

gasche avatar Jun 19 '25 14:06 gasche

(My understanding again is that we are waiting for @bclement-ocp to tweak the PR slightly, which is fine with me.)

gasche avatar Aug 25 '25 14:08 gasche

(gentle ping: I would be happy to be done with this PR, which improves the codebase)

gasche avatar Oct 30 '25 08:10 gasche

I have pushed a new version taking into account your comments. Regarding the naming of unsafe_XXX functions, I went in a different direction: they are still prefixed with unsafe_, but now come with documented safety conditions and better names.

For instance, your question:

what is the difference between unsafe_nocopy_from_array and unsafe_from_array, and why do we have two functions that have morally the same name?

now becomes:

what is the difference between unsafe_nocopy_from_array and unsafe_nocopy_from_non_float_array, and why do we have two functions that have morally the same name?

and I think the answer is now clear (one does not work for arrays of floats, and is used in the implementation of the other one that also works for floats).

I have also removed the subtle logic around using "%identity" where possible, and instead explicitly use "%opaque" everywhere. As you point out, it makes the code harder understand and modify, and I don't think we actually make use of the additional optimisation opportunities — we can revisit this if we someday identify optimisation opportunities that we want to exploit here.

bclement-ocp avatar Oct 30 '25 13:10 bclement-ocp

I'm not sure what you want to do with the git history. My gut feeling is that it is not good enough now, so you should either clean it up or we can just squash into a single commit at merge-time.

gasche avatar Nov 27 '25 14:11 gasche