codept icon indicating copy to clipboard operation
codept copied to clipboard

Avoid modulizing filename when reading file

Open jonahbeckford opened this issue 3 months ago • 3 comments

The original behavior was to modulize the filename in Read.file as a convenience to the caller. Then the caller Unit.read_file would ignore the modulize filename.

This behavior is a bug because the codept user may have supplied a valid Namespaced.t for the file. The result would be:

 Invalid_argument("Impossible to modulize \"_init\" (from \"_init.ml\")")
 Raised at Stdlib.invalid_arg in file "stdlib.ml", line 30, characters 20-45
 Called from Unitname.modulize in file "lib/unitname.ml", line 16, characters 2-109
 Called from Read.file in file "lib/read.ml", line 49, characters 13-26
 Called from Unit.read_file in file "lib/unit.ml", line 49, characters 20-43
  1. Now a new Read.file_raw function is available that avoids the modulizing of the filename.

  2. These callers do not need the modulizing so they have switched to Read.file_raw:

  • Unit.read_file
  • Single.to_m2l

Testing. Instead of using the default Io.direct : Io.t which uses reader.m2l = Unit.read_file (which uses Pkg.local), I have a custom version of Io.t that uses Read.file_raw but doesn't use Pkg.local. I have to avoid Pkg.local because that also modulizes the filename, and I don't have an easy fix for that.

jonahbeckford avatar Mar 26 '24 20:03 jonahbeckford

It does sound like there is something fishy going on, I would try to find the time to think about the issue this week.

Octachron avatar Apr 01 '24 19:04 Octachron

It comes from this line: https://github.com/Octachron/codept/blob/36bbd99fc88bf91fc25c052c72e8d96ed233ba5e/lib/unitname.ml#L14-L15

It seems that _ is not really allowed as the first character of a module name:

$ touch _init.ml
$ ocamlopt _init.ml
File "_init.ml", line 1:
Warning 24 [bad-module-name]: bad source file name: "_init" is not a valid module name.

dinosaure avatar Apr 15 '24 13:04 dinosaure

Yes, I understand that _init can't be used in a module name. That is why I was using it! I have special filenames that I never want confused for regular OCaml modules.

codept lets us supply a Namespaced.t that is completely distinct from the filename (at least that is what the library API implies). My example would be:

SomeCode_Std/
   ├──   _init.ml   | Namespaced.t = SomeCode_Std.Init'
   └──   a.ml       | Namespaced.t = SomeCode_Std.A

From earlier ...

This behavior is a bug because the codept user may have supplied a valid Namespaced.t for the file.

Yes, I have a valid Namespaced.t that I want to pair with _init.ml.

One "solution" would be for the Read.file to ask for the Namespaced.t. But Read.file never needed the module name in the first place. More generally, nothing should be invoking lib/unitname.ml when a Namespaced.t is available.

(I can live with using valid OCaml filenames ... it is not ideal but it is not the end-of-the world. But this PR/issue is a signal that codept is doing unnecessary work.)

jonahbeckford avatar Apr 17 '24 04:04 jonahbeckford