dune
dune copied to clipboard
Add include of generated files
This branch tries to add features needed for #3901 . Here my first difficulties:
- The source tree is context independent, e.g what is a vendored directory doesn't depend on the
Context.t
.- I think it is a good thing, but it means that generated included files should not include
(vendored ...)
stanza since the file is generated in a context
- I think it is a good thing, but it means that generated included files should not include
- The parsing of all the dune files is done in one call
Dune_load.load
for all the directories. So the generation of the included files creates a cycles with all the dune files, so with the inclusion of their content. Two possibilities:- Separate the loading of the dune files by directory.
- Include the files later, for example when the rules are evaluated, it could restrict even more the possible stanza but it would change less things.
Currently instead of extending the (include file)
stanza I added (include_generated file)
in order to simplify the developement of the feature. And for the user it helps to separate the two cases.
Currently instead of extending the (include file) stanza I added (include_generated file) in order to simplify the developement of the feature. And for the user it helps to separate the two cases.
That seems reasonable. Perhaps even guard the feature under a (using include_generated 0.1)
in the dune-project
file? So that we don't have to commit to stability and versioning too early.
I'm going to dissent against having a separate stanza. There's a couple of reasons why I don't like it:
- We aren't offering anything new conceptually, only removing an existing limitation. Why should we have 2 separate stanzas for the same construct in the language?
- It is a bad UI for end users. Now they'll need to churn their dune files from include generated sources.
Perhaps even guard the feature under a (using include_generated 0.1) in the dune-project file? So that we don't have to commit to stability and versioning too early.
This can be done without a separate stanza. It seems simple to guard the feature behind a Path.Source.is_in_source_tree
check when it's disabled.
This can be done without a separate stanza. It seems simple to guard the feature behind a
Path.Source.is_in_source_tree
check when it's disabled.
I agree with these arguments.
Do you still prefer the same stanza if one accept to include stanzas that the other do not? If people prefer the same stanza, I will move to it once the other questions are resolved.
Currently I'm a little at loss to find the right place or right way to break the circular dependency. Before Super_context
would be great so that the include_generated
is transparent, but the rules are generated by directory later. So it should be in Gen_rules
but we already filtered stanzas, the stanzas are provided in the API of Super_context
, and there is the add stanza in subdirectory. Do you have some ideas?
Do we have a mode in the build system, where the need to build something would be an error? It is for dune install
it would need to check that includes generated are up-to-date but not build them. However stanzas that install are not the priority of #3901 so they can be forbidden in generated includes.
You might need to ignore include_generated
(or just include
with a missing source file if we use the same stanza) during the first loading stage. And only evaluate and interpret it in gen_rules.ml
when we actually produce the rules for the directory.
Eventually, we should break down Super_context
BTW. This thing is ugly and pre-dates Memo
.
Ok I did nearly that but I had to touch all the stanzas_in
related stuff because it becames Memo.Lazy.t
. It should be for the better. And it works sufficiently well for #3901 (tested with rules and local executable).
The generated dune
file can't add things to the library database and artefact database (public executables).
A note, in main branch include
stanza are interpreted in Sub_dirs
(for plain dune file) and in Dune_file
(only for OCaml code dune file). The only remaining advantage of the last one are that their generated stanza can be part of the databases.
Good for review
The generated
dune
file can't add things to the library database and artefact database (public executables).
Maybe not in this PR, but do you see a way to lift the restriction about libraries? This is one of the use-cases I had in mind for this feature...
Lifiting the restriction for libraries is tricky for the implementation, but already at an high level there are traps. Indeed the set of available library would potentially change at each new dune file parsed. So it could change the meaning of %{lib-available:foo}
. So such replacement and other optionality tests would wait until all the dune file are parsed to return false, but could return true
as soon as the library is available. So it could be possible but making such library DB would be a great feat.
There are other questions:
- include a file generated from a subdirectory
(include subdir/generated-dune)
is exclusive with env stanza present in generated dune file. Indeed the generation insubdir
ofgenerated-dune
need an environment which is computed by taking into account all the environnement stanza present in parents directory. - subdir stanza can't be present in generated dune file because it is currently computed quite early, so we can't automatically generate rules in a tests hierarchy not statically known (promotion can still be used for this use case). At an high level point of view the difficulty is to know when all the stanza of a directory are known. We could force the subdirectories to wait for parents directory. It would forbid even more the
(include subdir/generated-dune)
. -
Dir_status
has the same difficulty than environments
The summary is that Super_context
and before (e.g Subdir.t
) are still too much not Memoized for having a clear picture but the high level view is that the generation of the stanzas of a directory depends on the generation of the stanzas of parents directory and only (include ../generated-dune)
are acceptable.
Personally, I find some things to be quite confusing about this PR:
-
The interaction between include and subdirs. Is it ok to generate subdir stanzas? Is it ok to have generated include in a subdir stanza?
-
The distinction between parse and parse_generated. Why is this necessary? I would expect a single code path.
-
How does this work with
-p
?
Even before this PR, the code had some serious issues IMHO:
- The handling of
include
is hacky and must be duplicated - Untracked IO is done on source file.
I don't think we need to solve all the technical problems to make this experimental feature complete, but we should at least not make the current code even more difficult to understand.
Personally, I find some things to be quite confusing about this PR:
I agree that the scope of what we can do with generated includes is not very clear and very implementation dependent. Mainly because the first parts of the handling of dune file are not Memo aware, and so are monolithic.
The interaction between include and subdirs.
Is it ok to generate subdir stanzas?
No it will not work because subdir stanzas are handled before, for all the hierarchy at the same time. Perhaps reworking it like source_tree
and env
where the computation of the sub-directories just depends on the parents directory would solve most of the problems.
Is it ok to have generated include in a subdir stanza?
Yes
The distinction between parse and parse_generated. Why is this necessary? I would expect a single code path.
I did that at first, but since the first part of parsing is monolithic, it was complicating things without any advantage. I preferred not to complicate this part before an overall.
How does this work with -p?
If an executable needed for generating an include file is marked for a package not selected, the generation will fail. Usually such generation are done using private binaries. It is an additional limitation. We could add (package ...)
to include in order to limit the generation when the package is selected, but for another PR it is very minor.
Even before this PR, the code had some serious issues IMHO:
I think an overall is needed but I need inputs from the authors to understand if I understand it correctly.
Meeting notes:
- The current approach offer too much restrictions, adding some limited scope and staging would be better
- Prefered approach:
- Add a
(scoped_dir (dir <path>) (dune_file <path>) (exports (library <name>) (executable <name>)(package <name>))
- The directory
dir
is not traversed until thedune_file
has been generated and parsed - With
dune_file
taken into accountdir
must define thelibrary
andexecutable
declared exported, and could define install stanza forpackage
.
- The directory
- Add a
@rgrinberg @jeremiedimino Source_tree doesn't depend on the context (or build_context), right? So it seems that adding the support of scoped_dirs
which depends on the build_context since the generation of the file depends on it requires more changes if done in Source_tree
. Do you have some advice?
EDIT: Why does Build_system depends on Source_tree?
@rgrinberg @jeremiedimino Source_tree doesn't depend on the context (or build_context), right?
That's right.
So it seems that adding the support of scoped_dirs which depends on the build_context since the generation of the file depends on it requires more changes if done in Source_tree. Do you have some advice?
The new feature I'm working on should help (see my comment on slack). In the meantime, I suggest to focus on scoped dirs that don't depend on the build context.
@christinerose we decided to abandon the feature in its current form, so it's probably not worth reviewing the current PR.
@bobot why do you want the scoped_dir
status to depend on the build context BTW? Whether a directory is a scoped directory or not seems more like a universal property to me.
EDIT: Why does Build_system depends on Source_tree?
For various reasons. For instance, some internal behaviours of Dune are tied to the (lang dune x.y)
version, and this version is obtained via the source tree. Though it doesn't have to: it could be returned by gen_rules
or another callback.
why do you want the scoped_dir status to depend on the build context BTW?
The generation of the files used by scoped_dir
depends on the context, so since scoped_dir
would allow to generate full featured dune_files. The presence of the stanza scoped_dir
but also vendor
, data_dir
stanzas in a scoped directory depends on the context.
, I suggest to focus on scoped dirs that don't depend on the build context.
I can try to start with an implementation of scoped_dir which disallow stanzas that change the status in the scope directory (= the attempt with include generated_dune_file
+ subdir stanza).
That makes sense. To be fair, it does feel like we are stuffing Source_tree
too much. Things like the vendored status or cram tests don't have much to do with Source_tree
. Maybe even the dune-project
. All this could be part of src/dune_rules
.
Indeed, the context independent and context dependent information should be separated inside (or from) Source_tree
. I'm wondering if the context independent information is just Fs_cache
.
@christinerose we decided to abandon the feature in its current form, so it's probably not worth reviewing the current PR.
Should I dismiss the pending reviews then, or is this for new changes?
@christinerose thank you for your review, you do not have to dismiss them. I will use your review for the documentation of the new feature.
Great, @bobot. Thanks!
Indeed, the context independent and context dependent information should be separated inside (or from) Source_tree. I'm wondering if the context independent information is just Fs_cache.
I think you mean Fs_memo
. Fs_cache
is a lower-level layer used to implement Fs_memo
. But agreed that Source_tree
might eventually fizzle out. Or at least move to dune_rules
with a slight different API, since most of it is now specific to the Dune rules.
@bobot can you rebase? I'd like to take another look at this PR. I think it should be somewhat cleaner now that we've removed unmemoized file access
@bobot ping, are you planning to make progress with this?