Allow the same .stanfunctions file to be `#include`d multiple times
This came up at StanCon as a common pain point when trying to organize groups of functions.
Suppose you have functions foo, bar, and baz living in files of the same names. foo is used by bar and by baz.
If you #include bar.stan and #include foo.stan in the same file, then you will end up with two (transitive) includes of foo.stan, and this will lead to a typechecker error for re-definition of the function foo.
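To make the failure concrete, here is a minimal Python sketch of purely textual `#include` expansion. It is not stanc's actual preprocessor; the file contents and the `expand` helper are hypothetical and only illustrate how `foo` ends up defined twice in the single expanded unit.

```python
import re

# Hypothetical in-memory "file system"; the file contents are illustrative.
FILES = {
    "foo.stan": "real foo(real x) { return x; }",
    "bar.stan": "#include foo.stan\nreal bar(real x) { return foo(x) + 1; }",
    "model.stan": "functions {\n#include bar.stan\n#include foo.stan\n}",
}

INCLUDE = re.compile(r"^\s*#include\s+(\S+)\s*$")


def expand(name):
    """Recursively paste each included file in place of its #include line."""
    out = []
    for line in FILES[name].splitlines():
        m = INCLUDE.match(line)
        if m:
            out.extend(expand(m.group(1)))
        else:
            out.append(line)
    return out


expanded = expand("model.stan")
# foo is pasted once via bar.stan and once via the direct include.
foo_definitions = [line for line in expanded if line.startswith("real foo(")]
print(len(foo_definitions))
```

The typechecker only ever sees `expanded`, so from its point of view the second copy of `foo` is an ordinary re-definition.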
Some solutions:
- Add something like `#pragma once` in C++. This is the most explicit version, but it is probably my least favorite since it introduces a whole new verbiage.
- Automatically de-duplicate files that end with `.stanfunctions`. Note that we don't want to deduplicate arbitrary `#include` statements, since the repeated includes may not actually live in the same scope, in which case the duplication would be okay. Restricting this to the `.stanfunctions` file extension ensures it only does something where it matters. Downside: it introduces a semantic difference in the language based on the file extension used.
- Allow duplicate function definitions if both definitions are from the same ultimate file. This would require some extra location logic in the typechecker.
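As a rough illustration of the second and third options, the Python sketch below tracks which file each expanded line came from. Everything here is an assumption made for the example (the file contents, the crude `DEFINITION` regex, and the `expand` and `redefinition_errors` helpers); it is not how stanc is implemented.

```python
import re

# Hypothetical file contents, for illustration only.
FILES = {
    "foo.stanfunctions": "real foo(real x) { return x; }",
    "bar.stanfunctions": (
        "#include foo.stanfunctions\n"
        "real bar(real x) { return foo(x) + 1; }"
    ),
    "model.stan": (
        "functions {\n"
        "#include bar.stanfunctions\n"
        "#include foo.stanfunctions\n"
        "}"
    ),
}
INCLUDE = re.compile(r"^\s*#include\s+(\S+)\s*$")
DEFINITION = re.compile(r"^\s*\w+\s+(\w+)\s*\(")  # crude: "<type> <name>("


def expand(name, seen=None):
    """Yield (origin_file, line) pairs for the expanded program.

    If `seen` is a set, a .stanfunctions file is pasted only the first
    time it is reached (the de-duplicate-by-extension option).
    """
    for line in FILES[name].splitlines():
        m = INCLUDE.match(line)
        if not m:
            yield name, line
            continue
        target = m.group(1)
        if seen is not None and target.endswith(".stanfunctions"):
            if target in seen:
                continue
            seen.add(target)
        yield from expand(target, seen)


def redefinition_errors(lines):
    """Same-ultimate-file option: a repeated name is an error only if
    the two definitions originate from different files."""
    origin, errors = {}, []
    for file, line in lines:
        m = DEFINITION.match(line)
        if not m:
            continue
        fname = m.group(1)
        if origin.setdefault(fname, file) != file:
            errors.append(fname)
    return errors


plain = list(expand("model.stan"))                # foo appears twice
deduped = list(expand("model.stan", seen=set()))  # foo appears once
print(redefinition_errors(plain))                 # same origin, so no error
```

With de-duplication the second copy never reaches the typechecker; with the location check it does, but it is tolerated because both copies trace back to `foo.stanfunctions`. A `foo` defined in a different file would still be reported.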
> Add something like `#pragma once` in C++. This is the most explicit version, but it is probably my least favorite since it introduces a whole new verbiage
Is there a reason we need something like `#pragma once` rather than just using `#pragma once` directly? We could just do ifdefs for `.stanfunctions` files.
Mostly on a design level, I think adding more stuff from the C/C++ preprocessor would be a mistake
Do you mean adding preprocessor stuff to the AST/MIR? What if we just had info for whether a file was a `.stanfunctions` file? Then the C++ MIR can use that for adding the pragma, but any other backend could just ignore that info if it did not want to use it.
Oh I misunderstood, I was saying I didn't want to introduce #pragma once into the user-facing Stan language.
We can't use it during actual code gen because, in the end, we're only generating one file. We do the preprocessing ourselves to end up with one unit, which we need for typechecking etc. anyway.
The ideal solution would be to design a proper structure-aware import system and deprecate "C preprocessor" style textual #includes.
For example, let's say `import "otherfile.stan";` at the start of a file parses "otherfile.stan" and (effectively) prepends every block to the corresponding block in this file, without duplicating files that are imported in multiple ways. Would that cover all or the majority of use cases for `#include`?
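A minimal Python sketch of what such an import could do, assuming each file has already been parsed into its imports and its blocks. The `PARSED` layout and the `resolve` helper are hypothetical and only illustrate the "prepend each block, visit each file once" idea.

```python
# Hypothetical pre-parsed files: each has a list of imports and a mapping
# from block name to the statements in that block.
PARSED = {
    "foo.stan": {
        "imports": [],
        "blocks": {"functions": ["real foo(real x) { return x; }"]},
    },
    "bar.stan": {
        "imports": ["foo.stan"],
        "blocks": {"functions": ["real bar(real x) { return foo(x) + 1; }"]},
    },
    "model.stan": {
        "imports": ["bar.stan", "foo.stan"],
        "blocks": {"functions": [], "parameters": ["real mu;"]},
    },
}


def resolve(name, visited=None, merged=None):
    """Merge blocks depth-first, visiting each file at most once."""
    visited = set() if visited is None else visited
    merged = {} if merged is None else merged
    if name in visited:
        return merged  # already imported through another path
    visited.add(name)
    # Imported blocks are merged first, so they end up before this
    # file's own content in each block.
    for dep in PARSED[name]["imports"]:
        resolve(dep, visited, merged)
    for block, statements in PARSED[name]["blocks"].items():
        merged.setdefault(block, []).extend(statements)
    return merged


program = resolve("model.stan")
print(program["functions"])
```

Because the merge works on parsed blocks rather than text, `foo.stan` contributes its `functions` block exactly once even though it is reachable both directly and through `bar.stan`.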
I think a fair number of use cases in R packages use includes as essentially a crude macro system to duplicate a chunk of code in several places in the same file, which would obviously not be something an import could do. But overall I do like the idea of moving away from text-based preprocessing