stanc3

Allow the same .stanfunctions file to be `#include`d multiple times

Open WardBrian opened this issue 1 year ago • 6 comments

This came up at StanCon as a common pain point when trying to organize groups of functions.

Suppose you have functions foo, bar, and baz living in files of the same names. foo is used by bar and by baz.

If you #include bar.stan and #include foo.stan in the same file, then you will end up with two (transitive) includes of foo.stan, and this will lead to a typechecker error for re-definition of the function foo.
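The failure can be reproduced with a toy model of textual inclusion. The sketch below is Python, not stanc3's actual preprocessor: the `FILES` table, the `expand` helper, and the file contents are all hypothetical, and `bar.stan` is assumed to include `foo.stan` itself.

```python
import re

# Hypothetical in-memory files standing in for files on disk.
FILES = {
    "foo.stan": "real foo(real x) { return 2 * x; }",
    "bar.stan": "#include foo.stan\nreal bar(real x) { return foo(x) + 1; }",
    "main.stan": "functions {\n#include bar.stan\n#include foo.stan\n}",
}

INCLUDE = re.compile(r"^\s*#include\s+(\S+)\s*$")


def expand(name):
    """Naively splice every #include in place, as a purely textual preprocessor would."""
    out = []
    for line in FILES[name].splitlines():
        m = INCLUDE.match(line)
        if m:
            out.extend(expand(m.group(1)))
        else:
            out.append(line)
    return out


expanded = expand("main.stan")
foo_definitions = [line for line in expanded if line.startswith("real foo(")]
print(len(foo_definitions))  # 2: foo is defined twice in the single expanded unit
```

Because the typechecker only ever sees the expanded unit, the second copy of `foo` looks like an ordinary re-definition.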

Some solutions:

  • Add something like #pragma once in C++. This is the most explicit version, but it is probably my least favorite since it introduces a whole new verbiage

  • Automatically de-duplicate files that end with .stanfunctions. Note that we don't want to de-duplicate arbitrary #include statements, since the repeated includes may not live in the same scope, in which case the repetition is fine. Restricting this to the .stanfunctions file extension ensures it only takes effect where it matters. Downside: it introduces a semantic difference in the language based on the file extension used.

  • Allow duplicate function definitions if both definitions are from the same ultimate file. This would require some extra location logic in the typechecker.
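As a rough illustration of the second option, here is a minimal Python sketch of an include expander that skips repeat includes only for `.stanfunctions` files. It is not stanc3 code: `FILES`, `expand`, and the file names are hypothetical, and a real implementation would presumably compare resolved paths rather than the raw names used here.

```python
import re

# Hypothetical in-memory files; names and contents are illustrative only.
FILES = {
    "foo.stanfunctions": "real foo(real x) { return 2 * x; }",
    "bar.stanfunctions": (
        "#include foo.stanfunctions\n"
        "real bar(real x) { return foo(x) + 1; }"
    ),
    "snippet.stan": "y ~ normal(mu, sigma);",
    "main.stan": (
        "functions {\n#include bar.stanfunctions\n#include foo.stanfunctions\n}\n"
        "model {\n#include snippet.stan\n#include snippet.stan\n}"
    ),
}

INCLUDE = re.compile(r"^\s*#include\s+(\S+)\s*$")


def expand(name, seen=None):
    """Expand includes textually, but splice each .stanfunctions file at most once."""
    seen = set() if seen is None else seen
    out = []
    for line in FILES[name].splitlines():
        m = INCLUDE.match(line)
        if not m:
            out.append(line)
            continue
        target = m.group(1)
        if target.endswith(".stanfunctions"):
            if target in seen:
                continue  # already spliced once; skip the repeat
            seen.add(target)
        # Any other extension is always spliced, so macro-style reuse still works.
        out.extend(expand(target, seen))
    return out


expanded = expand("main.stan")
foo_count = sum(line.startswith("real foo(") for line in expanded)
snippet_count = expanded.count("y ~ normal(mu, sigma);")
print(foo_count, snippet_count)  # 1 2
```

The function file is spliced once, while the ordinary `.stan` snippet is still repeated, which is exactly the extension-dependent behavior noted as the downside.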

WardBrian avatar Sep 23 '24 15:09 WardBrian

Add something like #pragma once in C++. This is the most explicit version, but it is probably my least favorite since it introduces a whole new verbiage

Is there a reason we need something like #pragma once rather than just using #pragma once directly? We could just do ifdefs for .stanfunctions files.

SteveBronder avatar Sep 24 '24 20:09 SteveBronder

Mostly on a design level, I think adding more stuff from the C/C++ preprocessor would be a mistake.

WardBrian avatar Sep 24 '24 21:09 WardBrian

Do you mean adding preprocessor stuff to the AST/MIR? What if we just had info for whether a file was a .stanfunctions file? Then the C++ MIR can use that for adding the pragma, but any other backend could just ignore that info if it did not want to use it.

SteveBronder avatar Sep 25 '24 14:09 SteveBronder

Oh, I misunderstood. I was saying I didn't want to introduce #pragma once into the user-facing Stan language.

We can't use it during actual code gen because, in the end, we're only generating one file. We do the preprocessing ourselves to end up with one unit, which we need for typechecking etc. anyway.

WardBrian avatar Sep 25 '24 16:09 WardBrian

The ideal solution would be to design a proper structure-aware import system and deprecate "C preprocessor" style textual #includes.

For example, let's say

import "otherfile.stan";

at the start of a file parses "otherfile.stan" and (effectively) prepends every block to the corresponding block in this file, without duplicating files that are imported in multiple ways. Would that cover all or the majority of use cases for #include?
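To make the proposed semantics concrete, here is a minimal Python sketch of block-wise merging with each file imported at most once. Everything in it is an assumption for illustration: the `PARSED` table stands in for already-parsed files, `resolve` is a hypothetical helper, and no such import statement exists in Stan today.

```python
# Hypothetical pre-parsed files: each has a list of imports and a mapping
# from block name to the statements in that block.
PARSED = {
    "foo.stan": {
        "imports": [],
        "blocks": {"functions": ["real foo(real x) { return 2 * x; }"]},
    },
    "bar.stan": {
        "imports": ["foo.stan"],
        "blocks": {"functions": ["real bar(real x) { return foo(x) + 1; }"]},
    },
    "baz.stan": {
        "imports": ["foo.stan"],
        "blocks": {"functions": ["real baz(real x) { return foo(x) - 1; }"]},
    },
    "main.stan": {
        "imports": ["bar.stan", "baz.stan"],
        "blocks": {"model": ["y ~ normal(bar(mu), baz(mu));"]},
    },
}


def resolve(name, seen=None):
    """Merge imported blocks ahead of the file's own, visiting each file once."""
    seen = set() if seen is None else seen
    seen.add(name)
    merged = {}
    for dep in PARSED[name]["imports"]:
        if dep in seen:
            continue  # reached earlier through another import path
        for block, stmts in resolve(dep, seen).items():
            merged.setdefault(block, []).extend(stmts)
    # The importing file's own statements come after everything it imports.
    for block, stmts in PARSED[name]["blocks"].items():
        merged.setdefault(block, []).extend(stmts)
    return merged


program = resolve("main.stan")
print(len(program["functions"]))  # 3: foo, bar, baz, with foo appearing once
```

Because deduplication is keyed on the file rather than on text, `foo` is merged once even though both `bar.stan` and `baz.stan` import it.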

nhuurre avatar Sep 25 '24 16:09 nhuurre

I think a fair number of use cases in R packages use includes as essentially a crude macro system to duplicate a chunk of code in several places in the same file, which would obviously not be something an import could do. But overall I do like the idea of moving away from text-based preprocessing.

WardBrian avatar Sep 25 '24 18:09 WardBrian