DIPs
Evaluate Pure Functions With CTFE
Inspired by Don Clugston's idea to add __ctfe as a function attribute.
This should not need a DIP.
It is effectively an optimization run by the frontend to minimize the code that is run at runtime.
From the spec:
It can enable optimizations based on the fact that a function is guaranteed to not mutate anything which isn't passed to it.
One advantage of Don Clugston's idea is that it allows disabling code generation for functions that are only used at compile time. When making parts of libraries (including Phobos) BetterC-compatible (that is, removing link-time dependencies), one trick I've used is the following (attempt 2):
// before:
string calc(int a, int b) { /* ... */ }
// ...
enum c = calc(1, 2);
// Result: `calc` is emitted in the binary - bad.
// Attempt 1:
enum c = (int a, int b) { /* body of calc */ }(1, 2);
// Result:
// a) `calc` is not part of the binary - good
// b) `calc` is no longer reusable code - bad
// Attempt 2:
enum calc(int a, int b) = { /* body of calc */ }();
enum c = calc!(1, 2);
// Result:
// a) `calc` is not part of the binary - good
// b) `calc` is still reusable code - good
// c) `calc` is now a template which:
// c.1) allows memoization - good
// c.2) for many distinct template arguments is slower than plain ctfe - bad.
I think the best solution would be to use regular functions, but for the language to recognize a pattern like assert(__ctfe); and disable code generation for functions that begin with such a pattern. That way we don't add new attributes to the language, and we give extra semantic meaning to an already valid statement.
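A minimal sketch of what the assert(__ctfe); pattern looks like in code that compiles today. The function name calc is taken from the example above; its body is a hypothetical stand-in. Note that today the assert is only a check that fails if the function is ever called at run time; skipping code generation for such functions is the proposed behaviour, not something this sketch demonstrates.

```d
import std.conv : to;

// Hypothetical body for the `calc` function from the example above.
string calc(int a, int b)
{
    // Today: a plain check that fails if the function is called at run time.
    // Proposed (assumption): the compiler recognizes this leading statement
    // and does not emit the function into the binary.
    assert(__ctfe);

    return (a + b).to!string;
}

// An enum initializer forces compile-time evaluation of the call.
enum c = calc(1, 2);
static assert(c == "3");

void main() {}
```

The function stays an ordinary, reusable function, and no template instantiation is involved for each distinct set of arguments.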
Another option is for the language to guarantee that functions marked with pragma(inline, true) are not emitted separately and to forbid their address from being taken.
I think adding pragma(ctfe, true) would be a much better option. This would basically force the compiler to perform CTFE and disallow taking the function's address, without overloading the inline semantics.
A pragma sounds nice, too. It would solve the composition problem that this DIP has: Consider two pure functions f(int) and g(int). Then f(42) is CTFE, g(42) is CTFE, but f(g(42)) is not CTFE, which is odd.
Got to add that determining CTFEability based on the function and call seems at least on the surface a more principled approach than a pragma. Also, it will CTFE existing code.
A pragma sounds nice, too. It would solve the composition problem that this DIP has: Consider two pure functions f(int) and g(int). Then f(42) is CTFE, g(42) is CTFE, but f(g(42)) is not CTFE, which is odd.
There is no composition problem.
f(g(42))
Constant folding depth first, which is how it is done now and how it's done everywhere else, goes like this:
- Look at f(g(42))
- Evaluate arguments to f()
- Look at g(42)
- Look at arguments to g()
- It's 42! Evaluate g(42)
- Argument to f() is a literal, evaluate f(literal)
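The steps above can be observed with today's compiler by forcing CTFE through an enum initializer. This is only a sketch: the bodies of f and g are hypothetical, and the enum is used here because the automatic folding of pure calls is what the DIP proposes, not current behaviour.

```d
// Hypothetical pure functions; the names f and g follow the discussion.
int g(int x) pure { return x + 1; }
int f(int x) pure { return x * 2; }

// The argument g(42) is evaluated first, then f is called on the result.
enum inner = g(42);      // g(42) == 43
enum outer = f(g(42));   // f(43) == 86

static assert(inner == 43);
static assert(outer == 86);
static assert(outer == f(inner));

void main() {}
```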
One advantage of Don Clugston's idea is that it allows disabling code generation for functions that are only used at compile time.
The linker is supposed to remove functions that are never called.
@WalterBright cool, all I'm saying is that the depth-first folding would be good to mention in the DIP in addition to here. Just paste your text there! :)
One advantage of Don Clugston's idea is that it allows disabling code generation for functions that are only used at compile time.
The linker is supposed to remove functions that are never called.
The issue is that the linker thinks those functions are used, and as a result, betterC programs fail to link when they use Phobos templates such as std.typecons.Tuple, which don't have any intrinsic link-time dependencies and should just work. I'll link one of the Bugzilla issues we had to fix for context.
Edit:
- https://github.com/dlang/phobos/pull/5952#discussion_r158469921
- https://issues.dlang.org/show_bug.cgi?id=18101
It recently occurred to me that this is sort of a breaking change with respect to unit tests and test coverage. Many unit tests use literals, and if they are testing pure functions, then those tests will be constant folded once this DIP is active. Since code run during CTFE is not traced, the coverage amount will suddenly drop massively for certain projects.
Furthermore, some code paths will actually not be tested anymore. Imagine you are testing an inline asm function:
int fooSoft() {return 0;}
int fooAsm() {asm{naked; mov RAX, 0; ret;}}
int foo() {return __ctfe ? fooSoft() : fooAsm();}
unittest {
assert(foo() == 0); // expected to cover fooAsm
static assert(foo() == 0);
}
Suddenly bugs in fooAsm won't be detected anymore.
I think the DIP should address this.
About the Prior Work section: Isn't C++'s consteval exactly what Don Clugston suggested __ctfe as a function attribute to do?
@PetarKirov wrote:
I think the best solution would be to use regular functions, but for the language to recognize a pattern like assert(__ctfe); and disable code generation for functions that begin with such pattern.
I'd say, recognize in (__ctfe) contracts as a pattern. It's more visible than any assert inside the body. A body can be hidden, but that way, it's part of the signature, basically making in (__ctfe) a pseudo-attribute. Current code could theoretically rely on in (__ctfe) functions hanging around at run-time, i.e. taking their address or other shenanigans like calling them in code that's in fact never executed.
The biggest difference between recognizing in (__ctfe) and adding a __ctfe attribute is that overloading based on it is not possible. Whether this difference is good or bad is certainly a matter of opinion. Apart from not needing a grammar change or a larger-than-necessary language change, I think recognizing the contract in (__ctfe) is better than adding the attribute __ctfe because of that overloading stuff. If you need a function to act differently based on compile-time function evaluation, put if (__ctfe) { } else { } in it. The only win I see is that using overloading, you could hide the non-ctfe function's code.
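A short sketch of both forms mentioned here, using syntax that is already valid. The function names twice and pick are hypothetical. Today the in (__ctfe) contract is an ordinary precondition (checked when contracts are enabled); treating it as a marker that suppresses code generation is the suggestion being discussed, not existing compiler behaviour.

```d
// Hypothetical CTFE-only function: the contract is part of the signature,
// so it stays visible even when the body is not.
int twice(int x)
in (__ctfe)
{
    return x * 2;
}

// One function covering both cases with a branch instead of overloading.
int pick()
{
    if (__ctfe)
        return 1;   // taken during compile-time evaluation
    else
        return 2;   // taken at run time
}

enum a = twice(21);
static assert(a == 42);
static assert(pick() == 1);

void main()
{
    assert(pick() == 2);
}
```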
I'm concerned that automatically evaluating pure calls with literals at compile-time may slow down compilation. How about only doing this when -inline is passed to the compiler? There could still be some per-function way of turning this off even with -inline, if needed.