dmd icon indicating copy to clipboard operation
dmd copied to clipboard

Issue 17954 - init member should be disallowed (deprecation)

Open BorisCarvajal opened this issue 4 years ago • 42 comments

This needs to happen for a sane generic programming.

Declaring tupleof, stringof, min, max and maybe others as members should also be deprecated in the future.

BorisCarvajal avatar May 11 '21 06:05 BorisCarvajal

Thanks for your pull request, @BorisCarvajal!

Bugzilla references

Auto-close Bugzilla Severity Description
17954 enhancement init member should be disallowed

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#12512"

dlang-bot avatar May 11 '21 06:05 dlang-bot

You'll have to fix these fist

std/encoding.d(851): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(931): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1007): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1087): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1180): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1275): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(851): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(931): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1007): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1087): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1180): Deprecation: declaring a member named `init` is deprecated
std/encoding.d(1275): Deprecation: declaring a member named `init` is deprecated
std/socket.d(89): Deprecation: declaring a member named `init` is deprecated
std/socket.d(89): Deprecation: declaring a member named `init` is deprecated
std/experimental/allocator/typed.d(31): Deprecation: declaring a member named `init` is deprecated
std/experimental/allocator/typed.d(31): Deprecation: declaring a member named `init` is deprecated

thewilsonator avatar May 11 '21 06:05 thewilsonator

Yes, they are just a renaming of a first enum member.

BorisCarvajal avatar May 11 '21 06:05 BorisCarvajal

The commit message should contain "Fix" so that the bot picks up the issue.

RazvanN7 avatar May 11 '21 07:05 RazvanN7

cc @atilaneves @WalterBright

RazvanN7 avatar May 11 '21 07:05 RazvanN7

The commit message should contain "Fix" so that the bot picks up the issue.

I wanted to link it only.

BorisCarvajal avatar May 12 '21 01:05 BorisCarvajal

It seems that the current behaviour is intended at least for enums, please account for that.

thewilsonator avatar May 12 '21 02:05 thewilsonator

I'll leave this one to @WalterBright

atilaneves avatar May 12 '21 06:05 atilaneves

This needs to happen for a sane generic programming.

Declaring tupleof, stringof, min, max and maybe others as members should also be deprecated in the future.

I never thought this was possible until I saw changes about this. Is this described on the spec? Do we need to change it? In any way, I think this is a must.

ljmf00 avatar Mar 25 '22 19:03 ljmf00

I never thought this was possible until I saw changes about this. Is this described on the spec? Do we need to change it? In any way, I think this is a must.

I couldn't find any mention in the spec.

From reading Walter's comments in Issue 7066 and Issue 1412 they were supposed to be user customizable. But in reality, in the current state of the language, redefining any builtin property is a call to break well-established code, specially generic one like in phobos.

Some properties:

  • init
  • stringof
  • tupleof
  • offsetof
  • ~sizeof~ (these already cannot be redefined)
  • ~alignof~
  • ~mangleof~

Enum (integral type only?):

  • min
  • max

Class:

  • classinfo
  • outer

I think changing any of them could cause some form of unexpected behavior. What do you think?

BorisCarvajal avatar Mar 26 '22 10:03 BorisCarvajal

I never thought this was possible until I saw changes about this. Is this described on the spec? Do we need to change it? In any way, I think this is a must.

I couldn't find any mention in the spec.

From reading Walter's comments in Issue 7066 and Issue 1412 they were supposed to be user customizable. But in reality, in the current state of the language, redefining any builtin property is a call to break well-established code, specially generic one like in phobos.

Right. My point was more to double confirm it, because IMO I have always seen this as impossible and never thought about overriding those values. In my view, it doesn't make much sense to do it.

I think changing any of them could cause some form of unexpected behavior. What do you think?

I totally agree that overriding those can break well-established code and we should move effort towards removing those. This will particularly be a problem on some Phobos traits and templated functions that uses those built-in properties or more problematic, some templated compiler hooks on the runtime, which will give very weird behaviour.

Since this is not mentioned in the spec, I would just consider it a compiler bug but given that there might be some people accidentally using it, we probably may consider doing a deprecation?

I think we should deprecate this all at once.

ljmf00 avatar Apr 29 '22 14:04 ljmf00

min/max are very common/universal function names, making them reserved is a mistake in my opinion

D already block certain combo due to certain names being already used, example:

  • init/deinit is a common combo, but with D, we can't use it since init is supposed to be reserved (apparently)

  • create/destroy is also a common combo, but with D we can't use it since destroy is already used in object.d

Perhaps a better way to define and call builtin functions should be proposed to be future proof?

ryuukk avatar Feb 19 '23 04:02 ryuukk

min/max are very common/universal function names, making them reserved is a mistake in my opinion

I would tend to agree here. For example we have opaque enums that are treated specially as another underlying type. When the basetype is a struct, one can define their own properties to have parity with basic types.

A working example would be enum __c_complex_float = _Complex!float;

ibuclaw avatar Feb 19 '23 07:02 ibuclaw

Just got a ping on this bug report that I filed in 2017. This seems great, is there any chance to get it in?

The contextual keywords aside from .init are surprising (especially .tupleof). I have mixed feelings on those, but also can see why they should also be disallowed.

But .init is so critical to so much generic programming, I think it's worth having it at least contextually reserved.

schveiguy avatar Apr 28 '23 13:04 schveiguy

http://www.suodenjoki.dk/us/archive/2010/min-max.htm

reserving keywords is a mistake it is mind boggling that you repeat the same mistakes as C++, worse, people encouraging you to do so i will never understand this mindset

ryuukk avatar Apr 28 '23 14:04 ryuukk

http://www.suodenjoki.dk/us/archive/2010/min-max.htm

reserving keywords is a mistake

That's not even close to the same problem. In that case, the max symbol is being replaced in text (by the C preprocessor). If, for instance, we wanted to reserve max for enum types, it wouldn't even apply to std::max which would be a module-level function.

With the CPP situation, max is just replaced everywhere, no matter the context.

It is mind boggling that you repeat the same mistakes as C++, worse, people encouraging you to do so i will never understand this mindset

We are trying to correct an issue where a compiler property that should have been reserved is not reserved.

If you want to do compiler introspection, then you need to know what you are doing is compiler introspection, and not hijacked by a clever type definition. Otherwise, writing generic code is a nightmare. We should focus on making generic code easy to write, that's D's strength.

There are admittedly different ways to solve this. One is by using a keyword entry point such as __traits. Another is by using a reserved syntax. Finally, you can use keywords (this is not exactly a keyword, but a context-specific reservation of a symbol). Doing the latter means millions of lines of existing code doesn't break. That has to be taken into consideration.

schveiguy avatar Apr 28 '23 15:04 schveiguy

http://www.suodenjoki.dk/us/archive/2010/min-max.htm reserving keywords is a mistake

That's not even close to the same problem. In that case, the max symbol is being replaced in text (by the C preprocessor). If, for instance, we wanted to reserve max for enum types, it wouldn't even apply to std::max which would be a module-level function.

With the CPP situation, max is just replaced everywhere, no matter the context.

It is mind boggling that you repeat the same mistakes as C++, worse, people encouraging you to do so i will never understand this mindset

We are trying to correct an issue where a compiler property that should have been reserved is not reserved.

If you want to do compiler introspection, then you need to know what you are doing is compiler introspection, and not hijacked by a clever type definition. Otherwise, writing generic code is a nightmare. We should focus on making generic code easy to write, that's D's strength.

There are admittedly different ways to solve this. One is by using a keyword entry point such as __traits. Another is by using a reserved syntax. Finally, you can use keywords (this is not exactly a keyword, but a context-specific reservation of a symbol). Doing the latter means millions of lines of existing code doesn't break. That has to be taken into consideration.

Zig's way solve all the problems, compiler doesn't invade user space https://ziglang.org/documentation/master/#Builtin-Functions

Separation of concerns is what makes something future proof, not doubling down on poor design

ryuukk avatar Apr 28 '23 16:04 ryuukk

Good on Zig (using a reserved syntax as I mentioned). We have what we have and aren't going to change it to break all code in existence just because it's a better design. Existing code has to factor in.

schveiguy avatar Apr 28 '23 20:04 schveiguy

I have rebased and addressed @ibuclaw 's comment with regards to the deprecation comment.

I think that it is unfortunate that we need to reserve init, but it is so intimately coupled with generic programming that I don't see any other way.

@WalterBright @atilaneves ultimately it is up to you to decide on this.

RazvanN7 avatar May 01 '23 11:05 RazvanN7

@ryuukk I agree that reserving keywords as such is bad, however, right now people are relying that T.init has a specific meaning. Any aggregate declaration that overwrites init in the way that it is permitted is not going to be usable with most of the templated library functions that exist. If we want to un-reserve init we need to come up with a clever solution and that's not completely off the table, but up until then the best of the worst solutions is to deprecate usage of init as an aggregate member.

Note: For brevity I am only referring to aggregates although this PR also affects named enums.

RazvanN7 avatar May 01 '23 11:05 RazvanN7

There are certainly advantages and disadvantages to this. The idea with a user-definable .init is not to impair generic programming, but to allow the user to specify custom initialization behavior, and to allow custom types to behave like other types. I.e. to enhance generic programming.

Similarly, it's like writing a function named strlen in C. It's going to prevent the strlen from the C library from being linked in, and if it doesn't do what the C library strlen does, then havoc will ensue. I've never encountered any C advice that says don't name a function the same as any in the C library, it's just something people know. strlen is not a reserved word. But if you write a functional but better strlen, you can - and people do. Ditto for malloc, and whole industries have sprung up to replace malloc.

D expects that if one writes a member function named init, that it will do what is expected of init from the caller's point of view. It isn't just init, the same goes for empty, front, back, popFront, popBack, so people can implement ranges, or range-like types and have them work with foreach.

Zig was mentioned as having a special syntax for builtin properties - but none of this would work if the user could not create custom implementations of those properties.

A significant issue for init, etc., is we simply have failed to document it properly. That's my fault. There should be a page in the spec that lists all the special properties, and what they are expected to do.

Additionally, I suspect that deprecating init will impact quite a lot of existing code, and will break code that relies on a custom implementation of it without an obvious solution.

On balance, the current behavior is the best way forward, although an action item will be to document it much better.

WalterBright avatar May 10 '23 01:05 WalterBright

Should this be added to the documentation?

`T.init` should not be used to access the compiler-defined initializer of a type. Instead you should use `() {T t; return t;}()`

We need to update a lot of Phobos and druntime to remove this assumption.

We also should define what .init is supposed to be used for, especially if the compiler isn't using it. As of your comment, I have no idea the utility of .init at all. It seems a worthless property.

schveiguy avatar May 10 '23 01:05 schveiguy

.init is the default initializer for the type.

WalterBright avatar May 10 '23 01:05 WalterBright

Don't we already have a way to customize initialization behavior with default initializers for aggregate fields and enums? Is there anything a custom init can do that isn't already covered by default initializers?

Additionally, I suspect that deprecating init will impact quite a lot of existing code, and will break code that relies on a custom implementation of it without an obvious solution.

The only place I've seen an init method in the wild is void init() in this project from Buildkite, which is an instance method used to initialize an object, not a class method used to provide a custom initial value—i.e., it's not being used as intended.

Maybe as a compromise we can at least forbid non-static init members?

pbackus avatar May 10 '23 03:05 pbackus

.init is the default initializer for the type.

struct S
{
   int x;
   enum init = S(1);
}

void main()
{
   S s;
   assert(s.x == 1); // fails
   auto arr = new S[5];
   assert(arr[0].x == 1); // fails
}

If the language is supposed to respect the custom init value defined by users, then shouldn't both of these asserts pass?

schveiguy avatar May 10 '23 12:05 schveiguy

If the language is supposed to respect the custom init value defined by users, then shouldn't both of these asserts pass?

Yes, you have identified a bit of a conundrum. The thing is, D initializes data before user code can access it. Structs are initialized before any constructor is run, ditto for classes. (This is quite unlike C++.) It's why field default initializers must be constant expressions. This is all done to prevent programs from reading uninitialized data.

But, consider:

T t;
T u = T.init;

If the user does not define a T.init, then t and u will be initialized to the same value. But if the user does, then t and u will be initialized to different values.

If .init becomes reserved, then the user would be unable to override .init for generic code. If generic code relies on default initialization, it can explicitly use .init to pick up a user-defined default.

This should be better documented.

WalterBright avatar May 10 '23 19:05 WalterBright

import core.stdc.stdio;

struct S
{
    int f = 2;
    enum init = S(3);
}

int main()
{
    S s;
    printf("%d\n", s.f);
    S t = S.init;
    printf("%d\n", t.f);
    return 0;
}

prints:

2
3

WalterBright avatar May 10 '23 22:05 WalterBright

Should this be added to the documentation?

`T.init` should not be used to access the compiler-defined initializer of a type. Instead you should use `() {T t; return t;}()`

Please don't. This is utterly confusing for newcomers and defeats the initial purpose of the .init special property.

ljmf00 avatar May 14 '23 14:05 ljmf00

I think we should:

  • Disallow overriding .init
  • Allow user-defined default constructors

Overriding .init:

  • Does work not properly (as demonstrated in the comments above)
  • Will likely break generic code, which assumes that T.init yields an instance of T with all fields initialized as declared in the struct/class declaration.
  • Is useless, when the struct author can specify a different initializer than F.init for each field F

On the other hand, user-defined default struct constructors often come up in practice and are a clean way for the struct author to define what the default value of that struct is supposed to be.

PetarKirov avatar May 14 '23 21:05 PetarKirov

Given all the pushback we're receiving for removing/altering existing features, we cannot move forward with this.

WalterBright avatar Jun 15 '23 08:06 WalterBright