Define rules for conversion of primitives and overloading

Open jleeothon opened this issue 10 years ago • 2 comments

Based on a conversation between @Seldomberry/owners:

Important factors to take into consideration:

  • What are the needs of the intended primary users of Hyve?
  • Can we define the permitted conversions or casts based on simple rules?
  • How does this affect the ability to overload and resolve procedures?

Explicit casts

Most explicit casts between primitive types should be defined, though some of them would fail at runtime. @SanderMertens, correct me if I'm wrong, but these correspond to the conversions handled by db_convert. I've tried to write down how I believe this works, or how I believe it should work, but if it's already done, maybe @SanderMertens would prefer to document the spec (later) instead, and we can discuss it then.

From number types

Between int and uint types, casts preserve an exact copy of the least significant bits; the most significant bits may be lost when casting to a smaller type. When signed ints are promoted to a wider type, the padding depends on the sign (sign extension). When unsigned ints are cast to wider signed ints, the padding is always zeroes.
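
A minimal sketch of the integer rule above, written in Python only to model the bit-level behavior (no Hyve syntax is assumed). The helper `cast_int` and its parameters are hypothetical. It takes the mathematical value of the source integer and returns the value as seen through a target type of the given width and signedness, assuming two's complement.

```python
def cast_int(value: int, bits: int, signed: bool) -> int:
    """Model a cast to a `bits`-wide two's complement integer type.

    Hypothetical helper for illustration; not part of Hyve or db_convert.
    """
    mask = (1 << bits) - 1
    low = value & mask              # keep only the least significant bits
    if signed and low >> (bits - 1):
        low -= 1 << bits            # top bit set: reinterpret as negative
    return low


# Narrowing: the most significant bits are lost (300 does not fit in 8 bits).
print(cast_int(300, 8, signed=False))
# Widening a negative signed value: the padding follows the sign.
print(cast_int(-1, 16, signed=True))
# Widening an unsigned value into a signed type: the padding is zeroes.
print(cast_int(200, 16, signed=True))
```

Because Python integers behave like infinitely wide two's complement values under `&`, sign extension falls out of the masking step without extra code.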

Casting integer types to floating-point values can be trickier to think about from scratch. However, it's something that virtually all languages can do. TODO spec.

In summary, I wonder if we can follow plain C rules to convert between integer and float types. We may want to research how the C standard works and whether it is fully specified. Also, let's find out whether other mainstream languages, especially Java and C#, do something similar or different. We probably want neither to reinvent it all nor to do it too differently.

Similar rules apply from integer types to binary types, i.e. an exact copy is done. Signed integers are taken literally and not interpreted as positive or negative numbers.
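
As a rough illustration of "taken literally", the sketch below copies an integer's two's complement bit pattern into raw octets. The helper name `int_to_octets` is hypothetical, and big-endian order is chosen only for readability; the thread does not specify a byte order.

```python
def int_to_octets(value: int, width: int, signed: bool) -> bytes:
    """Copy the bit pattern of `value` into `width` octets.

    Hypothetical helper; the sign is not interpreted, only the bits are copied.
    """
    return value.to_bytes(width, byteorder="big", signed=signed)


# A signed -1 and an unsigned 255 share the same 8-bit pattern.
print(int_to_octets(-1, 1, signed=True).hex())     # ff
print(int_to_octets(255, 1, signed=False).hex())   # ff
```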

Similar rules also apply when casting integers to bitmasks. The resulting bitmask may not always be a value that could be built from the bitmask constants alone. For example:

bitmask color: red, green, blue
color white: color::red | color::green | color::blue // underneath, looks like 00000111
color otherwhite: color(-1) // underneath looks like 11111111

The above may be a desirable use case as an idiom for "set everything".
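
The same example can be modeled in Python. This is only a sketch: the bit assigned to each constant and the 8-bit width are assumptions taken from the "underneath" comments above, and `to_bitmask` is a hypothetical helper.

```python
RED, GREEN, BLUE = 0b001, 0b010, 0b100   # assumed bit assignments for color


def to_bitmask(value: int, bits: int = 8) -> int:
    """Keep the low `bits` bits of `value`, as an integer-to-bitmask cast would."""
    return value & ((1 << bits) - 1)


white = RED | GREEN | BLUE
otherwhite = to_bitmask(-1)

print(format(white, "08b"))        # 00000111
print(format(otherwhite, "08b"))   # 11111111
# otherwhite has every named constant set, plus bits that have no name.
print(otherwhite & white == white)
```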

Signed and unsigned integer types can be converted to enums by resolving to the appropriate constant. However, because an enum will rarely define 2^32 different constants, we may want the conversion to fail at runtime when no constant matches.
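
A small sketch of what a runtime-checked integer-to-enum conversion could look like, using Python's standard `enum` module as a stand-in. The `Color` enum, its values, and the `int_to_enum` helper are hypothetical.

```python
from enum import Enum


class Color(Enum):   # hypothetical enum with three constants
    RED = 0
    GREEN = 1
    BLUE = 2


def int_to_enum(value: int) -> Color:
    """Resolve `value` to a constant, failing at runtime if none matches."""
    try:
        return Color(value)
    except ValueError:
        raise ValueError(f"{value} does not match any constant of Color") from None


print(int_to_enum(1))   # Color.GREEN
```

Calling `int_to_enum(7)` raises `ValueError`, which corresponds to the runtime failure suggested above.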

Floating-point types cannot be converted to binary types, bitmasks, or enums. Attempting to do so generates a compile-time error.

All number types can be cast to strings by using their decimal representation. It may be desirable to handle details with precision of float types, though. I believe that the C family does some stuff to avoid printing too many decimals, but I don't know the rules. Using overloading, one could treat casts as overloaded methods, and provide extra information (e.g. number base and precision).
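
One way to picture "casts as overloaded methods with extra information" is a pair of conversion helpers that accept an optional base or precision. The names `int_to_string` and `float_to_string` are hypothetical. The float helper uses Python's `g` presentation type, which, like C's `%g`, limits the number of significant digits and drops trailing zeros.

```python
_BASE_CODES = {2: "b", 8: "o", 10: "d", 16: "x"}   # bases supported by this sketch


def int_to_string(value: int, base: int = 10) -> str:
    """Decimal by default; other bases only when asked for explicitly."""
    return format(value, _BASE_CODES[base])


def float_to_string(value: float, precision: int = 6) -> str:
    """Use at most `precision` significant digits."""
    return format(value, f".{precision}g")


print(int_to_string(255))              # 255
print(int_to_string(255, base=16))     # ff
print(float_to_string(0.1 + 0.2))      # 0.3
print(float_to_string(0.1 + 0.2, 17))  # 0.30000000000000004
```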

When integer types are cast to bool, zero (0) is false and everything else is true. Casting floating-point types may be tricky, because a number that is expected to be zero may end up slightly away from zero due to precision limits. I would suggest that floating-point conversion to bool be a compile-time error.

Two alternatives to the above would be: (1) considering 0.0 and -0.0 as false and everything else as true [risk of near-zeroes], and (2) considering everything in the range from -1 [exclusive] to +1 [exclusive] as false and everything else as true, because that's what would happen if we cast to integers and then to booleans (in C), though this may be unexpected behavior at first glance.
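
The two alternatives can be compared side by side. This is a sketch with hypothetical function names; NaN and infinities are deliberately not handled.

```python
def float_to_bool_exact(x: float) -> bool:
    """Alternative (1): only 0.0 and -0.0 are false."""
    return x != 0.0


def float_to_bool_via_int(x: float) -> bool:
    """Alternative (2): truncate toward zero first, then compare with 0."""
    return int(x) != 0


near_zero = 0.1 + 0.2 - 0.3   # mathematically 0, but not in binary floating point

print(float_to_bool_exact(near_zero))     # True: the near-zero risk
print(float_to_bool_via_int(near_zero))   # False
print(float_to_bool_via_int(0.9))         # False: possibly surprising
```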

From bool

When cast to signed or unsigned integers, true becomes 1, and false becomes 0.

When cast to string, true becomes "true", and false becomes "false".

Implicit casts

A first approach to defining which casts are implicitly permitted could be: implicit casts are only defined where no data loss occurs. They have exactly the same behavior as explicit casts, so the choice of which conversions are implicit does not affect how the conversions are done.
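
For the integer types, the "no data loss" rule can be checked mechanically: a cast is lossless when every value of the source type is representable in the target type. The sketch below assumes the type names int8 through uint64 and two's complement ranges; `implicit_cast_allowed` is a hypothetical helper, not an existing function.

```python
# (bit width, signed?) for each integer type assumed in this sketch
INT_TYPES = {
    "int8": (8, True), "int16": (16, True),
    "int32": (32, True), "int64": (64, True),
    "uint8": (8, False), "uint16": (16, False),
    "uint32": (32, False), "uint64": (64, False),
}


def value_range(type_name: str) -> tuple:
    """Return the (lowest, highest) value representable by the type."""
    bits, signed = INT_TYPES[type_name]
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def implicit_cast_allowed(src: str, dst: str) -> bool:
    """True when every value of `src` fits in `dst`, so no data is lost."""
    src_lo, src_hi = value_range(src)
    dst_lo, dst_hi = value_range(dst)
    return dst_lo <= src_lo and src_hi <= dst_hi


print(implicit_cast_allowed("uint8", "int16"))   # True
print(implicit_cast_allowed("int8", "uint16"))   # False: negatives do not fit
```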

Implicit casts occur in assignments, binary operations, and procedure calls.

The above rule can probably be summarized into the following table. @SanderMertens please double-check if you think that'd be correct.

[Image: screen shot 2014-12-27 at 11 54 14]

However, I have subjective thoughts against allowing anything to be implicitly passed as a string. Also, promoting bitmasks and enums to numbers implicitly doesn't look like expected behavior.

jleeothon · Dec 27 '14 17:12

The general rule of no data loss is useful when translating between integers. The one thing I wouldn't want is to have to explicitly cast everything to bool. I'd like to add implicit casts to bool for int8, int16, int32, int64, uint8, uint16, uint32, uint64, octet, word, float32, float64 and string.

We should also consider whether to allow implicitly casting ints to floating-point types. I wouldn't have an issue with that, since this is likely to occur frequently. We can't guarantee that there won't be loss of precision, but the target type can express a larger range of values. The reverse is not true. Implicit conversions from floats to integers are not desirable, I think.
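
To make the precision loss concrete, here is a sketch that rounds integers to float32 and float64. The helper names are hypothetical. `to_float32` goes through a Python float (a float64) before packing into 4 bytes, so it is only an approximation of a direct integer-to-float32 conversion for very large values.

```python
import struct


def to_float32(value: int) -> float:
    """Round `value` to float32 and return it as a Python float."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


def to_float64(value: int) -> float:
    """Python floats are float64, so a plain conversion is enough."""
    return float(value)


# float32 has a 24-bit significand: 2**24 + 1 cannot be represented exactly.
print(to_float32(2**24 + 1))   # 16777216.0
# float64 has a 53-bit significand: 2**53 + 1 cannot be represented exactly.
print(to_float64(2**53 + 1))   # 9007199254740992.0
```

Small integers survive the conversion unchanged; only values wider than the significand lose their low bits.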

Casting booleans to numbers could be useful. We could define boolean arithmetic like this:

x * false == 0
x * true == x
x / false == NaN
x / true == x
x + false == x
x + true == x + 1
x - false == x
x - true == x - 1
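
The identities listed above follow from treating false as 0 and true as 1, except for division by false. A sketch with hypothetical helper names; since Python raises `ZeroDivisionError` for `x / 0`, the NaN result is modeled explicitly here.

```python
import math


def times(x, b: bool):
    return x * int(b)


def divided_by(x, b: bool):
    # NaN is returned by hand because Python would raise on division by zero.
    return x / int(b) if b else math.nan


def plus(x, b: bool):
    return x + int(b)


def minus(x, b: bool):
    return x - int(b)


x = 7
print(times(x, False), times(x, True))   # 0 7
print(plus(x, True), minus(x, True))     # 8 6
print(divided_by(x, False))              # nan
```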

Casting implicitly to string seems safe to me; the reverse most certainly isn't. We could (should) however consider issuing an error when a user tries to do this outside of assignments, as in 2 == "2". As a general rule of thumb, we could say that we allow implicit conversions only in assignments (function-parameter passing is a special case of assignment).

Also, numeric literals should always be implicitly cast to the most expressive type in calculations. For example, 5 + 2.5 should yield 7.5, not a type error.

I can't think of a lot of use cases where you'd want to exchange enums and integers so I'm okay with making those explicit conversions.

I have mixed feelings on bit masks. I'd rather make everything an explicit cast, since using bit masks as a number doesn't seem like something that'd be used a lot. We would have to make sure that conversions to bool and calculations on bit masks work properly so that a user can do: if mask & SpareWheel
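
For comparison, Python's `enum.IntFlag` supports exactly this kind of test: the result of `&` is still a flag value, and it is falsy only when no bits are set. Only `SpareWheel` comes from the example above; the `CarOption` type and the other constants are made up for this sketch.

```python
from enum import IntFlag


class CarOption(IntFlag):   # hypothetical bitmask
    SpareWheel = 1
    Sunroof = 2
    TowBar = 4


def has_spare_wheel(mask: CarOption) -> bool:
    """The `&` result converts to bool: false when no bits remain set."""
    return bool(mask & CarOption.SpareWheel)


print(has_spare_wheel(CarOption.SpareWheel | CarOption.TowBar))   # True
print(has_spare_wheel(CarOption.Sunroof))                         # False
```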

SanderMertens · Dec 30 '14 07:12

"Everything can be implicitly cast to bool." I have doubts about whether this could cause confusion. On the one hand, I believe it is desirable in control statements (if/while/etc.), but it is arguable for procedure calls. I agree that a global strategy would be simpler, but I fear it would be error-prone. I'd be okay with it, though I'd like to see more example usages.

"Implicit casting ints to floats." I think I missed a couple of those in the table above. Do I understand correctly that you suggest including a cast from int64 to float32? I think that the loss of precision will be tolerable; a uniform rule may serve the greater good. In addition, these implicit casts should be defined for expressions, e.g. integer promotion in addition or subtraction, and integer promotion to float64 for divisions. Related to #87.

"Casting booleans to numbers (implicitly)." I have mixed feelings, but I see how it has desirable use cases. I would be okay with that decision.

"Contextual implicit casts from strings to numbers in assignments only." This is an interesting rule that I would agree with. It's related to my first point in this comment. It bloats the rules, but it seems desirable.

On enums and integers, and on bitmasks: yes, I'd support explicit-only conversions.

jleeothon · Dec 30 '14 16:12