
Coercions, Conversions, Casts

Open phischu opened this issue 2 months ago • 5 comments

In the current standard library there are a number of definitions that take one type and return another one.

Step 1 is to survey all these definitions and to categorize them. For example, there are mere newtype coercions, numerical conversions, etc.

Step 2 is to define a style guide given the data from step 1. For example, shall we call it intoString or fromByteArray?

Step 3 is to actually change the standard library to follow the style guide.

phischu, Oct 02 '25 07:10

One source of inconsistencies is the ordering of prelude modules, combined with the tension between UFCS and fully qualified names. For example, we have bytearray::fromString(str), usable as str.fromString. If we changed it to string::toByteArray(str), usable as str.toByteArray (which means moving the module it's defined in), then we might* get a cyclic dependency in the stdlib. Of course, we could keep this defn in bytearray, but then bytearray::toByteArray(str) looks a bit weird...

*might not be applicable for this exact example, but there are some cases of this in the stdlib

jiribenes, Oct 02 '25 07:10

I thought about this a bit more, and I agree, we should distinguish between these. Let me make a distinction here to start:

  • reinterpretations: asX, such as string.asByteArray, bytearray.asString, byte.asChar, double.asInt, guaranteed to be lossless, and ideally pretty cheap, never throwing
  • conversions: toX, such as int.toByte, array.toList, double.toInt, potentially (and often) lossy, potentially not efficient, can throw / return a more complex type
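
To make the split above concrete, here is a minimal sketch in Python rather than Effekt. All function names are hypothetical stand-ins, not stdlib API, and a "byte" is modelled as an int in 0..255. The asX-style function is total and invertible on its domain; the toX-style functions either lose information or return a more complex type.

```python
from typing import Optional

# Hypothetical names for illustration only; not part of any standard library.

def byte_as_char(b: int) -> str:
    """asX-style: every byte 0..255 maps to a distinct character (lossless)."""
    return chr(b)

def int_to_byte_wrapping(n: int) -> int:
    """toX-style, lossy: keeps only the low 8 bits."""
    return n & 0xFF

def int_to_byte_checked(n: int) -> Optional[int]:
    """toX-style, more complex return type: None when n does not fit."""
    return n if 0 <= n <= 0xFF else None
```

Whether a toX function should wrap, throw, or return an option is exactly the kind of decision a style guide would have to settle.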

jiribenes, Dec 09 '25 21:12

> Of course, we could keep this defn in bytearray, but then bytearray::toByteArray(str) looks a bit weird...

At least with the current system (and with the future one, if we allow a sufficiently useful form of reexports), we could define string::toByteArray in bytearray.effekt, though.

As one vote, I would prefer toX/asX over fromX, since then overloading is on the source type, which feels more intuitive/conventional. [EDIT/Minor addition: Also, fixing the return type manually is annoying in the REPL.]

marzipankaiser, Dec 10 '25 09:12

We also have bytearray::toString and char::toString and then also show for Char. We also have char::digitValue. Then there is record UnionFind(rawElements: ResizableArray[Int]) which is basically a newtype and UnionFind and rawElements are the witnesses.

I would distinguish between 1. the two halves of an isomorphism, 2. injections (like byte.asChar) and 3. projections. I am confused about double.asInt, this is not lossless? I believe toInt is a nice example, since we have it on Char, Double, Byte, String, and String with base. These are totally different kinds of functions.
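
As a rough illustration of how different these toInt variants are, here is a Python sketch with hypothetical names. It assumes that Char-to-Int means "code point", which may not match the actual Effekt definitions, and contrasts it with a digit-value function like the char::digitValue mentioned above.

```python
# Hypothetical stand-ins for the different "toInt" flavours; names are assumptions.

def char_to_int(c: str) -> int:
    """Injection: the code point of a single character."""
    return ord(c)

def char_digit_value(c: str) -> int:
    """Partial: only defined for digit characters, raises ValueError otherwise."""
    return int(c)

def double_to_int(x: float) -> int:
    """Projection: truncates toward zero, so the fractional part is lost."""
    return int(x)

def string_to_int(s: str, base: int = 10) -> int:
    """Parsing: raises ValueError when s is not a numeral in the given base."""
    return int(s, base)
```

Here char_to_int("7") and char_digit_value("7") give different answers for the same input, which is the sort of ambiguity a shared name hides.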

phischu, Dec 11 '25 21:12

> I am confused about double.asInt, this is not lossless?

All of the asX mentioned above would essentially be what C++ calls reinterpret casts (or at least very close to them). In my mind, double.asInt would be "reinterpret the bits of this 64-bit float as a 64-bit integer", which is lossless, as compared to double.toInt, which is not lossless (truncating its value in the process).
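
A small Python sketch of the two readings, using the standard struct module. The function names are hypothetical and only mirror the asInt / toInt naming discussed here.

```python
import struct

# Hypothetical names mirroring double.asInt / double.toInt; not Effekt API.

def double_as_int(x: float) -> int:
    """Reinterpret the 8 bytes of an IEEE 754 double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]

def int_as_double(n: int) -> float:
    """Inverse reinterpretation: signed 64-bit integer bits back to a double."""
    return struct.unpack("<d", struct.pack("<q", n))[0]

def double_to_int(x: float) -> int:
    """Numeric conversion: truncates toward zero (raises on NaN or infinity)."""
    return int(x)
```

For 2.5, the reinterpretation round-trips back to 2.5 through int_as_double, while double_to_int gives 2 and the fractional part cannot be recovered.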

EDIT: The purpose of mentioning both double.toInt and double.asInt was to show that there might be confusion when multiple different conversions are defined between two types. Therefore any similar convention either needs to be "intuitive" enough, or it will suffer from similar misunderstandings.

jiribenes, Dec 11 '25 21:12