Enums with data / sum datatype / algebraic datatypes

Open chriseth opened this issue 9 years ago • 14 comments

This is a story that already exists in Pivotal Tracker.

Solidity should support sum datatypes similar to how they are supported by Rust.

// Enum holding additional data (marked union / sum datatype)
enum Commitment {
    Hidden(bytes32 hash),
    Revealed(uint value)
}

function reveal(Commitment storage self, uint _value, uint _nonce) {
    if (self == Commitment.Hidden && self.Hidden.hash == sha3(_value, _nonce))
        self = Commitment.Revealed(_value);
}

Comparison between enum variables and enum field names (like Commitment.Hidden) is based on the enum value only (data is ignored), but comparison between two enum variables is a deep comparison.

Because references to the storage location of self.Hidden.hash can be kept while self is switched to Revealed, the storage and memory locations of Commitment.Hidden.hash and Commitment.Revealed.value cannot overlap.
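For comparison, here is a rough sketch of the same idea in Rust, the language the proposal is modeled on. It is not the proposed Solidity syntax. The names commit_hash and same_variant are made up for illustration, a u64 stands in for bytes32, and the standard library's DefaultHasher stands in for sha3 (it is not a cryptographic hash).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::mem::discriminant;

/// Rust counterpart of the proposed `Commitment` enum.
/// Assumption: u64 stands in for both bytes32 and uint.
#[derive(Debug, Clone, PartialEq)]
enum Commitment {
    Hidden { hash: u64 },
    Revealed { value: u64 },
}

/// Hypothetical stand-in for sha3(_value, _nonce). NOT cryptographically secure.
fn commit_hash(value: u64, nonce: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (value, nonce).hash(&mut hasher);
    hasher.finish()
}

/// Switches `c` to `Revealed` if it is `Hidden` and the hash matches.
/// Returns true if the switch happened.
fn reveal(c: &mut Commitment, value: u64, nonce: u64) -> bool {
    let matches = match c {
        Commitment::Hidden { hash } => *hash == commit_hash(value, nonce),
        Commitment::Revealed { .. } => false,
    };
    if matches {
        *c = Commitment::Revealed { value };
    }
    matches
}

/// Variant-only comparison: the attached data is ignored.
/// The derived `==` is the deep comparison.
fn same_variant(a: &Commitment, b: &Commitment) -> bool {
    discriminant(a) == discriminant(b)
}
```

In this sketch the two kinds of comparison are kept apart: same_variant looks only at the variant, while == also compares the data. Rust can let the variant payloads overlap in memory because the borrow checker rejects code that keeps a reference into the Hidden payload while the value is switched to Revealed. Solidity has no such check, which is why the description above requires non-overlapping locations.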

chriseth avatar Aug 16 '16 14:08 chriseth

  • I'm inclined for this kind of syntax:
function reveal(Commitment storage self, uint _value, uint _nonce) {
    case (self) {
        Commitment.Hidden(bytes32 h) {
            if (h == sha3(_value, _nonce)) {
            }
        }
        Commitment.Revealed(uint _) {
            throw;
        }
    }
}
  • I am kind of against the "ignoring the data" behavior of the == operator. We would lose the nice property that if (x == y) { x = y; } is always a no-op (except for gas).

  • Further, I would introduce a case-expression like

hash content =
    case (self) {
        Commitment.Hidden(bytes32 h) {
           h
        }
        Commitment.Revealed(uint _) {
            throw;
        }
    };
  • This has a nice algebraic property:
case(Commitment.Hidden(0xaabbcc)) {
   Commitment.Hidden(bytes32 h) {
      h
   }
}

can be rewritten into just h. The constructor of the algebraic datatype cancels with the destructor.
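The cancellation property can be checked against Rust's existing match expression. This is only a sketch in Rust, not the case syntax proposed above. The function name hidden_hash is made up, and returning None stands in for the throw in the Revealed branch.

```rust
/// Rust counterpart of `Commitment`; [u8; 32] plays the role of bytes32.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Commitment {
    Hidden([u8; 32]),
    Revealed(u64),
}

/// Destructor: extracts the hash from a `Hidden` commitment.
/// Returns `None` where the thread's example would `throw`.
fn hidden_hash(c: Commitment) -> Option<[u8; 32]> {
    match c {
        Commitment::Hidden(h) => Some(h),
        Commitment::Revealed(_) => None,
    }
}
```

For any h, hidden_hash(Commitment::Hidden(h)) evaluates to Some(h): wrapping a value in the constructor and then matching on it gives the value back.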

pirapira avatar Jun 09 '17 16:06 pirapira

I agree that we need some kind of match expressions. Concerning if (x == y) {x = y;}: It is only a no-op if the common type of x and y is that of x, but yes, the comparison proposed in the description might be too misleading. If we can get the same functionality using match expressions without much more code, I'm fine with dropping it.

chriseth avatar Jun 12 '17 09:06 chriseth

@chriseth what about this instead of the not so equal equality?

function reveal(Commitment storage self, uint _value, uint _nonce) {
    if (self.Hidden && self.Hidden.hash == sha3(_value, _nonce))
        self = Commitment.Revealed(_value);
}

pirapira avatar Jun 12 '17 13:06 pirapira

Hm, that looks like a good solution!

chriseth avatar Jun 12 '17 14:06 chriseth

Original one in Pivotaltracker: https://www.pivotaltracker.com/story/show/94901930

Implement enums similar to how they are used in Rust:

enum Node { Inner(uint left, uint right), Leaf(bytes32 label) }
mapping(uint => Node) nodes;
function walk(uint nodeId) {
  // Comparison to bare enum is done only on the enum value itself and not the data
  if (nodes[nodeId] == Node.Inner) {
    // "Inner" is needed to disambiguate
    walk(nodes[nodeId].Inner.left);
    // Accessing ".Inner" for a non-inner node results in an exception
    walk(nodes[nodeId].Inner.right);
  } else {
    report(nodes[nodeId].Leaf.label);
  }
}
function append(uint nodeId, bytes32 label) {
  nodes[nodeId] = Node.Leaf(label);
}

This saves storage space as the data for the different variants is stored at shared places (i.e. it is a type-safe union). If only 31 bytes are used for all variants, this still fits into a single storage slot together with the enum value. If reference types are used as data, they cannot be stored in-place, because someone could store a reference and change the (enum-)value of the variable, which would result in a type error. Instead, the reference type has to be stored at e.g. sha3(enum_value, position).
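For reference, the Node example above maps onto a Rust enum roughly as follows. This is a sketch in Rust, not Solidity. A HashMap stands in for the mapping, String stands in for bytes32, and collecting labels into a vector stands in for report. All of these are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Rust counterpart of the proposed `Node` enum.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Inner { left: u64, right: u64 },
    Leaf { label: String },
}

/// Collects the labels of all leaves reachable from `node_id`, left to right.
/// Ids missing from the map are skipped.
fn walk(nodes: &HashMap<u64, Node>, node_id: u64, out: &mut Vec<String>) {
    match nodes.get(&node_id) {
        // The match both checks the variant and binds its data, so there is
        // no separate ".Inner" access that could fail at runtime.
        Some(Node::Inner { left, right }) => {
            walk(nodes, *left, out);
            walk(nodes, *right, out);
        }
        Some(Node::Leaf { label }) => out.push(label.clone()),
        None => {}
    }
}
```

Because the variant check and the field access happen in one step, the "accessing .Inner for a non-inner node" exception from the Solidity sketch has no equivalent here.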

Destructuring assignment and match expressions are not part of this story.

Problems to overcome: Right-hand-side context in an assignment is not different from context in an equality comparison, but maybe we can work around this using binaryOperatorResult to force conversion to non-reference and stripped-down value before comparison. Note that delete on a data-enum has to either check the variant or erase all possible values which might include variably-sized arrays. The same has to be done when an enum is overwritten.

axic avatar Oct 06 '17 12:10 axic

This would be needed for a decentralized typing system like dType. If dType were to support the choice operator itself, it would have significant overhead (storage, complexity, etc.).

But other projects might also benefit. An example use case would be:

PostalAddress:: StreetAddress * UnitAddress * PostalCode * City * Country
GeoLocation:: Latitude * Longitude
Location:: PostalAddress + GeoLocation
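In a language that already has sum types, this use case could look roughly like the Rust sketch below. Products (*) become structs and the sum (+) becomes an enum. The field types, the variant names Postal and Geo, and the describe helper are assumptions made for illustration. They are not part of dType or of the proposal.

```rust
/// Product type: StreetAddress * UnitAddress * PostalCode * City * Country.
#[derive(Debug, Clone, PartialEq)]
struct PostalAddress {
    street_address: String,
    unit_address: String,
    postal_code: String,
    city: String,
    country: String,
}

/// Product type: Latitude * Longitude.
#[derive(Debug, Clone, PartialEq)]
struct GeoLocation {
    latitude: f64,
    longitude: f64,
}

/// Sum type: PostalAddress + GeoLocation.
#[derive(Debug, Clone, PartialEq)]
enum Location {
    Postal(PostalAddress),
    Geo(GeoLocation),
}

/// Hypothetical helper: renders either kind of location as text.
fn describe(loc: &Location) -> String {
    match loc {
        Location::Postal(p) => format!(
            "{}, {} {}, {}",
            p.street_address, p.postal_code, p.city, p.country
        ),
        Location::Geo(g) => format!("{:.4}, {:.4}", g.latitude, g.longitude),
    }
}
```

A Location value holds exactly one of the two alternatives, and the match in describe has to handle both.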

loredanacirstea avatar Sep 04 '19 12:09 loredanacirstea

Putting this in 0.7.0 so we can talk about it.

chriseth avatar Sep 12 '19 08:09 chriseth

Question: should only value-types be supported or all types, including reference types, so arrays and structs? For the latter we have the problem of "dangling references".

ekpyron avatar Jan 15 '20 15:01 ekpyron

[assigning myself for creating a more detailed proposal]

ekpyron avatar Jan 15 '20 15:01 ekpyron

Here we are missing a lot of information:

  • Use cases / requirements
  • Syntax for declaration
  • Syntax for usage (switch statement proposed above, match statement from Rust, etc.)
  • Encoding

It would be possible to consider this with different levels of complexity:

  • a) Single value, value types only
  • b) Single value, complex types only
  • c) Multiple values, value types only
  • d) Multiple values, complex types only

While a) is the simplest, it may not be good enough for most use cases, and likely b) is a "poor man's version" of c) or d).

Another important aspect to note is that if we were to start with a) and then implement b), there is a risk that the encoding would change in a backwards-incompatible way, or that keeping backwards compatibility would require awkward rules. Because of this, @ekpyron's suggestion is to first consider the least restrictive option and think about how that would be encoded in memory/storage, and perhaps restrict the implementation once we are clear on how to encode it.

We should also collect more feedback on use cases, before settling on any of these.

axic avatar Aug 12 '20 13:08 axic

Here's my proposal for how this would work. I'm still working on the encoding part so here's the part regarding syntax and semantics so that I can get some feedback.

EDIT: This ended up being pretty long so I moved my proposal to a gist: Enums with data in Solidity

Here's the specific revision of the gist that was originally in this comment.

cameel avatar Oct 15 '20 20:10 cameel

EDIT: This comment originally had the encoding description. Now it's in the same gist as syntax and semantics. The original text is in this specific revision.

cameel avatar Oct 16 '20 20:10 cameel

This issue looks quite stalled. Is there any chance of this being implemented?

Pzixel avatar Jul 15 '22 14:07 Pzixel

We're not actively working on it at the moment but it's one of the features we consider important and it's on our roadmap so we're going to revisit it sooner or later.

cameel avatar Jul 15 '22 15:07 cameel