
Can we optimize away an unused super class?

Open jabraham17 opened this issue 4 months ago • 11 comments

While working on the binarytrees benchmark, I came to the unfortunate realization that every Chapel class has a "super" field as its first element. This means that, regardless of what the code does, we always pay the cost of having that field there.

For example, an instance of this linked list node is always going to take 16 bytes (on a 64-bit system).

class Node {
  var next: unmanaged Node?;
}

This is because all Chapel classes "pay the cost" of a super class, even without using those features. The "Node" class is codegened approximately as:

struct Root {
  int32_t cid;        // 4 bytes
};                    // 4 bytes total
struct Node {
  struct Root super;  // 4 bytes
                      // 4 bytes of padding for alignment
  struct Node* next;  // 8 bytes (on 64-bit systems)
};                    // 16 bytes total
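
To make the padding concrete, here is a minimal C sketch that mirrors the approximate layout above and compares it with a hypothetical layout that drops the "super" field. The names `SlimNode`, `node_padding`, and `print_layouts` are made up for illustration and are not part of the Chapel compiler or its generated code; the exact sizes depend on the target ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Approximation of the generated layout described in the issue. */
struct Root {
  int32_t cid;            /* class id */
};

struct Node {
  struct Root super;      /* embedded parent object, always first */
  struct Node* next;
};

/* Hypothetical layout if the unused super field were dropped. */
struct SlimNode {
  struct SlimNode* next;
};

/* Bytes of padding the C compiler inserts between super and next. */
size_t node_padding(void) {
  return offsetof(struct Node, next) - sizeof(struct Root);
}

/* Prints both layouts so they can be compared on the current target. */
void print_layouts(void) {
  assert(offsetof(struct Node, super) == 0);  /* super is the first member */
  printf("sizeof(struct Node)     = %zu\n", sizeof(struct Node));
  printf("offsetof(Node, next)    = %zu\n", offsetof(struct Node, next));
  printf("padding after super     = %zu\n", node_padding());
  printf("sizeof(struct SlimNode) = %zu\n", sizeof(struct SlimNode));
}
```

Calling `print_layouts()` from a `main` on a typical 64-bit target should report the 16-byte node and an 8-byte slim variant, so half of each allocation would be overhead in this example.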

Why do we have to pay the cost of having this super field? I think today we already avoid the runtime cost of this field, but we still pay the memory cost.

Note that I think we don't pay the runtime cost (at least with the LLVM backend; I'm not sure about the C backend, but I assume/hope it's true there as well) because the backend optimizes it away. The code as dumped by the Chapel compiler allocates the class and initializes the super class by storing the constant class id into the "cid" field. The backend is then smart enough to see that "cid" is never read and removes the store (yay DSE). But the backend cannot change the memory layout of the types, so we are left with uninitialized memory that is never used.

Can we instead write a compiler optimization that checks whether the "super" field / "cid" field (I am lazily assuming they are interchangeable here) is ever used and, if it is not, does not codegen that part of the class?

I believe this should be semantically valid, as AFAIK we make no guarantees to the user about the memory layout of classes.

I think this can be achieved by just checking if we ever call "getcid" or "testcid" on a given type, although this is perhaps a naive simplification.
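
As a rough illustration of that naive check, here is a sketch over a made-up, heavily simplified IR. `PrimCall` and `needs_cid` are hypothetical and do not correspond to the Chapel compiler's real data structures; a real pass would presumably also have to account for subclasses and for casts through a parent type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical IR record: a primitive name plus the class it is applied to. */
struct PrimCall {
  const char* prim;        /* e.g. "getcid" or "testcid" */
  const char* class_name;  /* class the primitive operates on */
};

/* Returns true if any call reads the class id of the named class. */
bool needs_cid(const char* class_name, const struct PrimCall* calls, size_t n) {
  assert(class_name != NULL && (calls != NULL || n == 0));
  for (size_t i = 0; i < n; i++) {
    bool reads_cid = strcmp(calls[i].prim, "getcid") == 0 ||
                     strcmp(calls[i].prim, "testcid") == 0;
    if (reads_cid && strcmp(calls[i].class_name, class_name) == 0)
      return true;
  }
  return false;
}
```

Under this sketch, a class for which `needs_cid` returns false would be a candidate for dropping the field. Writes to "cid" are deliberately ignored, since a store that is never read is exactly what the backend already eliminates.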

jabraham17, Oct 08 '24 00:10