adiar icon indicating copy to clipboard operation
adiar copied to clipboard

Apply Second Reduction Rule inside of Builder

Open SSoelvsten opened this issue 2 years ago • 2 comments

Right now, the bdd_builder and zdd_builder do not account for the user generating the same node twice.

Implement a Hash Table

  • [ ] Implement a levelized_hash_table<ElemType, Hash = std::hash> template that internally stores values in a tpie::array. It should support the following functions
    • [ ] insert(ElemType e, size_t hash_value) Insert e based on the precomputed hash hash_value for e. Note, if we find a node from a previous level, then we can safely overwrite it! At this point, we know we have reached the end of this bucket (for the current level).
    • [ ] insert(elem_t e) Compute Hash(e) and use it to place it in the table with the above function.
    • [ ] find(elem_t e, size_t hash_value) / find(elem_t e) Finds the index where e resides (-1 if it does not).
    • [ ] size() Number of entries, which should be reset to 1 when inserting nodes for a new level.
    • [ ] current_level() Returns the level of the latest inserted element. Returns -1 if nothing has been pushed.
    • [ ] capacity() Return the size of the number of elements that fit into the internal tpie::array.

Alternatively, this can be done by reusing the (leaky) computation cache from #444 . But, I would favour the above computation cache as it is levelized.

Hashing Nodes

  • [ ] Create a hash value of each node (Maybe drawing inspiration from the hash function in #434 ). This should be a separate function that can be parsed as a template parameter similar to std::hash.

Using the Hash Table

  • [ ] We can ask the user for a cache-size (which is probably somehow related to the max-width) of what they want to create. This sets the size of the hash-table.
    • [ ] Make builder_ptr also carry its hash value (and its negated hash value, if complement edges are supported) as part of its shared state. That is, the boolean in shared_ptr<bool> unreferenced should be generalised to a struct builder_ptr_shared that has the unreferenced boolean, the uid and its hash values. Alternatively, it might be prettier to just swap the order between the shared_ptr and the builder_ptr, i.e. the end user gets shared_ptr<builder_ptr> (renamed to shared_ptr<builder_node> which is aliased as builder_ptr)
      • [ ] It might also be of benefit to replace the shared_ptr<...> with the one in #175 .
    • [ ] Add a hash table inside of the builder template with shared_ptr<builder_ptr_shared> elements.
      • [ ] When the user adds a new node, compute the hash and look up, whether it was already created earlier. If the user generates the same node twice, then return a builder_ptr using the given shared state.
      • [ ] When doing a lookup in the hash-table, then if one visits a node, that the user has no references to anymore (check the reference count of the shared_ptr) then it can be deleted.

If the cache-size is set to 0, then it works just as it always did. I'm not sure of yet, whether this or the maximum possible should be the default value.

SSoelvsten avatar Nov 15 '22 10:11 SSoelvsten

In many ways, this makes the builder on-par with the streaming BDD algorithms of Shin-ichi Minato and Shinya Ishihara. So, it is probably not that icky an idea. Yes, we cannot provide a guarantee, that we truly can take care of the second reduction rule, but depending on the available memory, we can do as well as is possible.

Furthermore, since all of the builder_ptr values are stored inside of the builder_ptr_shared struct, we can in fact use the destructor to remove the entry from the builder.

SSoelvsten avatar Nov 16 '22 09:11 SSoelvsten

Question is: what should the default size be?

  1. 0, i.e. no cache (same behaviour as now)
  2. As large as possible?
  3. Include array doubling and grow as needed?

SSoelvsten avatar Nov 22 '22 20:11 SSoelvsten