AtomVM icon indicating copy to clipboard operation
AtomVM copied to clipboard

Support duplicate bag in insert and lookup

Open jgonet opened this issue 7 months ago • 1 comments

~Merge after #1614. New code after "ets hash table: rename status enum".~

These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

jgonet avatar Jun 17 '25 14:06 jgonet

@bettio I discovered a problem with ets:insert(T, list) – if we have a list of [a, b, a, c], the second a wins the write, dropping the data. I think I'll split this PR and rework the hashtable internals to have two levels of mapping. One for having just key, one for entry:

struct HNode
{
    struct HNode *next;
    term key; // owned by first entry
    struct Entry *entries;
};

struct Entry {
    struct Entry* next;
    term key; // key from tuple
    term item; // a tuple
    Heap heap;
};
image

This way I avoid problems with list allocation – current approach is to have key -> list of tuples mapping. This is annoying when you discover you need to add another item to list because you need to allocate new heap, copy terms, etc.

Ets insertion would create a new empty hash table, insert tuples there, and pass HNodes from it to original hashtable. Single item would create single HNode with single Entry.

If you have any comments on how to simplify this design, please share.

jgonet avatar Aug 19 '25 23:08 jgonet