Collections-C
Collections-C copied to clipboard
list_add does not copy data
When using list_add_all
(list1, list2) calling list_destroy_free
on both lists will cause a SIGSEGV. The solution, as I see it is that list_add_all
needs to do a deep copy of the data.
Looking at the list_add_all
, looks like it's calling list_add
, which in turn calls list_add_last
. That function creates a new Node
in the list but the data is not copied from input, just the pointers are equalized.
I suspect element size will need to be known somehow at copy time.
The data is not copied by design. You can think of list_add_all
as nothing more than list_add
that adds pointers in bulk. And since pointers are the data that the list directly holds, it makes no sense to do a deep copy because making deep copies would actually add something else to the list.
As for the potential double free
when calling list_destroy_free
, well, that would be an error on the side of the user and not the API. The containers do not take over the memory management of the data you put into them, nor can they do that. You still have to keep track of how you allocate and deallocate your data. list_destroy_free
is just a convenience function that you should only use when you know that it's safe to use. Unfortunately there is no way to guarantee that list_destroy_free
will never explode, even if the data was copied, as you suggested, because there are so many more ways to end up with dangling pointers inside the list. The only solution to this is to be extra careful when managing you data, or alternatively, to use a language that does automatic memory management. :-)
I do think this is an API consistency issue.
list_add_all
takes 2 list and after the call the user expects to have 2 lists, rather than 1 list and a list of references to the same data.
If list_add_all
is intended to bulk add data into a list, perhaps the parameters shouldn't be 2 lists, but rather a list and an array of data ( not nodes ) ? After the call to list_add_all
the user still has 2 lists and the List
API should be functioning correctly for both structures.
Thinking more about this, a better solution ( instead of making copies of the data ) would be to add reference count for the data pointers, so they are not deleted if the refcnt is more than 1. This will add an extra sizeof(size_t)
bytes per element for each list, but will not require major refactoring of the API.
I understand this is C, but shouldn't be the library responsibility to provide the necessary guardrails? Otherwise, all the users will implement something similar. Having this as part of the library will reduce the boilerplate necessary, which I suppose is the whole purpose of the library in the first place.