dynarray icon indicating copy to clipboard operation
dynarray copied to clipboard

Bounds checking

Open eignnx opened this issue 6 months ago • 6 comments

Comments in the code suggest that _dynarray_{get,set} do bounds checking. In fact they do not, I must have forgot to add that.

Bounds checking should be added.

eignnx avatar May 07 '25 00:05 eignnx

What should dynarray{get,set} do in the case of a out of bounds access? Should it print an error message, return an error code, do nothing, or something else?

BenjaminYankowitz avatar May 09 '25 00:05 BenjaminYankowitz

Hmm good question.

I think there's a couple approaches:

  1. Return a boolean (false when out of bounds) and (in the case of get) give the value output as an out parameter.
  2. Return a struct like struct { bool ok; void *value }and set the ok field false when out of bounds
  3. Return a void* which is null on out of bounds access (although for an in-bounds access with set I'm not sure what void* value to return)

Which do you think is best? I think the only wrong answer is to call exit or perror since that makes the error basically unrecoverable.

eignnx avatar May 09 '25 01:05 eignnx

Personally I prefer option 1, the other two would require an allocation for the data. The three ways I can see to do this are

  1. Global state (which might be problematic if someone wants to this use with multiple threads).
  2. Allocating memory every time (which would both be slow, and easy for the user to forget to free causing memory leaks)
  3. Adding another element to the header where we could place the data. (which would cause a increase in memory usage in the case that we have a lot of small arrays).

Additionally we for the first and third method I gave there is an issue if you call dynarray_get again beforeading out the value from the previous time.

Finally I think

int value;
if(dynarray_get(arr,index,value))
{
 // use value
} else {
 // deal with error
}

is nicer than

int *ptr = dynarray_get(arr,index);
if(ptr)
{
 int value = *ptr;
 // use value
} else {
 // deal with error
}

or

dynarray_ret_type ret_given = dynarray_get(arr,index);
if(ret_given.success)
{
 int value = *ret_given.ptr;
 // use value
} else {
 // deal with error
}

BenjaminYankowitz avatar May 09 '25 02:05 BenjaminYankowitz

I'm not sure what you mean by allocation being required, couldn't the implementation essentially return &arr[idx] (after computing idx based on stride and the provided index)?

I agree that if(ret_given.success) is the clunkiest of the three, maybe we should eliminate my 2nd suggested approach.

I don't mind your 2nd code example, I feel like it's fairly idiomatic C to write something along the lines of

int *ptr;

if ((ptr = dynarray_get(arr, index))) {
    // use *ptr
} else {
    // deal with error
}

although that's arguably less clean than

if (dynarray_get(arr, index, &value)) { /* ... */ }

However it may be trickier to remember argument order for a 3-parameter function than for a 2-parameter one.

Here's a compiler explorer example with both ideas, I was mainly checking to make sure they didn't generate compiler warnings.

What do you think, are there cons we haven't explored with the assignment-in-conditional approach? I'd be fine with either of these two options, I just want to be sure we've provided justification for all our choices.

And what was your idea with allocation? Would you like to sketch out what that might look like?

eignnx avatar May 09 '25 05:05 eignnx

I'm not sure why I did not think of that, for some reason the idea that the data has to be copied was stuck in my head. Allocation (at least in the way I was thinking of it is not going to give any benefits). I think you are right about 2 being idiomatic C. Also if you return the address to the value in memory, that would allow you to then write to that address afterwords.

BenjaminYankowitz avatar May 09 '25 16:05 BenjaminYankowitz

That's a great point about being able to write via the pointer. I wonder if instead there should be just one function called something like dynarray_access (maybe there's a better name) which could do bounds checking and return a pointer to the element to be used for reading or writing.

The user would have to make sure they don't hold on to references to elements after calling dynarray_push since that may realloc and invalidate the buffer. An advantage to the 3-parameter approach is that it immediately copies the element out which leaves no pointer to be misused.

eignnx avatar May 09 '25 17:05 eignnx