checkedc
checkedc copied to clipboard
Field access to structs with bounded and NT char arrays
I have code like this:
struct mystr {
_Array_ptr<char> buf : count(len);
unsigned int len;
};
int getlen(_Ptr<struct mystr> p_str) { return p_str->len; }
_Nt_array_ptr<char> getbuf(_Ptr<struct mystr> p_str) { return p_str->buf; }
The invariant in this code is that while the len field bounds the ultimate length of buf, we are sure there is a NUL terminator in there somewhere. That's why it's OK to return the buffer as a NT array pointer.
A couple of questions:
- This code doesn't actually work: the compiler complains when I try to return the array pointer as a NT array pointer. How can I convince it that the thing is actually NUL terminated?
- How do I associate the length returned by the first function with the buffer returned by the second? Even if the second returned a normal array and not a NT array, how can I make it return something with the right length?
I'm OK to refactor this API, but I'm wondering what the best approach is.
To answer 2, function return bounds are specificed in terms of your parameters. I think that you can do the following for getbuf(…):
_Nt_array_ptr<char> getbuf(_Ptr<struct mystr> p_str) : count(p_str->len) { return p_str->buf; }
I don't have an answer for 1. Is it the case that the buffer is always len, so even accessing after the NUL is fine, as long as you're within len? In that case, attempt to make the member of the struct and the return from getbuf a nt_array_ptr. I realise this is perhaps not semantically correct. I have 20% confidence that this is an acceptable answer, but it's a suggestion.
Yes, the correct answer for question 1 is to make it an nt_array_ptr with bounds of count(len-1). This forces you to put a NUL at the end of the buffer, no matter what. It's OK if there is another NUL in the buffer before the end. By putting NUL at the very end, we're guaranteeing that even if an earlier NUL is overwritten, code won't sail off the end of the buffer.
Another way to think about this: if as much space in the buffer is used as possible and the buffer really does always have to be null-terminated, then there must be a NUL at exactly the end.
There is one tricky part about this approach. Any code using the buffer had better not try to write to buf[len - 1]. That would be an out-of-bounds write.
To answer Sam's question about how a dynamically-allocated null-terminated array is created and initialized, the answer is that you can't currently write it in checked code. We need to have a way to safely convert an array_ptr to an nt_array_ptr. That would require us knowing that the array_ptr is the only pointer to the array at the time of conversion and that the array_ptr-typed value is no longer usable after the conversion. If this sounds a lot like C++'s unique_ptr, that's because it is.
I think we're going to need a framework for discussing aliasing and anti-aliasing relationships for C code, which is why I've held off on suggesting adding unique_ptr as a specific concept. I'd rather this be something that can be expressed in that framework, rather than a type for a special case.
So far I have skirted the initialization issue because all strings in my porting project are allocated in unchecked files. I'm nearly to the point where I can attempt to make these files checked. I guess I will just have to put the initialization part in unchecked code.
(Didn't mean to close it.) One way to allocate and initialize a _Nt_array_ptr<...> is to use calloc() rather than malloc(). Perhaps we can/should do this in the short term? I.e., change the compiler to allow such initialization in checked code?
That is a good point, but we don't have a way yet of signalling both the new type and the bounds in an interoperation type. This is a bug.
At the moment, char* foo : count(3) can only mean that the checked type is array_ptr<char> foo : count(3). It is on our list to have a way of denoting that this could instead become nt_array_ptr<char> foo : count(3) but a) I can't find the issue, and b) it's not designed or implemented yet.