nmatrix
nmatrix copied to clipboard
Change most void pointers to opaque struct pointers
Right now when we write a generic function for some kind of dtype or itype data, the inputs and outputs must be stored in void*
.
We should at some point switch to opaque structs. This gives us some degree of type safety, may help prevent future bugs.
I would like to work on this one next...
Could you assign me to it please?
Could you also enlist a few pointers to get me started on it? That would be very helpful...
Any prerequisites required for this one? I'll start going through any resources you might point me to.
Thanks!
So this is a great issue, and we may both want to think a little bit more about the implementation details. Here's another SO reference: http://stackoverflow.com/questions/3965279/opaque-c-structs-how-should-they-be-declared
In particular, I think the checked answer offers really good guidance.
Where should we apply this first? I'd say, let's start with the void*
pointers that concern matrix data in nmatrix.h
. For example, you have void* elements
in DENSE_STORAGE
, void* a
and void* ija
in YALE_STORAGE
, void* val
in NODE
(part of the LIST
type), and void* default_val
in LIST_STORAGE
.
default_val
and val
both point to single values (type given by the dtype), so these should be one kind of opaque struct. Then, ija
points to C-style arrays of values (type given by itype), so these should be a second kind. And then the third kind of opaque struct would be for elements
and a
, which are dtype-value arrays.
The goal here, remember, is to prevent errors where we accidentally pass an a
array to a function that needs an ija
(this happens all the damn time during development). We also don't want to accidentally pass multiple values to a default_val
argument, which only expects (and frees) a single value.
It's worth noting that, sometimes, an array OR a single value may be passed (as in the case of dense matrix creation, where either a default value or a number of default values may be provided). In that case, the argument should be given an opaque structure pointer as if it were a C array.
Before you implement it, I'd like to see a little discussion of how you plan to do it, if possible -- and maybe a little research on motivations for using opaque structure pointers. This is part of the iterative design work we do, and gives others an opportunity to provide input. =) It can take place here or in the Google Group.
Thanks so much for agreeing to work on this.
I will immediately start my research on this issue, from the pointers that you have provided.
Currently I have a vague idea on opaque struct pointers, I will go through whatever literature I can go through and keep you in the loop about any developments.
Once I start going through the code base (which will mostly be next week), we can start discussing the design of the new code.
Will keep you in the loop about anything that I might come across.
Thank you for all the help!
What I've found out about opaque types so far is that they are nothing but struct
's which have been declared but not necessarily defined. You can change the underlying implementation of the struct
without having to modify the programs that might be using it in its previous avatar.
As far as NMatrix is concerned, using opaque data types will have the obvious type-checking advantage, and if in the future we want to change the underlying implementation of the concerned data, that will also be possible without making significant changes to any code that has already been written.
Since the values that need replacement with opaque types are both, a continuous or a single value, I think it makes sense to declare the opaque structs without the trailing asterix, and keeping it that way for all opaque structs for uniformity.
eg - typedef struct customDataType first
Wherever a pointer is needed, just use it as you would a normal int
or float
pointer.
See in particular this comment about pointers: http://stackoverflow.com/questions/3965279/opaque-c-structs-how-should-they-be-declared#comment4239520_3965308
I think the declarations for the types can go into types.h
?
Another question: What should the new structs be named?
What I think is, for scenarios where the data type indicates the actual data inside a matrix, for example void* elements
in case of the DENSE
storage type, we can have the same name, so as to avoid confusion.
In case of arrays that store data ABOUT a matrix (like IJA in Yale), I think we can have a separate data type for each of these, something which will talk about what is actually there inside the array, so that whatever one might wish to name the variable, the data type will remain the same, thus avoiding confusion.
Try in types.h
first, but I expect that they probably need to be very carefully put in nmatrix.h.
Caution: nmatrix.h
is written in both C++ and C. Every declaration that exists in C++ needs to exist in parallel in C, unless the declaration is the same -- in which case it goes in a unified section. Please look over nmatrix.h
carefully and make sure you understand if you do end up changing it -- and do feel free to ask questions. nmatrix.h
is what gets exposed to the rest of the world for using NMatrix API, so it's critical that this file not change too much.
As for the names of the new structures, I think something that indicates that they're arrays of data versus arrays of indices (which is kind of what you're getting at). How about DARRAY
and DVALUE
for multiple and single values respectively; and IARRAY
and IVALUE
for multiple and single indices respectively? I like the idea of having these be capitalized because it's consistent with Ruby using VALUE
for Ruby objects and ID
for interns/symbols. I'm flexible on all of this! If you have a better idea, let's talk about it.
What do you mean exactly "separate data type for each of these?" Each of what? Can you give me examples, other than the ija
array?