nmatrix icon indicating copy to clipboard operation
nmatrix copied to clipboard

Change most void pointers to opaque struct pointers

Open translunar opened this issue 12 years ago • 8 comments

Right now when we write a generic function for some kind of dtype or itype data, the inputs and outputs must be stored in void*.

We should at some point switch to opaque structs. This gives us some degree of type safety, may help prevent future bugs.

translunar avatar Aug 20 '12 19:08 translunar

I would like to work on this one next...

Could you assign me to it please?

Could you also enlist a few pointers to get me started on it? That would be very helpful...

Any prerequisites required for this one? I'll start going through any resources you might point me to.

Thanks!

v0dro avatar Jun 07 '13 10:06 v0dro

So this is a great issue, and we may both want to think a little bit more about the implementation details. Here's another SO reference: http://stackoverflow.com/questions/3965279/opaque-c-structs-how-should-they-be-declared

In particular, I think the checked answer offers really good guidance.

Where should we apply this first? I'd say, let's start with the void* pointers that concern matrix data in nmatrix.h. For example, you have void* elements in DENSE_STORAGE, void* a and void* ija in YALE_STORAGE, void* val in NODE (part of the LIST type), and void* default_val in LIST_STORAGE.

default_val and val both point to single values (type given by the dtype), so these should be one kind of opaque struct. Then, ija points to C-style arrays of values (type given by itype), so these should be a second kind. And then the third kind of opaque struct would be for elements and a, which are dtype-value arrays.

The goal here, remember, is to prevent errors where we accidentally pass an a array to a function that needs an ija (this happens all the damn time during development). We also don't want to accidentally pass multiple values to a default_val argument, which only expects (and frees) a single value.

It's worth noting that, sometimes, an array OR a single value may be passed (as in the case of dense matrix creation, where either a default value or a number of default values may be provided). In that case, the argument should be given an opaque structure pointer as if it were a C array.

Before you implement it, I'd like to see a little discussion of how you plan to do it, if possible -- and maybe a little research on motivations for using opaque structure pointers. This is part of the iterative design work we do, and gives others an opportunity to provide input. =) It can take place here or in the Google Group.

Thanks so much for agreeing to work on this.

translunar avatar Jun 07 '13 17:06 translunar

I will immediately start my research on this issue, from the pointers that you have provided.

Currently I have a vague idea on opaque struct pointers, I will go through whatever literature I can go through and keep you in the loop about any developments.

Once I start going through the code base (which will mostly be next week), we can start discussing the design of the new code.

Will keep you in the loop about anything that I might come across.

Thank you for all the help!

v0dro avatar Jun 07 '13 17:06 v0dro

What I've found out about opaque types so far is that they are nothing but struct's which have been declared but not necessarily defined. You can change the underlying implementation of the struct without having to modify the programs that might be using it in its previous avatar.

As far as NMatrix is concerned, using opaque data types will have the obvious type-checking advantage, and if in the future we want to change the underlying implementation of the concerned data, that will also be possible without making significant changes to any code that has already been written.

Since the values that need replacement with opaque types are both, a continuous or a single value, I think it makes sense to declare the opaque structs without the trailing asterix, and keeping it that way for all opaque structs for uniformity. eg - typedef struct customDataType first

Wherever a pointer is needed, just use it as you would a normal int or float pointer.

v0dro avatar Jun 20 '13 07:06 v0dro

See in particular this comment about pointers: http://stackoverflow.com/questions/3965279/opaque-c-structs-how-should-they-be-declared#comment4239520_3965308

translunar avatar Jun 20 '13 18:06 translunar

I think the declarations for the types can go into types.h?

v0dro avatar Jun 25 '13 17:06 v0dro

Another question: What should the new structs be named?

What I think is, for scenarios where the data type indicates the actual data inside a matrix, for example void* elements in case of the DENSE storage type, we can have the same name, so as to avoid confusion.

In case of arrays that store data ABOUT a matrix (like IJA in Yale), I think we can have a separate data type for each of these, something which will talk about what is actually there inside the array, so that whatever one might wish to name the variable, the data type will remain the same, thus avoiding confusion.

v0dro avatar Jun 25 '13 17:06 v0dro

Try in types.h first, but I expect that they probably need to be very carefully put in nmatrix.h.

Caution: nmatrix.h is written in both C++ and C. Every declaration that exists in C++ needs to exist in parallel in C, unless the declaration is the same -- in which case it goes in a unified section. Please look over nmatrix.h carefully and make sure you understand if you do end up changing it -- and do feel free to ask questions. nmatrix.h is what gets exposed to the rest of the world for using NMatrix API, so it's critical that this file not change too much.

As for the names of the new structures, I think something that indicates that they're arrays of data versus arrays of indices (which is kind of what you're getting at). How about DARRAY and DVALUE for multiple and single values respectively; and IARRAY and IVALUE for multiple and single indices respectively? I like the idea of having these be capitalized because it's consistent with Ruby using VALUE for Ruby objects and ID for interns/symbols. I'm flexible on all of this! If you have a better idea, let's talk about it.

What do you mean exactly "separate data type for each of these?" Each of what? Can you give me examples, other than the ija array?

translunar avatar Jun 25 '13 17:06 translunar