swig icon indicating copy to clipboard operation
swig copied to clipboard

Move subsref and subsasgn from SwigRef to derived classes

Open jaeandersson opened this issue 9 years ago • 20 comments

The functions subsref and subsasgn can moved from the SwigRef base class to the classes inheriting directly from SwigRef. That way, they can be created only when needed, meaning better encapsulation.

It will also render SwigRef more minimalistic and hence more maintainable:

classdef SwigRef < handle
  properties 
    swigPtr
  end
  methods
    function disp(self)
      disp(sprintf('<Swig object, ptr=%d>',self.swigPtr))
    end
  end
end

A simpler SwigRef will avoid conflicts when a user has several projects, each project potentially build with a different version of Swig.

jaeandersson avatar Sep 09 '15 06:09 jaeandersson

Based on naive profiling of some code, I also found that subsref in SwigRef handling was quite expensive (10% of the script time) even for classes that did not implement it, so this change could also improve performance.

traversaro avatar Sep 09 '15 07:09 traversaro

OK, yet another reason to move it to the derived classes.

jaeandersson avatar Sep 09 '15 07:09 jaeandersson

I think it's a good idea. but there is then no reason not to do it for disp as well, leaving an empty SwigRef? question then starts to become if we actually need it (I've clearly lost track of how it all works...)

KrisThielemans avatar Sep 12 '15 20:09 KrisThielemans

I think it's a good idea. but there is then no reason not to do it for disp as well, leaving an empty SwigRef? question then starts to become if we actually need it (I've clearly lost track of how it all works...)

We definitely need the class. SwigRef's original purpose is to resolve the "diamond problem". Multiple inheritance won't work if the swigPtr property is defined in more than one place.

Whether to keep the disp, I don't see why not. It prints something useful and you can overload it to do whatever you want. Would you prefer that it throws an error?

jaeandersson avatar Sep 12 '15 20:09 jaeandersson

Removing disp is not what I meant. And if we need SwigRef, disp should be in SwigRef. From an object oriented point of view, you want to have common functions as high up as you can.

I agree that the current way of adding subsref and subasgn to SwigRef isn't ideal. For instance, it leads to cryptic errors about missing 'paren' when using an object in MATLAB with object(1) when there was no operator() defined. Moving the subsref up would be a solution for that. One thing I like though about having them in a .m file, is that it's easier for the user to see what's going on. I suppose you could have a SwigRefWithSubsRef.m...

I have some questions on subsref but will create a new issue for that.

KrisThielemans avatar Sep 13 '15 08:09 KrisThielemans

The main issue for removing the subsref and subasgn is to do it in a way so that they are automatically added when and where they are needed. Are there any working unit tests that rely on them? If not, I'm totally fine with simply removing them from SwigRef.m and enabling them for a proxy class only with a user-set feature.

jaeandersson avatar Sep 13 '15 09:09 jaeandersson

I think that the fundamental problem is that the the subsref/subsagn mechanism is actually a higher level of abstraction compared to C++'s operator() and operator[]. You can implement a feature that works like subsref/subsagn in C++ [1] but you can't really go the other way around and implement the functionality of operator() and operator[] using subsref and subsagn.

[1] See e.g. the SubMatrix and NonZeros classes in CasADi: MyMatrixType::operator(const MyIndexType1& ind1, const MyIndexType2& ind2) returns a SubMatrix<MyMatrixType, MyIndexType1, MyIndexType2> instance which keeps a reference to the MyMatrixType instance and overloads operator=.

jaeandersson avatar Sep 13 '15 09:09 jaeandersson

yes, I agree that this auto-detection is really quite hard. I vote for letting the user add them on as-needed basis. I unfortunately don't know how to use %feature though and so how easy this would be to implement.

I can't quite remember if there are test-cases that rely on this, aside from of course the std containers. I doubt that there are others, and if they do exist, it's because they've been copied from Octave like that, so we can modify them.

I was creating a new issue with some reasons why having this default is not a good thing, as I thought I had to convince you, so I'm pasting this here now below. you can probably just skip to the end.

Current situation: matlabopers.swg renames operator() to paren and operator[] to brace, and SwigRef.subsref then calls paren when it's called with () and a single argument.

This is convenient but to my feeling imposes too many things on the user. We've discussed this a bit before on http://permalink.gmane.org/gmane.comp.programming.swig.devel/23295. I'll list a few arguments below

  • In my own STIR code, I don't want my operator[] to be called at all from the target language as it doesn't do range checking and hence can lead to MATLAB segfaults. Instead, I call at(). ok, I guess can solve that by adding an %ignore operator[] and %rename(paren) *::at to my interface file. (I haven't checked yet) but having to undo a default is something that most users will not know about.
  • This magic falls down when you actually do want to be able to use a 2-argument version of (). Suddenly, the user has to overload subsref (in a non-obvious way, certainly when subsref is going to be moved to the wrapper). (I'm not sure why the current subsref does this forwarding only when it has only 1 argument.)
  • The choice of using () for operator() and {} for operator[] could be unnatural for the module (indeed, the std containers already use () for the equivalent of operator[], as they should of course). so, the user should have this under control.
  • we could have name clashes with user's paren() etc functions
  • swig-Python doesn't rename anything to getitem by default. it's left to the user to do this (even though it is somewhat ugly)

Having said all this, I don't have a good solution for this overloading due to the way that MATLAB implements () and {}. At least in Python, you know you should just extend your class with a getitem. In MATLAB, letting the swig-user overload subsref is really quite a difficult/dangerous thing to do (as it shouldn't interfere with calling the wrapper, not expose private stuff (which currently it seems to), etc).

Maybe this could be done with %feature? Is there a way to let the user specify "this is the name of the function that I want subsref to call when using () in MATLAB"? and if that would then happen to be usable with a getitem,setitem function that was already implemented for Python, well, that would be nice. (@jaeandersson, you mentioned this recently somewhere, but I cannot find it back)

KrisThielemans avatar Sep 13 '15 09:09 KrisThielemans

Actually, I forgot something very important. The subsref/subsasgn functions are necessary when you have member variables. Because they redirect a call like:

a = MyWrappedClass(...)
a.foo = 4;

to

a = MyWrappedClass(...)
a.foo(4);

If you don't overload subsasgn you can't set struct/class members like this.

jaeandersson avatar Sep 13 '15 09:09 jaeandersson

is this solvable by adding set/get properties? http://uk.mathworks.com/help/matlab/matlab_oop/property-access-methods.html

KrisThielemans avatar Sep 13 '15 10:09 KrisThielemans

Didn't know about those... Since there is no corresponding property in the Matlab proxy, I'm not sure if it will work. Let me test...

jaeandersson avatar Sep 13 '15 10:09 jaeandersson

Nope, not possible:

>> a = testgetset()
Error using testgetset
Error: File: /Users/jaeandersson/tmp/testgetset.m Line: 1 Column: 10
Cannot specify a set function for property 'myprop' in class 'testgetset', because that property is not defined by that class.

The problem is that the property is stored in the wrapped class, not in the proxy class.

jaeandersson avatar Sep 13 '15 10:09 jaeandersson

sigh. you could create a dummy private property for every member you want to access, but that seems like a horrible solution (with memory penalty).

for handle classes (which is what we have), there's http://uk.mathworks.com/help/matlab/matlab_oop/implementing-a-set-get-interface-for-properties.html, but that interface is far too ugly of course.

seems like a question for the MATLAB newsgroup...

KrisThielemans avatar Sep 13 '15 10:09 KrisThielemans

Yes, I don't like the idea of dummy properties. In addition to the memory overhead issues, it would also cause problems if you want to do multiple inheritance of two classes that share some data member.

Anyway, I think the solution is pretty obvious: We have to continue to overload subsref/subsasgn. We just need to think of how to do it in a maintainable way.

jaeandersson avatar Sep 13 '15 10:09 jaeandersson

Actually, the dummy properties can be left untouched at all times. Have a look at this:

classdef testgetset < handle
  properties
    myprop
  end
  properties (Access=private, Hidden=true)
    myprop_internal
  end

  methods
    function val = get.myprop(obj)
      disp(sprintf('getting myprop_internal, dummy isempty(obj.myprop) = %d', isempty(obj.myprop)))
      val = obj.myprop_internal;
    end
    function obj = set.myprop(obj, val)
      disp(sprintf('setting myprop_internal to %d, dummy isempty(obj.myprop) = %d', val, isempty(obj.myprop)))
      obj.myprop_internal = val;
    end
  end
end

Here is usage:

>> a = testgetset()

a =

getting myprop_internal, dummy isempty(obj.myprop) = 1
  testgetset with properties:

    myprop: []

>> a.myprop = 10;
getting myprop_internal, dummy isempty(obj.myprop) = 1
setting myprop_internal to 10, dummy isempty(obj.myprop) = 1
>> v = a.myprop
getting myprop_internal, dummy isempty(obj.myprop) = 1

v =

    10

Note that the myprop property actually stays empty all the time...

jaeandersson avatar Sep 13 '15 10:09 jaeandersson

So a possibility would be to refactor the handling of member variables and member constants. We could also make the swigPtr hidden and remove disp since the default printing routine is actually pretty nice too.

jaeandersson avatar Sep 13 '15 11:09 jaeandersson

interesting. no idea how much overhead this would create then (presumably somewhere inside, there will still be a pointer to an empty object). I feel that before doing such a major overhaul, we need to be sure it's the correct path to take. shall we write something on the MATLAB newsgroup?

KrisThielemans avatar Sep 13 '15 11:09 KrisThielemans

I have no experience with the MATLAB newsgroup. I have only used MATLAB Answers. Anyway, not sure what we should ask them. Show our proposed solution and ask if there is another way of doing the same thing? Regardless, should probably write there and announce our project somehow, might be people interested in contributing.

I would not worry about the overhead at this point. I cannot imagine that this will create significant overhead unless you are wrapping structs with a huge number of members. Here is some discussion on the topic: http://blogs.mathworks.com/loren/2012/03/26/considering-performance-in-object-oriented-matlab-code/

In general, if we want to improve performance, we can just implement subsref and subsasgn in MEX instead.

jaeandersson avatar Sep 13 '15 12:09 jaeandersson

So, just to clarify, the the property approach would not work if you try to wrap something like this:

struct Foo {
  int prop;
};

struct Bar {
  int prop;
};

struct Baz : public Foo, public Bar {
};

Trying to do something like that would be very bad coding anyway IMO...

What you can do is to have a property in a common base class:

struct PropBase {
  int prop;
};

struct Foo : public PropBase {
};

struct Bar : public PropBase {
};

struct Baz : public Foo, public Bar {
};

The reason why this work is the the rule in MATLAB that says that:

Property Conflicts If two or more superclasses define a property with the same name, then at least one of the following must be true:

  • All, or all but one of the properties must have their SetAccess and GetAccess attributes set to private
  • The properties have the same definition in all superclasses (for example, when all superclasses inherited the property from a common base class)

cf. http://es.mathworks.com/help/matlab/matlab_oop/subclassing-multiple-classes.html

jaeandersson avatar Sep 13 '15 17:09 jaeandersson

Also does not appear to work for Octave: Does not work for ":" arguments, e.g. v(:, i)

jaeandersson avatar Jul 12 '16 15:07 jaeandersson