ParforProgMon icon indicating copy to clipboard operation
ParforProgMon copied to clipboard

R2020b: dot indexing is not supported for variable of this type

Open LecrisUT opened this issue 3 years ago • 9 comments

Problem Description

It seems matlab cannot access ppm.increment() within parfor.

Error using testParforProgressbar (line 21)
Dot indexing is not supported for variables of this type.

Steps to Reproduce the Problem

Run the testParforProgressbar.m example.

Specifications

  • MATLAB Release: R2020b
  • Operating System:
  • ParforProgressbar version (or commit link/hash):

Additional Information

Probably matlab has changed how it brodcasts variables in the last updates. Does anyone know how to get it back to work?

More info: The root cause seems to be that the object is not created within the parfor loop. I've tried using step = @() ppm.increment to get rid of the dot indexing problem, however using that approach, there is no increment in the progressbar.

LecrisUT avatar Jan 14 '21 12:01 LecrisUT

Thank you for your detailed bug report. I have no idea where the problem is and also do not have access to R2020b. If you find any solution to this problem, please report it too. Thanks!

fsaxen avatar Jan 14 '21 14:01 fsaxen

Could you show me some hints on how to debug such a problem? E.g. how to run it in serial mode?

Matlab R2021a is currently in pre-release. Are you able to access those versions?

LecrisUT avatar Jan 14 '21 20:01 LecrisUT

Unfortunately, I do not have access to any other version. However, I would start debugging by commenting in the body of the debug function at the end of ParforProgressbar.m This will create a txt file with some debug info. You can then place debug calls yourself if you find it useful.

fsaxen avatar Jan 15 '21 16:01 fsaxen

It seems that Matlab just passes the variables as empty double objects even if in the code dot indexing is used, because Matlab is not smart enough to determine the class before hand (At these points you can really appreciate the type constraint of lower level languages). I am not aware of any methods of telling MATLAB what class a variable is except for in functions (although that can introduce unnecessary overhead which should be avoided). Here's a quick function that can solve the dot indexing problem.

function StepPPM(ppm)
  arguments
    ppm  ParforProgressbar
  end
  ppm.increment;
end

Some things I've tested:

  • Including the type constraint in the increment method is not sufficient for matlab to deduce the class (presumably other classes can have a method with the same name).
  • Surprisingly using increment(ppm) instead of ppm.increment with no additional argument constraints the correct type is passed. So if it is instructed to not use the dot indexing method calls, half of the problems can be resolved

I suspect this is just an error on behalf of Matlab which could be fixed in later versions. I do not have any luck reporting issues to Matlab and getting responses, so if you can submit a ticket that would help.

However there is still a functional issue. I've narrowed down the issue to the constructor: it calls debug('Start worker'), but not debug('Send login cmd'). https://github.com/fsaxen/ParforProgMon/blob/f99c6f0acda1458239c24b63e6096429d8b5fcc7/ParforProgressbar.m#L111-L132 I'll do some more debugging and see if I can find the root cause and solutions.

LecrisUT avatar Jan 18 '21 08:01 LecrisUT

~~Found the issue it was the port being blocked on the default gateway. Opening it and fixing the port solved the issue.~~ Edit: I was mistakenly still using the local profile. I'll do some more investigation later. The main issue is still on https://github.com/fsaxen/ParforProgMon/blob/f99c6f0acda1458239c24b63e6096429d8b5fcc7/ParforProgressbar.m#L124-L128

Looking at the constructor, we can do some tinkering to improve the variables available, e.g. adding 'Port' as a parameter option. For that I've got a question: why is the constructor in that form? What happens if we make a constructor of the form:

o = ParforProgressbar( numIterations, args, varargin )
arguments
   numIterations
   args.Port
   ...
end

What workaround can be done so that the worker constructors can be distinguished?

LecrisUT avatar Jan 18 '21 09:01 LecrisUT

I like this new arguments block. Unfortunately, it's not available in my 2018 version. The constructor is in that form because it is called once from the user and once from each worker. To distinguish them, I just passed all the arguments as one cell. I.e., if there is only one argument and it is a cell, the constructor call must have been come from a worker. This is not a very elegant solution but it should work.

I don't know a lot about udp to be honest. It is possible, that the server port needs to be specified as well. Can you try changing the line 188 to the following code:

% Let's find an available port
for port = pct.portrange(1) : pct.portrange(2)
      try
         o.connection = udp(o.ServerName, port, 'DatagramReceivedFcn', {@receiver, o}, 'DatagramTerminateMode', 'on', 'EnablePortSharing', 'on');
         fopen(o.connection);
         break;
      catch
      end
end

fsaxen avatar Jan 18 '21 11:01 fsaxen

I've found that udp is deprecated. I am trying to use the new standard with udpport. I hope the new interface will lead to less problems. There are many changes needed for that. I see that udpport is only available from R2020b, although the tcp protocol functions seem to be available in both versions, but the interface might be drastically different (in R2020b all the connections are via tcpclient similar to how udpport are in the new version). One main change is that we cannot detect the incoming ip/port when we read the data, so the whole login stage has to be refactored to send the worker's IP/port. I think the easiest approach would be to open 2 ports: 1 for communication with increment() and another for registering, de-registering, etc.

Before I go refactoring it, are you opposed to switching the protocol to tcp? And are there any other changes you think should be added to it, since a lot of the communication will have to be restructured.

LecrisUT avatar Jan 18 '21 13:01 LecrisUT

TCP will create unfortunate overhead, because you cannot connect x workers to 1 server with one sever connection, you will need x server connections and syncing will be more complicated. With udp the server is completely separate and synchronization is much easier. Also, since udp is deprecated doesn't mean that it shouldn't work anymore. It just might not be supported in future versions. But until then it should work.

fsaxen avatar Jan 18 '21 16:01 fsaxen

udp will be removed from R2021a. So yeah should switch to udpport format.

Well, we could do both: tcp for registration and udp for update. The reliability on the registration stage could be useful in some edge cases.

Edit: one thing to consider is that all of the overhead will not be handled by Matlab, but the OS and nic. Connecting multiple tcp clients on the same server is possible as long as ip/port differs, so it would only cost more ports on the node side, which realistically is not an issue. This would also resolve race issues on udp connection over non-local parfor (often seen in debug log).

LecrisUT avatar Jan 18 '21 20:01 LecrisUT