
Problem running a process when a function pointer or lambda is passed in proc_params on Windows

Open bamsumit opened this issue 3 years ago • 7 comments

Objective of issue:

Lava version:

  • [ ] 0.3.0 (feature release)
  • [ ] 0.2.1 (bug fixes)
  • [x] 0.2.0 (current version)
  • [ ] 0.1.2

I'm submitting a ...

  • [x] bug report
  • [ ] feature request
  • [ ] documentation request

Current behavior:

  • This is a problem observed on Windows only. We get a pickling error at run time.

Expected behavior:

  • It should run without error.

Steps to reproduce:

  • Swap the DummyDataloader in tests/lava/proc/io/test_dataloader.py for the following code:

Related code:

from typing import Tuple

import numpy as np


class DummyDataset:
    def __init__(self, shape, transform=lambda x: x) -> None:
        self.shape = shape
        self.transform = transform

    def __len__(self) -> int:
        return 10

    def __getitem__(self, id: int) -> Tuple[np.ndarray, int]:
        data = np.arange(np.prod(self.shape)).reshape(self.shape) + id
        data = data % np.prod(self.shape)
        label = id
        return self.transform(data), label

Other information:

This errors out with a "can't pickle lambda function" error at the point where run tries to spawn the multiprocessing job on Windows.

bamsumit avatar Feb 26 '22 03:02 bamsumit

This is an issue because of how multiprocessing works on Windows: spawned worker processes receive their arguments via pickling. Since Python does not support pickling of function pointers or lambdas, we see the error above.
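The failure can be reproduced without Lava at all, since it comes from pickle itself. A minimal sketch (the `named_transform` function is illustrative, not part of Lava):

```python
import pickle


def named_transform(x):
    # Module-level functions pickle by qualified name, so they are safe
    # to pass to spawn-based multiprocessing workers.
    return x


# A lambda cannot be pickled; this is exactly what breaks when Windows'
# spawn-based multiprocessing tries to serialize proc_params.
try:
    pickle.dumps(lambda x: x)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    lambda_picklable = False

# A module-level named function round-trips through pickle fine.
named_picklable = callable(pickle.loads(pickle.dumps(named_transform)))

print(lambda_picklable, named_picklable)  # False True
```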

bamsumit avatar Feb 26 '22 03:02 bamsumit

The workaround is DO NOT USE LAMBDAS IN proc_params.
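As a sketch of the workaround, a module-level named function (optionally wrapped in `functools.partial` to bind arguments) survives pickling where a lambda does not. The `scale` function and the factor are illustrative stand-ins for whatever transform would go into proc_params:

```python
import pickle
from functools import partial


def scale(x, factor):
    # Named, module-level: picklable by qualified name.
    return x * factor


# Instead of e.g. proc_params={'transform': lambda x: x * 2}, bind the
# argument with functools.partial; the partial object is picklable as
# long as the wrapped function is.
transform = partial(scale, factor=2)

restored = pickle.loads(pickle.dumps(transform))
print(restored(10))  # 20
```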

bamsumit avatar Feb 26 '22 03:02 bamsumit

Is this just limited to ProcModels? We use lambda functions all over the place in PyPorts to 'reduce' over multiple CSP Ports. But the use of lambdas does not seem necessary because all we are doing right now is a reduce_sum which could also be achieved via: https://numpy.org/doc/stable/reference/generated/numpy.ufunc.reduce.html

This might also be more performant than our current 'lambda' approach.
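To illustrate the suggestion above, a lambda-based reduce over port buffers and its ufunc equivalent side by side (the port data here is a made-up stand-in for what arrives on multiple CSP ports):

```python
from functools import reduce

import numpy as np

# Illustrative stand-ins for data arriving on multiple CSP ports.
port_data = [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]

# Current lambda-based approach: not picklable, and reduces in Python.
lambda_sum = reduce(lambda a, b: a + b, port_data)

# Equivalent with a NumPy ufunc: no lambda, and the reduction runs in C.
ufunc_sum = np.add.reduce(port_data)

print(lambda_sum.tolist(), ufunc_sum.tolist())  # [12, 15, 18] [12, 15, 18]
```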

awintel avatar Feb 26 '22 22:02 awintel

I think so. It seems to be a problem when the process model gets spawned in a Windows multiprocessing process, which is the case for ProcModels.

bamsumit avatar Feb 28 '22 07:02 bamsumit

Any idea why it did not cause issues when we use lambda functions in PyPorts?

awintel avatar Feb 28 '22 22:02 awintel

This seems to be a specific issue when proc_params has a lambda / function pointer.

bamsumit avatar Mar 03 '22 15:03 bamsumit

Yes, I understood this now after our sync. Let's just document it for now so users know. If you think this should not be the case, then feel free to file an issue. Thanks.

awintel avatar Mar 03 '22 16:03 awintel