remotePARTS
Improve memory usage of parallel fitGLS_partition
Problem
The parallel partitioned GLS is driven by the function MC_GLSpart(). This function uses the foreach(i = 1:npart, ...) %dopar% {...} syntax. With this formulation, the entire dataset is imported on each instance (thread), so memory usage snowballs quickly (roughly ncores $\times$ the size of the data object).
Solution
foreach() accepts an iterator that allows data to be constructed on the fly. In short, this could allow only the data from the partition of interest to be imported for a given instance. The upshot is that the total memory usage shouldn't be much greater than the total size of the original object. So, we should swap i = 1:npart with an iterator that provides partitions.
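A rough sketch of what that swap could look like, using the custom-iterator pattern from the iterators package. Everything here is illustrative rather than remotePARTS code: ipartition() is a hypothetical helper, parts is an assumed index matrix with one column per partition, and lm() stands in for the per-partition GLS fit. registerDoSEQ() is only a sequential placeholder so the snippet runs anywhere; a real run would register a parallel backend instead.

```r
library(foreach)
library(iterators)

# Hypothetical helper (not part of remotePARTS): an iterator that yields the
# rows of `data` belonging to one partition at a time.
# `parts` is assumed to be an index matrix with one column per partition.
ipartition <- function(data, parts) {
  i <- 0
  nextEl <- function() {
    # foreach stops iterating when it sees this error message
    if (i >= ncol(parts)) stop("StopIteration", call. = FALSE)
    i <<- i + 1
    data[parts[, i], , drop = FALSE]
  }
  structure(list(nextElem = nextEl), class = c("abstractiter", "iter"))
}

# Toy data: 100 rows split at random into 4 partitions of 25 rows each
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- 2 * dat$x + rnorm(100)
parts <- matrix(sample(nrow(dat)), ncol = 4)

# Sequential placeholder backend; swap in a parallel backend for real use
registerDoSEQ()

# The loop body only ever sees `part`, never the full `dat` object.
# lm() is a stand-in for the per-partition GLS fit.
coefs <- foreach(part = ipartition(dat, parts), .combine = rbind) %dopar% {
  coef(lm(y ~ x, data = part))
}
```

One caveat with this approach: the loop body must not refer to the full data object by name, since foreach exports variables referenced in the body to the workers, which would bring back the original memory cost.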
The recommended solution to #13 may also be useful here.