MIES
AB: Parallelize loading of data from PXP/NWB files
The AB could run much faster if data loading were parallelized, e.g. for Refresh, Load Sweeps, etc.
Current Situation:
- LoadData, which is used to load data from PXP files, is not threadsafe.
- The HDF5 operations are threadsafe, but seem to use a global lock.
Unless one or both of these limitations are resolved, parallelizing the loads yields no gain.
The following code was used to benchmark the HDF5 loading:
threadsafe Function LoadHDf5File(string discLocation)
	NewDataFolder/O root:subFolder
	variable h5_fileID = H5_OpenFile(discLocation)
	HDF5LoadGroup/O/R=1 root:subFolder, h5_fileID, "."
	H5_CloseFile(h5_fileID)
End

Function test2()
	variable ti
	string nwbList
	string entry = "c:Projects:mies_data:881"
	string symbPath = GetUniqueSymbolicPath()
	NewPath/O/Q/Z $symbPath, entry
	nwbList = GetAllFilesRecursivelyFromPath(symbPath, extension = ".nwb")
	KillPath/Z $symbPath
	WAVE/T fileList = ListToTextWave(nwbList, FILE_LIST_SEP)
	Make/FREE/N=(DimSize(fileList, ROWS)) results
	ti = stopmstimer(-2)
	MultiThread/NT=16 results[] = LoadHDf5File(filelist[p])
	print/D "Multithreaded [s]: ", (stopmstimer(-2) - ti) / 1E6
	ti = stopmstimer(-2)
	results[] = LoadHDf5File(filelist[p])
	print/D "Sequential [s]: ", (stopmstimer(-2) - ti) / 1E6
End
Results with 24 files in the list:
Multithreaded [s]: 30.0324111000366
Sequential [s]: 29.5686949000244
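To rule out MultiThread overhead as the reason for the near-identical timings, a CPU-bound control could be timed in the same way. The following is only a minimal sketch and was not part of the benchmark above: BusyWork and testControl are hypothetical names, and the iteration count and wave size are arbitrary. If this control scales with the number of threads while the HDF5 load does not, that would be consistent with the loads being serialized by a lock.

```igorpro
// Hypothetical CPU-bound worker, touches no files and no HDF5 operations
threadsafe Function BusyWork(variable n)
	variable i
	variable acc = 0
	for(i = 0; i < n; i += 1)
		acc += sin(i)
	endfor
	return acc
End

// Hypothetical control benchmark, same timing pattern as test2()
Function testControl()
	variable ti
	// 24 entries to mirror the 24 files of the HDF5 benchmark
	Make/FREE/D/N=24 results
	ti = stopmstimer(-2)
	MultiThread/NT=16 results[] = BusyWork(1e7)
	print/D "Multithreaded control [s]: ", (stopmstimer(-2) - ti) / 1E6
	ti = stopmstimer(-2)
	results[] = BusyWork(1e7)
	print/D "Sequential control [s]: ", (stopmstimer(-2) - ti) / 1E6
End
```

No timings are given for this control because it has not been run here.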
Check with WaveMetrics whether the locking seen when loading NWB files is done in Igor Pro (IP) or in the HDF5 library.