AB: Parallelize loading of data from PXP/NWB files

Open · MichaelHuth opened this issue 8 months ago · 1 comment

The Analysis Browser (AB) could run much faster if data loading were parallelized, e.g. for Refresh, Load Sweeps, etc.

Current Situation:

  • The LoadData operation, which loads data from PXPs, is not threadsafe.
  • The HDF5 operations are threadsafe, but seem to use a global lock.

Unless one or both of these limitations are resolved, parallelizing the loads gives no gain.

The following code was used to benchmark the HDF5 loading:

// Load the complete contents of the HDF5 file at discLocation into root:subFolder
threadsafe Function LoadHDf5File(string discLocation)
	NewDataFolder/O root:subFolder
	variable h5_fileID = H5_OpenFile(discLocation)
	HDF5LoadGroup/O/R=1 root:subFolder, h5_fileID, "."
	H5_CloseFile(h5_fileID)
End


Function test2()

	variable ti
	string nwbList

	string entry = "c:Projects:mies_data:881"

	// collect all NWB files below the folder
	string symbPath = GetUniqueSymbolicPath()
	NewPath/O/Q/Z $symbPath, entry
	nwbList = GetAllFilesRecursivelyFromPath(symbPath, extension = ".nwb")
	KillPath/Z $symbPath

	WAVE/T fileList = ListToTextWave(nwbList, FILE_LIST_SEP)
	Make/FREE/N=(DimSize(fileList, ROWS)) results

	// load all files with 16 threads requested; stopmstimer(-2) returns microseconds
	ti = stopmstimer(-2)
	MultiThread/NT=16 results[] = LoadHDf5File(fileList[p])
	print/D "Multithreaded [s]: ", (stopmstimer(-2) - ti) / 1E6

	// load the same files one after another
	ti = stopmstimer(-2)
	results[] = LoadHDf5File(fileList[p])
	print/D "Sequential [s]: ", (stopmstimer(-2) - ti) / 1E6
End

Results for a list of 24 files:

Multithreaded [s]:   30.0324111000366
Sequential [s]:   29.5686949000244
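
As a rough reading of these two numbers, the speedup can be written as the ratio of the sequential to the multithreaded wall time. The symbols S, T_seq and T_mt are notation introduced here for illustration only; the times are the values printed above, rounded.

```latex
S = \frac{T_{\mathrm{seq}}}{T_{\mathrm{mt}}}
  = \frac{29.5687\ \mathrm{s}}{30.0324\ \mathrm{s}}
  \approx 0.985
```

A value of S above 1 would indicate a gain from the threads. Here S is slightly below 1, so the run with 16 threads requested was about 1.5 % slower than the sequential one, which fits the assumption of a global lock serializing the loads.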

MichaelHuth, Mar 31 '25 14:03

Check with WaveMetrics whether the locking seen when loading NWB files is done in Igor Pro (IP) or in the HDF5 library.

t-b, Mar 31 '25 15:03