MIES icon indicating copy to clipboard operation
MIES copied to clipboard

AnalysisBrowser: Make it faster

Open t-b opened this issue 1 year ago • 2 comments

  • [ ] Make it fully resizable
  • [ ] Add checkboxes for loading TPStorage and Results waves (unchecked by default)
  • [ ] Add checkboxes for gathering NWB/PXP files separately, by default only NWB is checked
  • [ ] Default to collapse, collapse after NWB loading loading, keep collapsed in general
  • [ ] Skip loading labnotebook/results temp waves, this probably only needs to use the new recursive flag introduced in 2237ea315f (AB: Add optional recursive argument to AB_LoadDataWrapper, 2024-03-15)
  • [ ] Checkbox to load into the same sweepbrowser, defaults to false
  • [ ] Button to load /general/data_source as readonly notebook as that has the experiment history and logfiles.

t-b avatar Apr 19 '24 14:04 t-b

What bugs me in the AB is that the default behavior of Windows does not work for selecting.

Shift + MouseDown -> select all rows from the the last postition to the current one Ctrl-A -> select all

MichaelHuth avatar Apr 19 '24 14:04 MichaelHuth

@MichaelHuth I guess this is because IP is comming from a Mac.

t-b avatar Apr 19 '24 14:04 t-b

MIES HistoricData set, numbers are in [s].

Refresh: NWB: 3.15 PXP: 1.5

Load Sweeps: NWB: 13.5 PXP: 29

main: Refresh: NWB: 4 PXP: 3.4

Load Sweeps: NWB: 11.7 PXP: 29.5

Note that an open Igor Data Browser window increases loading times by a factor of 3.

MichaelHuth avatar Mar 21 '25 14:03 MichaelHuth

I've fixed the formatting issues. LGTM.

t-b avatar Mar 23 '25 14:03 t-b

@MichaelHuth @t-b Wow! It's so much faster to screen folder structures for MIES files.

I found a couple of issues.

  !!! Threadsafe assertion FAILED !!!
  Message: "Invalid version: 0"
  Please provide the following information if you contact the MIES developers:
  ################################
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Stacktrace:
  AB_ButtonProc_AddFolder(...)#L3026 [MIES_AnalysisBrowser.ipf]
AB_AddExperimentEntries(...)#L2647 [MIES_AnalysisBrowser.ipf]
AB_AddFile(...)#L254 [MIES_AnalysisBrowser.ipf]
AB_AddMapEntry(...)#L134 [MIES_AnalysisBrowser.ipf]
GetNWBMajorVersion(...)#L297 [IPNWB_NWBUtils.ipf]
AnalyzeNWBVersion(...)#L328 [IPNWB_NWBUtils.ipf]
EnsureValidNWBVersion(...)#L336 [IPNWB_NWBUtils.ipf]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Time: 2025-04-03T09:24:54-07:00
  Experiment: Untitled ()
  Igor Pro version: 9.0.6.1 (56565)
  ################################

It would be better to skip and print to history for invalid files, rather than assert. In either case, providing the user with enough information to find the file is essential.

The Remove button only removes one list item at a time when multipe list items are selected. image

timjarsky avatar Apr 03 '25 16:04 timjarsky

@timjarsky

It's so much faster to screen folder structures for MIES files.

Nice!

Message: "Invalid version: 0"

That should not happen. Can you post upload the file so that we can have a look?

t-b avatar Apr 03 '25 16:04 t-b

@t-b

That should not happen. Can you post upload the file so that we can have a look?

I don't know how to find it - I had it combing over a massive file tree. If the assertion message were to include the file path, I can try again.

timjarsky avatar Apr 03 '25 17:04 timjarsky

I don't know how to find it - I had it combing over a massive file tree. If the assertion message were to include the file path, I can try again.

Will do!

t-b avatar Apr 03 '25 17:04 t-b

Fixed conflicts and linted with new ipt version.

t-b avatar Apr 05 '25 12:04 t-b

Some remark regarding the row selection interface that should work like windows - but not exactly.

Igor has some specifics, such that selection by shift + click is independent from selection by click. Normal windows behavior:

  • you select rows 1, 4, 7 with ctrl + click in that order
  • you select row 5 with shift + click Result: row 5, 6, 7 is selected as with the last shift + click row 1, 4 automatically get unselected.

Igor behavior:

  • you select rows 1, 4, 7 with ctrl + click in that order
  • you select row 5 with shift + click Result: row 1, 4, 5, 6, 7 is selected. The last shift + click selects independently of the previou selection and no previous selection is unselected automatically.

MichaelHuth avatar Apr 07 '25 17:04 MichaelHuth

   ? Encountered "Abort" in test case "HistoricEochClipping#TestEpochClipping" (UTF_HistoricEpochClipping.ipf)

is reproducible on the CI machine.

t-b avatar Apr 07 '25 21:04 t-b

When looking through the code to minimize opening NWB file all over again I noticed that the loading logic actually includes double loading if expanded ranges and other selections overlap.

This means if you have e.g. a single file then expand it in the bottom view, select everything. Then it loads everything twice.

Fixing this needs a bit more effort, as the whole AB loading logic needs to rewritten. We need to split that in a gathering step and a second loading step. This allows to optimize the gathered loads by e.g. filename (required to implement the re-opening optimization) and removal of duplicates.

MichaelHuth avatar Apr 08 '25 11:04 MichaelHuth

@timjarsky There should be a message printed when the problematic NWB is encountered and the program should continue. Also "Remove" should work as intended.

MichaelHuth avatar Apr 08 '25 21:04 MichaelHuth

@t-b Two additional requests:

  1. Add a right-click option to open the file location (for files and folders in both lists) in the analysis browser.

  2. Have assertions on sweep load print the file location.

These additions will help me track down problematic NWB files. We should handle resolving the problematic NWB files in a separate issue.

timjarsky avatar Apr 10 '25 18:04 timjarsky

@MichaelHuth I'm unassigning myself

t-b avatar Apr 10 '25 19:04 t-b

@timjarsky I added a context menu to both listboxes that allows to jump to the file/folder in the explorer.

I also adapted the error handling and message output for sweep loading.

MichaelHuth avatar Apr 11 '25 13:04 MichaelHuth

@MichaelHuth, thanks for the latest feature enhancements. Adding an option to copy the path was a thoughtful addition.

The previous version of the analysis was too slow to work with one of our production folders on our network. This version at least makes it possible - see the function proficlig results from having to work through two production data folders below.

I'm going to approve the PR as is.

Total time: 1210.94, Time in Function code: 1060.1 (87.5%)
Top function percentages:
Function MIES_Utilities_File.ipf GetAllFilesRecursivelyFromPath: 95%

Annotated Top Functions:

*******************************************************************************************
Function: MIES_Utilities_File.ipf GetAllFilesRecursivelyFromPath; Percent total 95%
*******************************************************************************************
[00]*         	|Function/S GetAllFilesRecursivelyFromPath(string pathName, [string extension])
[00]          	|
[00]          	|	string fileOrPath, folders, subFolderPathName, fileName
[00]          	|	string files, allFilesList
[00]*         	|	string allFiles         = ""
[00]*         	|	string foldersFromAlias = ""
[00]          	|	variable err
[00]          	|
[00]*         	|	PathInfo $pathName
[00]*         	|	ASSERT(V_flag, "Given symbolic path does not exist")
[00]          	|
[00]*         	|	if(ParamIsDefault(extension))
[00]*         	|		extension = "????"
[00]*         	|	endif
[00]          	|
[00]*         	|	AssertOnAndClearRTError()
[56]******    	|	allFilesList = IndexedFile($pathName, -1, extension, "????", FILE_LIST_SEP); err = GetRTError(1)
[00]*         	|	WAVE/T allFilesInDir = ListToTextWave(allFilesList, FILE_LIST_SEP)
[00]*         	|	for(fileName : allFilesInDir)
[00]          	|
[00]          	|		fileOrPath = ResolveAlias(fileName, pathName = pathName)
[00]          	|
[00]*         	|		if(isEmpty(fileOrPath))
[00]          	|			// invalid shortcut, try next file
[00]          	|			continue
[00]          	|		endif
[00]          	|
[00]*         	|		GetFileFolderInfo/P=$pathName/Q/Z fileOrPath
[00]*         	|		ASSERT(!V_Flag, "Error in GetFileFolderInfo")
[00]          	|
[00]          	|		if(V_isFile)
[00]*         	|			allFiles = AddListItem(S_path, allFiles, FILE_LIST_SEP, Inf)
[00]          	|		elseif(V_isFolder)
[00]          	|			foldersFromAlias = AddListItem(S_path, foldersFromAlias, FILE_LIST_SEP, Inf)
[00]          	|		else
[00]          	|			ASSERT(0, "Unexpected file type")
[00]          	|		endif
[00]*         	|	endfor
[00]          	|
[00]*         	|	AssertOnAndClearRTError()
[04]*         	|	folders = IndexedDir($pathName, -1, 1, FILE_LIST_SEP); err = GetRTError(1)
[00]*         	|	folders = folders + foldersFromAlias
[00]*         	|	WAVE/T wFolders = ListToTextWave(folders, FILE_LIST_SEP)
[00]*         	|	for(folder : wFolders)
[00]          	|
[00]*         	|		subFolderPathName = GetUniqueSymbolicPath()
[00]          	|
[27]***       	|		NewPath/Q/O $subFolderPathName, folder
[00]*         	|		files = GetAllFilesRecursivelyFromPath(subFolderPathName, extension = extension)
[08]*         	|		KillPath/Z $subFolderPathName
[00]          	|
[00]*         	|		if(!isEmpty(files))
[00]*         	|			allFiles = AddListItem(RemoveEnding(files, FILE_LIST_SEP), allFiles, FILE_LIST_SEP, Inf)
[00]*         	|		endif
[00]*         	|	endfor
[00]          	|
[00]*         	|	return allFiles
[00]*         	|End

timjarsky avatar Apr 12 '25 03:04 timjarsky

@timjarsky Interesting function profiling results. Another topic: You found an assertion in one of the NWB files. See https://github.com/AllenInstitute/MIES/pull/2092#issuecomment-2776353454. Do we want to dig more know or make that future problem? Both is fine for me.

t-b avatar Apr 13 '25 11:04 t-b

@timjarsky I assume you do have a 1GB/s network link to the fileserver? And about how many files are we talking about? 1000? 10000? More?

We might still be able to tune GetAllFilesRecursivelyFromPath for speed if we know a bit more about the requirements.

t-b avatar Apr 13 '25 12:04 t-b

@timjarsky Interesting function profiling results. Another topic: You found an assertion in one of the NWB files. See #2092 (comment). Do we want to dig more know or make that future problem? Both is fine for me.

Future problem - I want to understand the frequency and type of issue(s) related to NWB file loading before tackling any of them. I will create issues as I find problem files.

timjarsky avatar Apr 14 '25 16:04 timjarsky

@timjarsky I assume you do have a 1GB/s network link to the fileserver? And about how many files are we talking about? 1000? 10000? More?

We might still be able to tune GetAllFilesRecursivelyFromPath for speed if we know a bit more about the requirements.

@t-b I had the analysis browser browsing both these network folders simultaneously. image

timjarsky avatar Apr 14 '25 17:04 timjarsky

@timjarsky That's quite a challenge the number of folders/files. I've raised https://github.com/AllenInstitute/MIES/issues/2402 to track that.

t-b avatar Apr 14 '25 17:04 t-b

@MichaelHuth We have some low-hanging fruits to optimize in AB_AddExperimentEntries as we are calling GetAllFilesRecursivelyFromPath three times. This can be optimized to 1 call for NWB-only and 2 for pxp-only.

t-b avatar Apr 14 '25 17:04 t-b

I've added https://github.com/AllenInstitute/MIES/pull/2403 which allows me to gather a list of 100k files over the network in less than 60s. The culprit is that alias resolving takes a lot of time, but we can skip it for the most common cases. Once this PR is merged we can rebase this one here and try again.

t-b avatar Apr 14 '25 18:04 t-b

@MichaelHuth

Update Packages/doc/analysisbrowser.rst after final approval

This can be done as well as the UI does not change anymore in this PR.

t-b avatar Apr 14 '25 18:04 t-b

LGTM. Will merge once CI passes.

t-b avatar Apr 15 '25 17:04 t-b

@timjarsky I'd like to merge this first as @MichaelHuth was faster here than I in #2403 if you are okay with that.

t-b avatar Apr 15 '25 18:04 t-b

@timjarsky I'd like to merge this first as @MichaelHuth was faster here than I in #2403 if you are okay with that.

go for it! thank you both!!

timjarsky avatar Apr 15 '25 20:04 timjarsky

@MichaelHuth I'll restart the CI job tomorrow, as I think this is a fluke.

t-b avatar Apr 15 '25 21:04 t-b