MIES icon indicating copy to clipboard operation
MIES copied to clipboard

Insert recreated epoch info on massexport

Open MichaelHuth opened this issue 8 months ago • 19 comments

  • [x] Needs https://github.com/AllenInstitute/MIES/pull/2432
  • [ ] Needs https://github.com/AllenInstitute/MIES/pull/2569
  • [ ] Wait for epoch support in ipfx
  • [ ] Test a full QC run with recreated epoch info on a pxp without epoch info

close #2076

close #881

Close #2009

MichaelHuth avatar Apr 02 '25 16:04 MichaelHuth

@t-b This PR now always recreates the epochs, whereas the issue states epochs should only be recreated if we do not have any epoch data.

However, we recently changed the epoch (re)creation to be sample exact. With the upper approach we would get a mix of old non-sample exact epochs and "new" sample exact recreated epochs possibly on exported older files.

If that mixing is ok, I add the check to only recreate if there is no epoch info already.

MichaelHuth avatar Apr 02 '25 16:04 MichaelHuth

However, we recently changed the epoch (re)creation to be sample exact. With the upper approach we would get a mix of old non-sample exact epochs and "new" sample exact recreated epochs possibly on exported older files.

If that mixing is ok, I add the check to only recreate if there is no epoch info already.

I think we want to regenerate if the epoch info is not-present or out-of-date (SWEEP_EPOCH_VERSION).

t-b avatar Apr 02 '25 19:04 t-b

Not a review just some drive-by-comments:

  • ED_InsertPostProcessRow should work more like ED_AddEntriesToLabnotebook. This would give the same speed benefits (writing all entries in one go) but alleviate the caller of the burden of to manage the internals. Or maybe ED_AddEntriesToLabnotebook should gain a flag for adding a post processing entry?
  • Your code does not handle unassociated channels, CreateLBNUnassocKey can derive an LBN key for unassociated channels. We support that since 406440a53 (Merge pull request #1783 from AllenInstitute/feature/1783-add_epoch_info_for_unassoc_DA, 2023-06-30).
  • The post processed lbn entry should have the unit LABNOTEBOOK_BINARY_UNIT
  • You probably want an assertion after FindRange in case this did return NaN for first and last row

t-b avatar Apr 02 '25 21:04 t-b

I've reapplied the ipt formatting and linting changes. But don't know howto fix the merge conflict.

t-b avatar Apr 05 '25 12:04 t-b

@t-b Look slike the double key check for LBN entry adds actually found a case. In https://github.com/AllenInstitute/MIES/commit/061808a7dfabe27e2e3d816612778cf53cc49613#diff-3c3244d0695973447e26d7c941860a630c1fd265833753040b49c4b3d7c20520 I introduced "Sampling interval multiplier" at index 26, where an entry of the same key was already existing at 36 (after the change at 38).

Probable fallout: The writing to the value source wave in DC_DocumentChannelProperty uses FindDimLabel to determine the column, so it writes to 38.

The copying in ED_createWaveNotes runs the indexing upwards, such that the last value written to the LBN is read from source index 38.

Thus, there should be no side effects.

MichaelHuth avatar Apr 07 '25 13:04 MichaelHuth

Does that also resolve the #881?

Review:

59c47e742 (DAP: Change comment getter for user comment to retrieve string directly from control, 2025-04-02)

Jeep.

5c982a337 (LBN: Do labnotebook upgrade on most common MIES entry points, 2025-04-02)

  • [x] The commit message is not correct:

    • We upgrade on device locking for DAEphys
    • upgrade on every sweep plot (maybe we can even optimize that?)
    • upgrade on NWB export
  • [x] I don't see the upgrade on analysis browser data loading in the diff.

970d2ab90 (bugfix: Handle missing result waves gracefully in Logbookviewer, 2025-04-02)

  • [x] textualValues and numericalValues do exist, but they only exist with the default entries which are removed from LBV_CleanLogbookParamNames, right? The code change now results in the cache not being used. The problem is that the cache is not able to store null wave refs. We could hack our way here with an empty wave and translate the empty wave to a null wave ref in LBV_GetAllLogbookParamNames. Or we store a wave ref wave? Or?

6e1855b86 (ED: Factor out LNB row initialization to ED_InitNewRow, 2025-04-02)

Looks good.

4f015046b (ED: Remove unused variable in ED_FindIndizesAndRedimension, 2025-04-03)

Nice find. Hopefully a future IPT version can help us here as well.

6801110b2 (ED: Add check against duplicated keys in ED_CheckValuesAndKeys, 2025-04-03)

I'm trying to come up with a faster way, but fail to see one.

  • [x] The only possible optimization is to not do anything if keys has only one column
  • [x] Oh and missing newline after the ASSERTs before your code.

The reason I think the one column check is relevant is that all analysis function LBN values are added via ED_AddEntryToLabnotebook which only adds one entry.

dace78715 (Bugfix: SweepSettings Key wave had duplicate key "Sampling Interval Multiplier", 2025-04-07)

Nice.

d5e9e9d6f (CA: Refactor cache key generation for LB row and index cahce to own functions, 2025-04-04)

  • [x] Typo in subject line.

Otherwise nice refactoring.

403130ac5 (AFH: Add function to get headstage from channelType/channelNumber, 2025-04-02)

  • [x] That should use an elseif instead of the second if or maybe even a switch?

d67d5b033 (ED: Add feature to ED_AddEntriesToLabnotebook for value insertion, 2025-04-02)

  • [x] The commit message says "sweep block" but I think this is more like the entry source type block or?

  • [x] ED_AddEntriesToLabnotebook I think finally deserves some proper documentation.

  • [x] ED_SetLabnotebookRowToPostProcessed: We want newlines after each logical block inside functions.

  • [x] ED_SetLabnotebookRowToPostProcessed: Missing assertion for out of range row.

  • [x] The headstage contigency mode for PostProcessed is not DEPEND, but ALL, see ED_ParseHeadstageContigencyMode how that resolves. If it would be DEPEND you would not set someting in the INDEP_HEADSTAGE layer.

  • [x] ED_InsertRowAfterSweepBlock: Two lines for querying and writing NOTE_INDEX please.

  • [x] ED_FindIndizesAndRedimension: It does still resize the rows in values, but now you are not setting the updated NOTE_INDEX in ED_createWaveNote/ED_createTextNotes anymore when inserting post processing. Maybe you should pass insertAsPostProc into ED_FindIndizesAndRedimension and set the note index there if required? Moving SetNumberInWaveNote can also be done in a prep commit. There is also a test missing which would have failed with this bug, it should be enough if InsertRowForProstProcessingTextual/InsertRowForProstProcessingNumerical checks NOTE_INDEX.

  • [x] This should also raise LABNOTEBOOK_VERSION.

3e42a4f8c (NWB: Insert recreated epoch data into LNB on NWB_ExportAllData, 2025-04-02)

  • [x] InvalidateLBIndexAndRowCache looks wrong in this commit.

  • [x] InsertRecreatedEpochsIntoLBN: The two loops try to be future proof but the inner one is wrong. The number of channels depends on the channel type. So I would just drop that and only do it for DAC. Maybe just iterate over recEpochs?

  • [x] If you use IsAssociatedChannel instead of IsValidHeadstage you can drop the commit as the code explains itself.

  • [x] Using AFH_GetHeadstageFromChannelNumberAndType is 100% wrong. This can only be used during DAQ in analysis functions. Maybe iterate through recEpochs and then fetch the headstage? This seems to be a repeating error. Anything we can do to avoid that?

  • [x] I would prefer to resize keys and values for unassociated entries as gotassocValues looks odd and then have only one call to ED_AddEntriesToLabnotebook.

  • [x] The move of NWB_WriteLabnotebooksAndComments is unmotivated, and if done should be separate.

  • [x] Dedicated historic test adding post processed epoch

  • [x] This misses to set SWEEP_EPOCH_VERSION in the LBN. We need that for future fixing up.

cb88330f1 (Tests: Always attempt to recreate Epochs in TestExportingDataToNWB, 2025-04-03)

  • [x] Motivation?

52abcfc1b (LBN: Fix typos in description of labnotebook entries, 2025-04-04)

Nice!

t-b avatar Apr 08 '25 19:04 t-b

@MichaelHuth Did you see my review in https://github.com/AllenInstitute/MIES/pull/2385#issuecomment-2787486825?

t-b avatar Apr 09 '25 10:04 t-b

I dropped https://github.com/AllenInstitute/MIES/commit/403130ac5d1ed2838b6e5ec036fd61109a5005ad (AFH: Add function to get headstage from channelType/channelNumber, 2025-04-02)

in favor of a LBN based utility function based on the approach used in DataConfigurationRecreation.

MichaelHuth avatar Apr 09 '25 14:04 MichaelHuth

ED_FindIndizesAndRedimension: It does still resize the rows in values, but now you are not setting the updated NOTE_INDEX in ED_createWaveNote/ED_createTextNotes anymore when inserting post processing. Maybe you should pass insertAsPostProc into ED_FindIndizesAndRedimension and set the note index there if required? Moving SetNumberInWaveNote can also be done in a prep commit. There is also a test missing which would have failed with this bug, it should be enough if InsertRowForProstProcessingTextual/InsertRowForProstProcessingNumerical checks NOTE_INDEX.

I noticed that, but I did not evaluate this as bug. The only thing that the call to ED_FindIndizesAndRedimension does when called is to increase the row size of the wave by 1, but not the NOTE_INDEX. (The possible column increase is as regular).

The NOTE_INDEX is updated by the caller in any case.

I think ED_FindIndizesAndRedimension should be split in two parts, one that increases the COLS and another that increases the ROWS. For the insertion I only need the COLS increase as I use another function to add a row.

I am open to move the NOTE_INDEX adjustment before the row value update.

MichaelHuth avatar Apr 09 '25 14:04 MichaelHuth

  • [x] Optimize upgrade LBN in DB. Move from DB_UpdateSweepPlot to device locking of DB.
  • [x] https://github.com/AllenInstitute/MIES/commit/970d2ab90e4e9ef02fd396037649cba9a51bb1f0 (bugfix: Handle missing result waves gracefully in Logbookviewer, 2025-04-02) Put result in WaveRef wave that also null ref can be pushed to Cache and Retrieve accordingly.

MichaelHuth avatar Apr 09 '25 15:04 MichaelHuth

The commit e26bbaf15de5e02dd25117a28d796b94bb4aab32 that includes the TTL Epoch should also solve #881 .

MichaelHuth avatar Apr 10 '25 16:04 MichaelHuth

Review:

59c47e742 (DAP: Change comment getter for user comment to retrieve string directly from control, 2025-04-02)

Good.

f9ab329da (LBN: Do labnotebook upgrade on most common MIES entry points, 2025-04-02)

Nice!

48964f017 (DB: Optimize LBN Upgrade location, 2025-04-10)

Nice one. And now I remember why we did not do that earlier. Only since 2b8c2b1af (DeepCopyWaveRefWave: Allow src to contain null waves, 2024-09-06) do we allow DeepCopyWaveRefWave, which we use in CA_TryFetchingEntryFromCache, on wave reference waves which have null refs in them.

I've now also created https://github.com/AllenInstitute/MIES/issues/2398 to have the generic solution.

b19d9ee8b (bugfix: Handle missing result waves gracefully in Logbookviewer, 2025-04-02) 4f42718ad (ED: Factor out LNB row initialization to ED_InitNewRow, 2025-04-02) 66f5796be (ED: Remove unused variable in ED_FindIndizesAndRedimension, 2025-04-03) f4ad63588 (ED: Add check against duplicated keys in ED_CheckValuesAndKeys, 2025-04-03)

All good.

14155ff55 (Bugfix: SweepSettings Key wave had duplicate key "Sampling Interval Multiplier", 2025-04-07) 42167e972 (CA: Refactor cache key generation for LB row and index cache to own functions, 2025-04-04)

All good.

8ea5ee9bd (ED: Add feature to ED_AddEntriesToLabnotebook for value insertion, 2025-04-02)

  • [x] Function documentation of ED_createWaveNotes still speaks about the sweep block
  • [x] typo: "the the"
  • [x] Same comment: It is not correct that we increase it only by a row. EnsureLargeEnoughWave does geometrical growth, I would just say "that is is increased if required".

a43c55c94 (LBN: Add util function to retrieve HS for Sweep/ChannelType/Number combination from LBN, 2025-04-09)

  • [x] Why does GetHeadstageForChannel not work here?

8f25421f6 (ED: Add function docu for ED_AddEntriesToLabnotebook, 2025-04-10)

  • [x] typo: "vlas"
  • [x] Please wrap long lines
  • [x] > /// The entrySourceType specifies to which logical layer of the LBN the entries get added entrySourceType has nothing to do with layer.
  • [x] "for TTL channels create the correct key with CreateTTLChannelLBNKey with layer INDEP_HEADSTAGE in vals" only some LBN entries use that function others use whatever we came up with back then.
  • [x] "sweep block" -> block

Otherwise nice read.

59e1a1680 (Const: Introduce constant for "Epochs version" LBN key, 2025-04-10)

Good.

6465a200d (NWB: Recreate Epochs on export and insert this info into LBN, 2025-04-09)

InsertRecreatedEpochsIntoLBN:

  • [x] I think allocatedCols is unused
  • [x] In which world is colCount still zero after the for loops. We always have at least one DA channel. ASSERT(colCount > 0, ...) maybe?

FindLastLBNRow:

  • [x] Brief function documentation would be helpful
  • [x] Drop the return variable row per our coding conventions

e26bbaf15 (LBN PostProc: Also add TTL epochs to postproc LBN insertion on NWB export, 2025-04-10)

Nice one!

4dd392c1c (LBN: Fix typos in description of labnotebook entries, 2025-04-04)

Still good.

b9db13e12 (Fix: Fix naming of HeadstageContigency -> HeadstageContingency, 2025-04-10)

Good find.

I've played around with an old experiment (FTP: pr-2385/Chat-IRES-Cre-neo;Ai14-307622.04.02.01.zip, new credentials in corg MR !263):

Steps:

  • Open PXP
  • Execute NWB_ExportAllData(2, recreateEpochs = 1)

This gives

•nwb_exportAllData(2, recreateEpochs = 1)
  Please be patient while we export all existing acquired content of all devices to NWB
  !!! Threadsafe assertion FAILED !!!
  Message: "Labnotebook is too old for NWB export."
  Please provide the following information if you contact the MIES developers:
  ################################
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Stacktrace:
  NWB_ExportAllData(...)#L735 [MIES_NeuroDataWithoutBorders.ipf]
NWB_AppendSweepLowLevel(...)#L1118 [MIES_NeuroDataWithoutBorders.ipf]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Time: 2025-04-11T20:30:27+02:00
  Experiment: Chat-IRES-Cre-neo;Ai14-307622.04.02.01 (Packed)
  Igor Pro version: 9.0.6.1 (56704)
  ################################

It seems to be unable to query the "Clamp Mode" entry from the labnotebook. The reason is that the labnotebook now looks like

grafik

What happened is that we entered an entry source type value into a labnotebook which did not use entry source types before. But then our labnotebook reading logic found an entry source type and said, hey, this is looking modern (see FindRange), and respected that entry source type.

So you only ever get row 5 for sweep 0 when asking for a DATA_ACQUISITION_MODE entry.

The solution is probably to enter an entry source type of UNKNOWN_MODE (aka NaN) on adding postprocessed entries when all entries in the entrySourceType column are NaN (or empty for the textual labnotebook).

t-b avatar Apr 11 '25 18:04 t-b

@t-b I added LBN capabilities. However, I am not sure if it makes sense to just consider it when inserting. Because if one adds a new entry it will have an EntrySourceType (behavior as before). And then the stored capabilities contradict with the actual LBN state as the LBN then starts supporting EntrySourceType after row X.

Also the epoch recreation asserts out on your PXP file because it is some cursed state that WB can not recreate the Stimset, then some Epoch lengths turn out to be zero. That causes an ASSERT in the core calculation.

MichaelHuth avatar Apr 14 '25 22:04 MichaelHuth

@t-b I added LBN capabilities. However, I am not sure if it makes sense to just consider it when inserting. Because if one adds a new entry it will have an EntrySourceType (behavior as before). And then the stored capabilities contradict with the actual LBN state as the LBN then starts supporting EntrySourceType after row X.

Yes I think we have to consider it also when adding.

Also the epoch recreation asserts out on your PXP file because it is some cursed state that WB can not recreate the Stimset, then some Epoch lengths turn out to be zero. That causes an ASSERT in the core calculation.

I'll have a look once this is ready for the next round.

t-b avatar Apr 15 '25 09:04 t-b

Note to self: Check that the function that creates empty recreated epochs returns $"" for no epoch info.

t-b avatar Apr 28 '25 15:04 t-b

I've looked through the code and played around with it. It works here.

As you have to rebase anyway due to the merge conflict, please also revise the commit message of 1f568dd2e (LBN: Store LBN capabilities on LBN Upgrade, 2025-04-14) as that does not explain why we are adding the entry source type capability.

t-b avatar Apr 29 '25 17:04 t-b

@timjarsky Ready to test. The function NWB_ExportAllData gained a new optional parameter recreateEpochs which recreates the epoch info if there is none or its version is out of date.

t-b avatar Apr 30 '25 14:04 t-b

Fixed conflicts.

t-b avatar Aug 05 '25 12:08 t-b

Recovered from borked-up conflict resolution.

t-b avatar Aug 08 '25 13:08 t-b