
Improve writing when streaming multiple chunks

Open adamreeve opened this issue 4 years ago • 5 comments

The TDMS file writing support in npTDMS is currently fairly simple: for every segment we always write a new object list and a full raw data index for every channel. When streaming chunks of data to disk, it would be nice to use the TDMS format features that allow reusing previous metadata, which would reduce file size and improve efficiency when reading files. Some things to consider:

  • Use the 0x00000000 raw data index header when a raw data index matches the previous index
  • Don't write the segment lead-in and metadata at all if none of the objects and raw data indexes have changed; just write multiple raw data chunks contiguously
  • Track property values and don't rewrite properties if they haven't changed from previous segments. Users can control which properties are written per segment and ideally would only write properties once, but it probably makes sense for npTDMS to avoid writing duplicate properties unnecessarily.
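A minimal sketch of how the three ideas above could fit together. This is not npTDMS code: `SegmentPlanner`, `RawDataIndex`, and `plan` are hypothetical names invented for illustration, and only the 0x00000000 marker comes from the list above. The planner remembers the previous segment's raw data indexes and the properties already written, then decides what a new segment actually needs to contain.

```python
from dataclasses import dataclass, field

# Raw data index header meaning "same index as in the previous segment".
INDEX_MATCHES_PREVIOUS = 0x00000000
_MISSING = object()  # sentinel so a property value of None still compares correctly


@dataclass(frozen=True)
class RawDataIndex:
    """Hypothetical stand-in for a channel's raw data index."""
    data_type: int
    number_of_values: int


@dataclass
class SegmentPlanner:
    """Hypothetical helper that tracks what earlier segments already wrote."""
    previous_indexes: dict = field(default_factory=dict)
    written_properties: dict = field(default_factory=dict)

    def plan(self, objects):
        """Decide what to write for one chunk.

        objects maps an object path to (RawDataIndex, properties dict).
        Returns None when the raw data can simply follow the previous chunk,
        otherwise a dict of path -> (index entry, changed properties).
        """
        # Metadata can only be skipped if the object list itself is unchanged.
        can_skip_metadata = list(objects) == list(self.previous_indexes)
        entries = {}
        for path, (index, properties) in objects.items():
            seen = self.written_properties.setdefault(path, {})
            # Keep only properties that are new or whose value has changed.
            changed = {key: value for key, value in properties.items()
                       if seen.get(key, _MISSING) != value}
            seen.update(changed)
            if self.previous_indexes.get(path) == index:
                index_entry = INDEX_MATCHES_PREVIOUS
            else:
                index_entry = index
                can_skip_metadata = False
            if changed:
                can_skip_metadata = False
            entries[path] = (index_entry, changed)
        self.previous_indexes = {path: index
                                 for path, (index, _) in objects.items()}
        return None if can_skip_metadata else entries


# Illustrative values only; the path and property names are made up.
path = "/'group'/'channel'"
planner = SegmentPlanner()
chunk = {path: (RawDataIndex(data_type=10, number_of_values=1000), {"unit": "V"})}
first = planner.plan(chunk)    # full index and properties
second = planner.plan(chunk)   # None: nothing changed, append raw data only
shorter = {path: (RawDataIndex(data_type=10, number_of_values=500), {"unit": "V"})}
third = planner.plan(shorter)  # new index, but no properties to rewrite
```

A real writer would also have to keep the segment lead-in's offsets consistent when appending raw data chunks without new metadata, which this sketch does not attempt.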

adamreeve, Jun 08 '21 21:06

Hello, this feature looks very promising! Has there been any update? I am looking for a way to add metadata to files without overwriting the old file or format. Currently I get a TDMS error (ToRootData.cpp(210): TDMS: ERROR: TDS Exception in Initialize: Tds Error: TdsErrOffsetTooLarge(-2511): ) when trying to add any new information to an existing file. Is this a feature that will be implemented, or might I have missed something? Thank you for any advice on this topic! Best regards, Zoe

zbeebee, Dec 29 '21 09:12

Hi, this issue is currently just a placeholder for something that would be nice to have to reduce file size and improve read performance for written files, but it's not something I've been working on. Your error looks like something else, though: it should be possible to append to an existing file if you create a TdmsWriter with mode='a'. Do you have some example code that can reproduce the problem?

adamreeve, Dec 29 '21 19:12

Hello, thank you for your remark! It looks like mode='a' is working. However, I still get an error, but now it is:

***************** New logfile section ******************* ToRootData.cpp(210): TDMS: ERROR: TDS Exception in Initialize: Tds Error: TdsErrNotSupported(53):

from nptdms import TdmsWriter, RootObject, GroupObject, ChannelObject

# Raw string so the backslashes in the Windows path are not read as escape sequences
path = r'C:\Documents\21_07_12_15_30_58_424_F0_2.tdms'

root_object = RootObject(properties={
    "prop1": "cookies",
    "prop3": "milk",
})

# Open the existing file in append mode
with TdmsWriter(path, mode='a') as tdms_writer:
    # Write first segment
    tdms_writer.write_segment([root_object])

The code is rather plain. The file was not corrupted; I have tried it with two good files so far. I would welcome any hint to make it work.

zbeebee, Dec 30 '21 10:12

Hi @zbeebee, apologies that it's taken me a long time to look into this, so this might no longer be relevant to you. I believe I've reproduced this problem, and it seems to be caused by npTDMS writing files with an older version number than the one the file already uses. I've made a new issue for this at #264.

adamreeve, Feb 09 '22 08:02

Implementing a buffered writer would also help reduce file size and increase speed. https://www.ni.com/docs/en-US/bundle/labview/page/lvconcepts/fileio_tdms_file_buffering.html
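As a rough illustration of the buffering idea, here is a sketch that collects small chunks in memory and hands them on as fewer, larger segments. `BufferedChunkWriter` and `min_values_per_segment` are hypothetical names, not part of npTDMS or of the NI feature linked above, and `write_segment` is assumed to be any callable that accepts a list of values.

```python
class BufferedChunkWriter:
    """Hypothetical wrapper that batches small chunks into larger segments."""

    def __init__(self, write_segment, min_values_per_segment):
        self._write_segment = write_segment
        self._min_values = min_values_per_segment
        self._buffer = []

    def write(self, values):
        """Buffer values and flush once enough have accumulated."""
        self._buffer.extend(values)
        if len(self._buffer) >= self._min_values:
            self.flush()

    def flush(self):
        """Write any buffered values as a single segment."""
        if self._buffer:
            self._write_segment(self._buffer)
            self._buffer = []

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Flush the remainder on exit, including when an exception is raised.
        self.flush()


# Stand-in sink that records each segment instead of writing to disk.
segments = []
with BufferedChunkWriter(segments.append, min_values_per_segment=4) as writer:
    for chunk in ([1, 2], [3, 4], [5]):
        writer.write(chunk)
```

Here three incoming chunks become two segments, so the per-segment metadata would be written twice rather than three times. The trade-off is that buffered data is lost if the process dies before a flush.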

rbenji23, Mar 08 '23 02:03