Fix created time in cubestore parquet files

Open thePermission opened this issue 10 months ago • 4 comments

Check List

  • [x] Tests have been run in packages where changes made if available
  • [ ] Linter has been run for changed code
  • [ ] Tests for the changes have been added if not covered yet
  • [ ] Docs have been added / updated if required

Issue Reference this PR resolves

https://github.com/cube-js/cube/issues/7905

Description of Changes Made

Use Parquet metadata to store and read the creation time instead of relying on the creation time reported by the filesystem.

thePermission avatar Feb 17 '25 15:02 thePermission

@srh is there anything I can do to speed this up? This is currently blocking us from working with Cube, so I would be very happy to help get this done as fast as possible :-)

thePermission avatar Mar 26 '25 08:03 thePermission

@igorlukanin what can I do to get this feature? It would actually be pretty important to us.

thePermission avatar Apr 07 '25 05:04 thePermission

Creation time is used here not to refer to when the first copy of the parquet file was created but to when the file was copied from remote to local storage. The cleanup loop is supposed to periodically clean up files (all of them) but leave in place those that were copied to the file system more recently than cleanup_local_files_delay.

So the in-file metadata for parquet creation time is not the value we want to use for this. It might be the case that using last modified time would work just as well (and just the same) as using creation time, though.

srh avatar Jul 21 '25 18:07 srh

The problem is that not all filesystems provide a creation time, and Cube then fails to clean up. As a result, Cube Store currently does not work on every filesystem.

thePermission avatar Jul 22 '25 09:07 thePermission
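
For illustration only, the last-modified-time idea raised in the thread could look roughly like the sketch below. It is not Cube Store's actual code: the function names `local_file_timestamp` and `is_eligible_for_cleanup` are hypothetical, and `delay` merely stands in for the `cleanup_local_files_delay` setting mentioned above. The sketch assumes that the time a file was copied to local storage can be approximated by the file's creation time where the filesystem reports one, and by its last modified time otherwise.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Hypothetical helper: best available "copied to local storage" timestamp.
/// Prefers the creation time and falls back to the last modified time when
/// the platform or filesystem cannot report a creation time.
fn local_file_timestamp(path: &Path) -> io::Result<SystemTime> {
    let meta = fs::metadata(path)?;
    meta.created().or_else(|_| meta.modified())
}

/// Hypothetical helper: true when the file is older than `delay`, where
/// `delay` plays the role of `cleanup_local_files_delay` (an assumption).
fn is_eligible_for_cleanup(
    path: &Path,
    delay: Duration,
    now: SystemTime,
) -> io::Result<bool> {
    let ts = local_file_timestamp(path)?;
    // `duration_since` errors if `ts` is later than `now` (e.g. clock skew);
    // such a file is treated as recent and is kept.
    Ok(now
        .duration_since(ts)
        .map(|age| age > delay)
        .unwrap_or(false))
}
```

With a fallback like this, a missing creation time no longer produces an error in the cleanup path; whether last modified time is an acceptable substitute in every case is the open question in the discussion above.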