[Improvement]: we should delete the metadata.json after snapshot is expired

Open Aireed opened this issue 7 months ago • 2 comments

Search before asking

  • [x] I have searched in the issues and found no similar issues.

What would you like to be improved?

format: iceberg/mixed_hive module: AMS

A large number of metadata.json files still exist in the metadata directory, even though snapshot expiration has been performed (the latest metadata.json contains the expected number of snapshots).
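One way to check how many of these files are actually stale is sketched below. It is only a rough diagnostic, not part of AMS: it assumes the metadata directory is on a local filesystem, that files are named either `v<N>.metadata.json` or `<N>-<uuid>.metadata.json`, and that the highest version prefix marks the current file. The function name `unreferenced_metadata_files` is hypothetical. It relies on the `metadata-log` field that Iceberg table metadata uses to list previous metadata files.

```python
import json
import re
from pathlib import Path

# Assumed naming: "v3.metadata.json" or "00003-<uuid>.metadata.json".
_VERSION_RE = re.compile(r"^v?(\d+)")


def metadata_version(path):
    """Return the numeric version prefix of a metadata file name, or -1."""
    match = _VERSION_RE.match(path.name)
    return int(match.group(1)) if match else -1


def unreferenced_metadata_files(metadata_dir):
    """List metadata.json files that the latest metadata.json no longer tracks.

    Hypothetical helper: a file counts as referenced if it is the latest
    file itself or appears in the latest file's "metadata-log" entries.
    """
    files = sorted(Path(metadata_dir).glob("*.metadata.json"), key=metadata_version)
    if not files:
        return []

    latest = files[-1]
    with latest.open() as f:
        current = json.load(f)

    referenced = {latest.name}
    for entry in current.get("metadata-log", []):
        # "metadata-file" is a full location; compare by file name only.
        referenced.add(entry["metadata-file"].rsplit("/", 1)[-1])

    return [p.name for p in files if p.name not in referenced]
```

Calling `unreferenced_metadata_files("/path/to/table/metadata")` returns the names of files that are neither the current metadata nor listed in its log, which are the candidates for cleanup.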

How should we improve?

Delete the metadata.json files that are no longer needed once snapshots have expired.

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

Aireed avatar May 15 '25 06:05 Aireed

@nicochen Do you have time to fix this?

Aireed avatar May 15 '25 06:05 Aireed

@Aireed If this task is not assigned, I would like to work on it.

zhangwl9 avatar Jun 09 '25 12:06 zhangwl9

@Aireed Currently, TableMaintainer#expireSnapshots cleans up the related metadata.json files, and TableMaintainer#cleanOrphanFiles cleans up non-referenced metadata.json files. Are there still unnecessary metadata.json files left after these two cleanup jobs have run?

zhangwl9 avatar Jul 17 '25 06:07 zhangwl9

> @Aireed Currently, TableMaintainer#expireSnapshots cleans up the related metadata.json files, and TableMaintainer#cleanOrphanFiles cleans up non-referenced metadata.json files. Are there still unnecessary metadata.json files left after these two cleanup jobs have run?

Sorry for the late reply. This feature isn't needed for now, so I'm closing this issue.

Aireed avatar Jul 28 '25 02:07 Aireed