bbolt icon indicating copy to clipboard operation
bbolt copied to clipboard

add support for data file size limit

Open mattsains opened this issue 8 months ago • 5 comments

closes #928

mattsains avatar Mar 24 '25 21:03 mattsains

Hi @mattsains. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot avatar Mar 24 '25 21:03 k8s-ci-robot

@fuweid , would you be able to take a look at this PR?

mattsains avatar Mar 27 '25 16:03 mattsains

Investigating the test-windows failure, I discovered that the db.allocate function can actually cause the file size to grow through the truncate in the mmap function. Should I change the allocate function to call grow if a request to allocate beyond the end of the file is received, or should I just duplicate the max size check in allocate?

specifically, this line here causes the file size to grow: https://github.com/etcd-io/bbolt/blob/main/db.go#L1168

mattsains avatar Mar 27 '25 23:03 mattsains

@fuweid , I've added a test for making sure the db can open even if it's beyond the configured max size. I've also got all the builds to pass, so I think this is ready for a review and potentially approval

mattsains avatar Mar 28 '25 18:03 mattsains

hi @fuweid , I am awaiting feedback from you for this PR

mattsains avatar Apr 04 '25 18:04 mattsains

Thanks for the PR.

My thoughts,

  • This PR might be affecting the existing db file, which already reaches the max size. So it's a breaking change.
  • Also it updates multiple places, which complicates the implementation.
  • It's hard to accurately control the max size; instead It should be a best-effort solution.
    • FYI. etcd implements similar max size (quota) controlling using flag --quota-backend-bytes , which is best effort. Also it won't lose data (return error after persisting the data)
    • Also there may be hundreds of etcd's transactions in each bbolt's transaction. etcd controls the db quota on etcd's transaction granularity instead of bbolt's transaction level.

Overall, I agree that it's a valid requirement. However, I don't feel safe to support it in bbolt. Instead, I'd suggest to add this feature in your application, and as a best-effort approach.

ahrtr avatar Apr 07 '25 10:04 ahrtr

Could you explain what you mean by:

This PR might be affecting the existing db file, which already reaches the max size. So it's a breaking change.

The max file size does not take effect unless (1) you turn it on in the configuration, and (b) you write to a database in a way that causes it to grow

Also it updates multiple places, which complicates the implementation.

I am happy to change it to only one location, but it seems to me that there are many places in the code where the size of the file is unintentionally grown, for example when mmaping the file on windows

It's hard to accurately control the max size; instead It should be a best-effort solution.

What makes it hard? I think this PR does it, does it not? Am I missing some complexity?

I'd suggest to add this feature in your application, and as a best-effort approach.

I am really trying to make this maximum feature not be best effort, which is why I created this PR

Look forward to your comments, @ahrtr

mattsains avatar Apr 07 '25 16:04 mattsains

@ahrtr hi there, I think this is ready for re-review

mattsains avatar Apr 15 '25 15:04 mattsains

@ahrtr I've updated the PR based on your comments

mattsains avatar Apr 17 '25 18:04 mattsains

I see that you use assert in the test cases. I think require is better at most of the places. The rule is: if it doesn't make sense to continue to execute the test if a check fails, then use require; otherwise use assert.

  • github.com/stretchr/testify/assert
  • github.com/stretchr/testify/require

ahrtr avatar Apr 18 '25 09:04 ahrtr

@ahrtr I've made the changes you requested and audited my use of require vs assert. This PR is ready for re-review

mattsains avatar Apr 18 '25 16:04 mattsains

Overall looks good to me. Can you please squash the commits?

@fuweid @tjungblu PTAL, thx

ahrtr avatar Apr 21 '25 08:04 ahrtr

I've squashed the commits

mattsains avatar Apr 21 '25 16:04 mattsains

@mattsains please rebase to fix CI issue. thanks!

fuweid avatar Apr 24 '25 19:04 fuweid

@fuweid rebased. Do we need the lgtm tag and/or assigning an approver?

mattsains avatar Apr 24 '25 19:04 mattsains

@mattsains It needs @ahrtr 's approval. :)

fuweid avatar Apr 24 '25 20:04 fuweid

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, fuweid, mattsains

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Apr 25 '25 16:04 k8s-ci-robot