goleveldb icon indicating copy to clipboard operation
goleveldb copied to clipboard

leveldb: throttle manifest rebuilding

Open qianbin opened this issue 2 years ago • 6 comments

It fixes the problem like #413 .

qianbin avatar Jul 26 '22 03:07 qianbin

Hey @qianbin I was checking your fix #409 and I am wondering why it can make difference.

In your fix, it avoids recording some "Added" sstables twice. But in recovery procedure, these duplicated "Added" sstables will not affect the correctness.

// New tables.
for _, r := range r.addedTables {
	scratch := p.getScratch(r.level)
	if scratch.added == nil {
		scratch.added = make(map[int64]atRecord)
	}
	scratch.added[r.num] = r
	if scratch.deleted != nil {
		delete(scratch.deleted, r.num)
	}
}

Could you elaborate how does it fix the issue? And also theoretically we can delete the stale MANIFESTs in runtime to save storage right?

rjl493456442 avatar Jul 27 '22 03:07 rjl493456442

@rjl493456442 you're right. The manifest file is correctly written without duplicated records.

The problem is about table reference count. Here https://github.com/syndtr/goleveldb/blob/126854af5e6d8295ef8e8bee3040dd8380ae72e8/leveldb/session_util.go#L420 if we pass the newly committed rec object, it will be filled with all tables of v, and later when calling setVersion, wrong vDelta is sent, which causes wrong table reference count.

And also theoretically we can delete the stale MANIFESTs in runtime to save storage right?

I think it's a good idea.

qianbin avatar Jul 27 '22 04:07 qianbin

@qianbin Ah, right. Thanks for the explanation.

rjl493456442 avatar Jul 27 '22 04:07 rjl493456442

the last commit checks the limit on db open.

@StephenButtolph it's reverted to use for loop to double the limit.

qianbin avatar Jul 28 '22 05:07 qianbin

I think there's probably still some edge cases here when the compressed manifest size is close to (but not over) s.maxManifestFileSize. Which may cause the manifest to be written frequently. Although I'm not extremely familiar with how the manifest grows.

Specifically if:

  • The compressed file is just under s.maxManifestFileSize
  • A few operations end up increasing the (now not optimally compressed) manifest file size
  • The file is then re-written, with a new file that whose size is still under s.maxManifestFileSize.
  • Repeat until eventually the compressed manifest is actually larger than s.maxManifestFileSize.

The way I'd have thought to handle this is just doubling s.maxManifestFileSize every time the file gets re-written, regardless of the new size.

StephenButtolph avatar Aug 01 '22 16:08 StephenButtolph

@StephenButtolph Yes, the edge case can happen.

The way I'd have thought to handle this is just doubling s.maxManifestFileSize every time the file gets re-written, regardless of the new size.

This is a solution.

qianbin avatar Aug 01 '22 17:08 qianbin