
Large database truncate problem

Kleissner opened this issue 4 years ago · 3 comments

We recently started running into a problem that prevents the database from growing. Every time we call db.Put we get this error message:

truncate D:\Database\main.pix: The requested operation could not be completed due to a file system limitation

The whole database folder is 255 GB; the file main.pix is 37.3 GB. We are running on Windows Server 2019 as admin, and the disk has plenty of free space (4 TB total).

Any idea of the root cause and how to fix it?

I suppose the error message originates from here? https://github.com/akrylysov/pogreb/blob/e182fb02fbd270cf4943430543d6d2e3824c6682/file.go#L79-L86
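
For reference, the extend path looks roughly like this (my paraphrase of the linked lines, not the verbatim source): the file is grown with Truncate and the memory mapping is then extended to match.

    func (f *file) extend(size uint32) (int64, error) {
        off := f.size
        // Grow the file on disk first...
        if err := f.Truncate(off + int64(size)); err != nil {
            return 0, err
        }
        f.size += int64(size)
        // ...then extend the memory mapping to cover the new size.
        if err := f.mmap(f.size); err != nil {
            return 0, err
        }
        return off, nil
    }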

Edit: Unrelated to this problem, but the truncate call used by recoveryIterator.next takes a uint32 size. Could that lead to problems down the road for large segment files? https://github.com/akrylysov/pogreb/blob/e182fb02fbd270cf4943430543d6d2e3824c6682/file.go#L97-L107
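
A quick illustration of the uint32 concern (a standalone example, not pogreb code):

    package main

    import "fmt"

    func main() {
        // A uint32 offset tops out at 4 GiB - 1...
        fmt.Println(uint32(1<<32 - 1)) // 4294967295

        // ...so an offset into a file larger than 4 GiB doesn't fit and
        // silently wraps when converted down to uint32.
        off := int64(5) << 30    // 5 GiB
        fmt.Println(uint32(off)) // 1073741824 (1 GiB) after wrapping
    }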

Kleissner · Jan 16 '21 20:01

Could it be file fragmentation?

Googling this message finds this: https://support.assurestor.com/support/solutions/articles/16000104076-the-requested-operation-could-not-be-completed-due-to-a-file-system-limitation

  1. Compressed files are more likely to reach the limit because of the way the files are stored on disk. Compressed files require more extents to describe their layout. Also, decompressing and compressing a file increases fragmentation significantly.
  2. The limit can be reached when write operations occur to an already compressed chunk location. The limit can also be reached by a sparse file. This size limit is usually between 40 gigabytes (GB) and 90 GB for a very fragmented file.
  3. A heavily fragmented file in an NTFS file system volume may not grow beyond a certain size caused by an implementation limit in structures that are used to describe the allocations.

Kleissner · Jan 16 '21 20:01

Thanks for the bug report.

> Unrelated to this problem, but the truncate call used by recoveryIterator.next takes a uint32 size. Could that lead to problems down the road for large segment files?

Segment files can't exceed 4 GiB: https://github.com/akrylysov/pogreb/blob/master/options.go#L40. The maximum segment size is currently not configurable and is always set to 4 GiB.
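
For reference, the cap is a constant along these lines (paraphrased; the exact name and value are in the linked options.go):

    // Assumption: segments are capped so any in-segment offset fits in a uint32.
    const maxSegmentSize = uint32(1<<32 - 1) // ~4 GiB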

main.pix is the main index file. Index files use 64-bit offsets: https://github.com/akrylysov/pogreb/blob/cc107cdd2f78d7ca0ec33e853c4480a9a43e7472/index.go#L89

Windows support could definitely use more testing. I develop Pogreb on macOS and deploy it to Linux.

I'll try to reproduce the issue. I wonder if it's related to mmap? I'm working on adding an option to disable mmap.

akrylysov · Jan 17 '21 01:01

I can reproduce the error: any time db.Put gets called, it fails. I added debugging code and confirmed that the referenced extend function fails on this line in file.go:

if err := f.Truncate(off + int64(size)); err != nil {

I've added logging:

		fmt.Printf("Error offset %d size %d from f.Trunacte: %s\n", off, size, err.Error())

And the output is always:

Error offset 40108773376 size 512 from f.Truncate: truncate D:\Database\main.pix: The requested operation could not be completed due to a file system limitation
Error offset 40108773376 size 512 from f.Truncate: truncate D:\Database\main.pix: The requested operation could not be completed due to a file system limitation
Error offset 40108773376 size 512 from f.Truncate: truncate D:\Database\main.pix: The requested operation could not be completed due to a file system limitation

The offset is consistent with the file size (37.3 GB). I tried Windows' defragmentation tool without success (I assume it didn't actually defragment anything because the drive is an SSD).

Then I tried another trick: copying main.pix to a new file, deleting the old one, and renaming the new one to the original name. It worked! 🎉

So it looks like the underlying problem is that extending the file 512 bytes at a time makes NTFS allocate millions of separate extents (instead of consecutive data), and at some point it hits an internal OS limit. I will monitor the situation and check whether it fails again after another ~40 GB of growth (which might take weeks).

I guess an ugly fix would be to catch that error, temporarily close the file, and then do programmatically what I did manually: copy, delete the old file, rename, and reopen. A rough sketch follows.
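
A minimal sketch of that workaround (a hypothetical helper, not part of the pogreb API; the database would need to close the file before calling it and reopen it afterwards, since Windows won't let an open file be deleted):

    import (
        "io"
        "os"
    )

    // rewriteFile copies path to a fresh file so the filesystem allocates it
    // contiguously, then swaps it in place of the fragmented original,
    // mirroring the manual copy/delete/rename steps above.
    func rewriteFile(path string) error {
        tmp := path + ".tmp"

        src, err := os.Open(path)
        if err != nil {
            return err
        }

        dst, err := os.Create(tmp)
        if err != nil {
            src.Close()
            return err
        }

        _, err = io.Copy(dst, src)
        src.Close()
        if err != nil {
            dst.Close()
            return err
        }

        // Flush the copy to disk before swapping it in.
        if err := dst.Sync(); err != nil {
            dst.Close()
            return err
        }
        if err := dst.Close(); err != nil {
            return err
        }

        // Delete the fragmented original, then move the fresh copy into
        // place (the same order as the manual steps above).
        if err := os.Remove(path); err != nil {
            return err
        }
        return os.Rename(tmp, path)
    }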

Microsoft documented the problem here: https://support.microsoft.com/en-in/help/967351/a-heavily-fragmented-file-in-an-ntfs-volume-may-not-grow-beyond-a-cert

Kleissner · Jan 17 '21 03:01