
Added logic to divide and save block files by specific number

Open one-percent-of opened this issue 3 years ago • 5 comments

When Fabric nodes (peer, orderer) have a lot of blocks, too many files accumulate in a single directory. On some file systems, having too many files in one directory slows down file reads. These are the blockfile_xxxxxx files in the '/hyperledger/production/ledgersData/chains/chains/' directory.

Is there any way to solve this? Or can the Hyperledger Fabric foundation update the related logic?

It would be nice to add logic that splits block files into a new directory once a specific number, or a specific total size, of block files in one directory is reached.

one-percent-of avatar Jan 07 '22 06:01 one-percent-of

@manish-sethi

yacovm avatar Jan 07 '22 22:01 yacovm

@one-percent-of - Can you add some details about the OS you are using, the number of files, the total size, and the number of blocks? More importantly, which Fabric operation do you see slowing down (and by how much) as an end user?

In the code, we cut over to a new block file when the current one reaches 64MB. So, increasing the file size should be possible. I would encourage you to change this size to a larger value, experiment in your setup (OS), and report back if you see any improvement.
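To illustrate the idea, the rollover works roughly like this (a minimal Go sketch, not the actual Fabric source; the struct fields and helper names here are made up for illustration):

```go
package blockstore

import "io"

// blockfileMgr is a minimal stand-in for Fabric's block file manager;
// the real type has many more fields.
type blockfileMgr struct {
	currentWriter   io.Writer
	currentFileSize int
}

const maxBlockfileSize = 64 * 1024 * 1024 // default cut-over size; a larger value means fewer files per directory

// moveToNextFile would close blockfile_N and open blockfile_N+1
// (hypothetical helper, elided here).
func (mgr *blockfileMgr) moveToNextFile() error { return nil }

// appendBlock writes serialized block bytes to the current block file and
// rolls over to a new blockfile_xxxxxx once the size limit would be exceeded.
func (mgr *blockfileMgr) appendBlock(blockBytes []byte) error {
	if mgr.currentFileSize+len(blockBytes) > maxBlockfileSize {
		if err := mgr.moveToNextFile(); err != nil {
			return err
		}
		mgr.currentFileSize = 0
	}
	n, err := mgr.currentWriter.Write(blockBytes)
	if err != nil {
		return err
	}
	mgr.currentFileSize += n
	return nil
}
```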

manish-sethi avatar Jan 10 '22 14:01 manish-sethi

@manish-sethi The OS is Linux (CentOS 7), and the file system is XFS.

Currently, the number of blocks is about 30 million.

Currently, the number of blockfile_xxxxxx files in the '/hyperledger/production/ledgersData/chains/chains/' directory is about 12000.

The total size is about 755 GB.

I saw an article saying that once a directory holds about 15000 files, the OS's file-read performance decreases.

I am not an OS expert, so I don't trust that statement 100%. However, Fabric currently keeps creating block files in a single directory. If it keeps growing like this, I think the number of files will eventually become very large and OS performance will deteriorate.

Fabric's own work does not slow down, but I think the performance of the server (OS) running the Fabric node (peer, orderer) will become a problem.

Therefore, I suggest either 1) making the block file size configurable to values larger than 64MB, or 2) rolling over to a new directory once one directory holds, say, 10000 block files, similar to log rotation (see the sketch below).
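As a rough illustration of option 2, a directory-rotation scheme could map each block file number to a subdirectory. This is a hypothetical sketch, not existing Fabric code; names like filesPerDir and blockfilePath are made up for illustration:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// filesPerDir caps how many block files are kept in a single directory,
// similar to log rotation.
const filesPerDir = 10000

// blockfilePath returns a sharded path such as
// chains/mychannel/000001/blockfile_012345 instead of keeping every
// blockfile_xxxxxx in one flat directory.
func blockfilePath(chainsDir, ledgerID string, fileNum int) string {
	subdir := fmt.Sprintf("%06d", fileNum/filesPerDir)
	fileName := fmt.Sprintf("blockfile_%06d", fileNum)
	return filepath.Join(chainsDir, ledgerID, subdir, fileName)
}

func main() {
	// File 12345 would land in subdirectory 000001 (12345 / 10000 = 1).
	fmt.Println(blockfilePath("/hyperledger/production/ledgersData/chains/chains", "mychannel", 12345))
}
```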

From the OS side, can you tell me up to how many files an XFS directory can hold before performance becomes a problem?

Please correct me if I have any incorrect information. Or, if there is something I need to check, please let me know.

one-percent-of avatar Jan 11 '22 08:01 one-percent-of

Is there a measurable impact to Fabric that would warrant a change? Please provide specific measured impact.
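One way to produce such a measurement would be to time listing and opening the block files on the affected node, for example (a rough hypothetical Go sketch, not an official Fabric tool; adjust the directory path to your channel's directory):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// Times how long it takes to list and open the blockfile_* files in a
// chains directory, to check whether a large flat directory slows reads.
func main() {
	// "mychannel" is a placeholder for your channel's ledger directory.
	dir := "/hyperledger/production/ledgersData/chains/chains/mychannel"

	start := time.Now()
	matches, err := filepath.Glob(filepath.Join(dir, "blockfile_*"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("listed %d files in %v\n", len(matches), time.Since(start))

	start = time.Now()
	for _, m := range matches {
		f, err := os.Open(m)
		if err != nil {
			panic(err)
		}
		f.Close()
	}
	fmt.Printf("opened %d files in %v\n", len(matches), time.Since(start))
}
```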

denyeart avatar Jan 31 '22 13:01 denyeart

Hi, we use the CentOS 7 OS and the XFS file system. This is the hardware spec (filesystem usage):

Filesystem   Size  Used  Avail  Use%
/dev/sdc3    414G   12G   403G    3%
devtmpfs      47G     0    47G    0%
tmpfs         47G     0    47G    0%
tmpfs         47G  2.6G    45G    6%
tmpfs         47G     0    47G    0%
/dev/loop0   1.5M  1.5M      0  100%
/dev/sdd1    7.0T  934G   6.1T   14%
/dev/loop1    91M   91M      0  100%
/dev/loop2    92M   92M      0  100%
/dev/sdc2    494M  202M   293M   41%
/dev/sdc1    500M   12M   489M    3%
tmpfs        9.4G   12K   9.4G    1%
tmpfs        9.4G     0   9.4G    0%
overlay      7.0T  934G   6.1T   14%
overlay      7.0T  934G   6.1T   14%
overlay      7.0T  934G   6.1T   14%
tmpfs        9.4G     0   9.4G    0%
tmpfs        9.4G     0   9.4G    0%

I've read a few papers about the XFS file system. They say that XFS can hold many files in one directory, but I wonder whether there will be performance issues if block files keep being generated within our hardware resources. From the papers, I understood that XFS does not statically allocate file inodes; instead, it creates allocation groups and allocates inodes dynamically as needed, managing them internally. So the number of files in a directory can keep growing dynamically. Does this mean we don't have to worry about performance issues as the number of block files increases?

traveloving2030 avatar Apr 28 '22 03:04 traveloving2030