
Descriptive details on allocation repair

Open lpoli opened this issue 2 years ago • 8 comments

We have a client-side consensus mechanism to decide whether an operation is considered okay (the client is responsible for handling it).

The basic flow: first, the client sends the operation request to the blobbers, which store it temporarily. If consensus is met, the client sends a commit request to the succeeding blobbers. The commit request actually applies the changes requested in the first step, and it also carries a writemarker.

To consider an operation okay, we need data shards + 1 successful operations. For example, for an allocation with 10 data shards and 6 parity shards, any file operation that succeeds on at least 11 shards is considered successful. When this happens, the allocation needs repair, because we want to maintain the parity shards so the data is as recoverable as possible.
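
The threshold above can be sketched as a simple check (illustrative names, not the actual gosdk API):

```go
package main

import "fmt"

// isConsensusMet reports whether an operation that succeeded on
// successCount blobbers meets the data+1 threshold described above.
func isConsensusMet(dataShards, successCount int) bool {
	return successCount >= dataShards+1
}

func main() {
	// 10 data + 6 parity shards: 11 successes are enough.
	fmt.Println(isConsensusMet(10, 11)) // true
	fmt.Println(isConsensusMet(10, 10)) // false
}
```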

Now consider a rename operation on a file. Say the rename request succeeds on all 16 blobbers. We then move to the second step, the commit. If 8 blobbers succeed and 8 fail, the operation is considered failed, but 8 blobbers hold the old name and 8 hold the new one. To the client, the file is effectively lost.

In the case above, we try to undo the change immediately to minimize the failure's effect on the allocation. The undo for each operation is:

  1. Upload --> Delete the file on the succeeding blobbers
  2. Update --> The blobber should also hold the original content temporarily so the update can be undone
  3. Rename --> Rename the file back to its old name on the succeeding blobbers
  4. Copy --> Delete the copied file
  5. Move --> Similar to rename. This operation is a combination of copy + delete; we should make it atomic.
  6. Delete --> There is no undo. The client cannot find the file anyway, but repair should delete it.
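
The undo table above can be sketched as a dispatch (the operation names and descriptions are illustrative, not actual gosdk identifiers):

```go
package main

import "fmt"

// undoFor maps each operation to its compensating action, as listed above.
func undoFor(op string) string {
	switch op {
	case "upload":
		return "delete the file on succeeding blobbers"
	case "update":
		return "restore the original content held temporarily by the blobber"
	case "rename":
		return "rename the file back to its old name on succeeding blobbers"
	case "copy":
		return "delete the copied file"
	case "move":
		return "undo the copy, then undo the delete (copy + delete, atomic)"
	case "delete":
		return "no undo; repair deletes the file"
	default:
		return "unknown operation"
	}
}

func main() {
	fmt.Println(undoFor("rename"))
}
```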

We would require the following implementation to achieve proper repair functionality.

FileMetaRoot

It is a hash computed over all file metadata. Unlike the allocation root, the FileMetaRoot will be the same across all blobbers. The following file-meta fields should be included:

  1. Path
  2. Type
  3. Size
  4. ActualFileSize
  5. ActualFileHash
  6. Child hash (hash of children's metadata hash)

FileMetaRoot should be calculated the same way as the allocationRoot.

Note: FileMetaRoot will be used to determine whether repair is required
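
A minimal sketch of how such a deterministic FileMetaRoot could be computed over the fields above. The `FileMeta` struct, SHA-256, and the field delimiter are illustrative assumptions, not the actual gosdk types; the key property is that sorting children by path makes every blobber compute the same value:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// FileMeta mirrors the fields listed above (hypothetical struct).
type FileMeta struct {
	Path           string
	Type           string
	Size           int64
	ActualFileSize int64
	ActualFileHash string
	Children       []*FileMeta
}

// MetaHash hashes a node's own fields plus the hashes of its children,
// sorted by path so the result is order-independent and identical on
// every blobber.
func (m *FileMeta) MetaHash() string {
	sort.Slice(m.Children, func(i, j int) bool {
		return m.Children[i].Path < m.Children[j].Path
	})
	h := sha256.New()
	fmt.Fprintf(h, "%s:%s:%d:%d:%s",
		m.Path, m.Type, m.Size, m.ActualFileSize, m.ActualFileHash)
	for _, c := range m.Children {
		h.Write([]byte(c.MetaHash()))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := &FileMeta{Path: "/", Type: "d", Children: []*FileMeta{
		{Path: "/a", Type: "f", Size: 1, ActualFileSize: 1, ActualFileHash: "h1"},
		{Path: "/b", Type: "f", Size: 2, ActualFileSize: 2, ActualFileHash: "h2"},
	}}
	b := &FileMeta{Path: "/", Type: "d", Children: []*FileMeta{
		{Path: "/b", Type: "f", Size: 2, ActualFileSize: 2, ActualFileHash: "h2"},
		{Path: "/a", Type: "f", Size: 1, ActualFileSize: 1, ActualFileHash: "h1"},
	}}
	fmt.Println(a.MetaHash() == b.MetaHash()) // true: child order does not matter
}
```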

Writemarker Modification

It is also important to know which operation caused a blobber's FileMetaRoot to differ from the other blobbers'. We need to add an operation field to the writemarker. The new writemarker would look like:

type WriteMarker struct {
	AllocationRoot         string
	PreviousAllocationRoot string
	AllocationID           string
	Size                   int64
	BlobberID              string
	Timestamp              common.Timestamp
	ClientID               string
	Signature              string

	Path      string
	FileID    int // Some file id that will be unique for its lifetime
	Operation string
}

FileID seems to be a must; otherwise we would not know which file path to operate on.

With fileMetaRoot and the writemarker changes, we can list each blobber's writemarkers, walk back until all blobbers share the same fileMetaRoot, and start the repair from that point.

If one blobber has fileMetaRoots h1, h2, h3, h4, h5 and another has h1, h2, h3, then repair should bring the second blobber first to h4 and then to h5. However, there will be some exceptions. For example, if a rename is still pending on one blobber but the file was later deleted, we do not need to rename; we can delete the file directly. Again, fileID is important for repair to be functional.
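
The walk described above, assuming each blobber's fileMetaRoot history is a linear list of hashes, could be sketched as (hypothetical helper names):

```go
package main

import "fmt"

// lastCommonIndex returns the latest index at which two blobbers'
// fileMetaRoot histories agree, or -1 if they share no common prefix.
func lastCommonIndex(a, b []string) int {
	i := 0
	for i < len(a) && i < len(b) && a[i] == b[i] {
		i++
	}
	return i - 1
}

// pendingRoots lists the fileMetaRoots the lagging blobber still has
// to attain, in order, by replaying the corresponding writemarkers.
func pendingRoots(full, lagging []string) []string {
	return full[lastCommonIndex(full, lagging)+1:]
}

func main() {
	full := []string{"h1", "h2", "h3", "h4", "h5"}
	lagging := []string{"h1", "h2", "h3"}
	fmt.Println(pendingRoots(full, lagging)) // [h4 h5]
}
```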

Note: Writemarker changes will make GDPR reporting accurate and effective

lpoli avatar Sep 12 '22 05:09 lpoli

The idea of fileID seems at odds with a distributed system imo. A file could be uploaded, renamed, moved, updated etc. At what point is it no longer the same file?

For move or rename operations, it is just a metadata change, so it has very little impact to temporarily accommodate more than one instance. In this case, knowing the file content hash can be useful regardless of its current path, albeit an allocation could have multiple instances of an identical file.

But for file updates, the impact on blobbers is potentially significant. A fairer way of doing this could be incremental versions of a file: the user pays for both versions until it is satisfied that there is sufficient consensus to delete the old version.

Also consider the example of a higher EC ratio, like 10/30. There could be an instance where 11 data parts are successfully updated but 19 are not.

sculptex avatar Sep 13 '22 20:09 sculptex

@sculptex Your point itself directs us towards having fileID.

Think about this: if there are two operations, rename and update, on a file, then it will not be repairable because the file is effectively lost between the two writemarkers. Also, we cannot rely on the content hash because, as you said, there can be identical content and we would not know which file to update.

FileID should be provided from the client side, and it should be included in the writemarker root hash calculation as well.

lpoli avatar Sep 14 '22 03:09 lpoli

Sure, if fileID is a client-provided value that could work, but think of a sync scenario where the same client is uploading from multiple points; there could be a clash. One option, though, could be that the ID is random rather than sequential.

sculptex avatar Sep 14 '22 13:09 sculptex

I agree that concurrent updates from different devices would be a concern.

Any worry about the size of the random FileID? If so, another option might be to assign a prefix to each device, so that there would be no clashes if files were being created from two different devices from the same client.

taustin avatar Sep 15 '22 08:09 taustin

Even a random 32-bit fileID would give a negligible chance of collision. (If one already exists, another is chosen.)
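
A sketch of the random-with-retry scheme suggested here; the in-memory `used` set is a stand-in for whatever store of existing fileIDs a client or blobber would actually consult:

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
)

// newFileID draws random 32-bit IDs until one is unused, then marks
// it as taken.
func newFileID(used map[uint32]bool) uint32 {
	var buf [4]byte
	for {
		if _, err := rand.Read(buf[:]); err != nil {
			panic(err) // a crypto/rand failure is unrecoverable here
		}
		id := binary.BigEndian.Uint32(buf[:])
		if !used[id] {
			used[id] = true
			return id
		}
	}
}

func main() {
	used := map[uint32]bool{}
	a := newFileID(used)
	b := newFileID(used)
	fmt.Println(a != b) // true: the second draw skips the first ID
}
```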

The device prefix (well, a random starting point for each device that would be incremented) was my first thought, but then that fileID counter has to be stored somewhere.

Also consider simultaneous sync from 2 devices with the same file path: there could be two competing instances of the same file, each with a different fileID.

sculptex avatar Sep 15 '22 09:09 sculptex

I think the writemarker locking mechanism can help here, so we can simply use an incremental fileID starting from 1. Also, we are only concerned while uploading files and creating directories; for other file operations we use the existing fileID.
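
A sketch of a sequential fileID counter; the mutex stands in for the writemarker locking mechanism mentioned above, which serializes commits per allocation:

```go
package main

import (
	"fmt"
	"sync"
)

// fileIDCounter hands out sequential fileIDs starting from 1.
// Illustrative only; in practice the counter would live with the
// allocation state guarded by the writemarker lock.
type fileIDCounter struct {
	mu   sync.Mutex
	next uint64
}

// Next returns the following fileID under the lock, so concurrent
// uploads never receive the same ID.
func (c *fileIDCounter) Next() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.next++
	return c.next
}

func main() {
	c := &fileIDCounter{}
	fmt.Println(c.Next(), c.Next()) // 1 2
}
```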

lpoli avatar Sep 15 '22 17:09 lpoli

@taustin So are we good to go?

lpoli avatar Sep 16 '22 02:09 lpoli

So the idea is to use the writemarker locking mechanism to make sure that we don't have conflicts on the fileID numbering? If so, I'm good with that.

taustin avatar Sep 16 '22 18:09 taustin

Use repair to add/replace blobber

guruhubb avatar Mar 23 '23 06:03 guruhubb

Duplicates two-phase commit

dabasov avatar Apr 17 '23 18:04 dabasov