leofs
[leo_object_storage][tool] Retrieve objects from AVS
This is especially useful for recovering from operational mistakes.
How to implement
Since we already have the diagnose feature, which can retrieve the offsets of every object's metadata and body, we can read the objects at those offsets and extract them into another format that users can easily use for recovery.
For example:
$ leofs-avs-dump -o outdir -n 1024 # writes outdir/${num}/; each directory holds up to `-n` files
### each object name is stored as a URL-encoded filename
### each object body is stored as the file's content
outdir/
├── 1
│   ├── urlencoded_file_path_1
│   ├── ...
│   └── urlencoded_file_path_1024
├── ...
└── N
    ├── urlencoded_file_path_other
    ├── ...
    └── ...
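To make the proposed layout concrete, here is a minimal Python sketch of the directory-splitting and filename-encoding step only. It assumes the objects have already been read out of the AVS as `(key, body)` pairs; the function name `dump_objects` and its parameters are hypothetical and not part of any existing LeoFS tool.

```python
import os
from urllib.parse import quote


def dump_objects(objects, outdir, max_per_dir=1024):
    """Write (key, body) pairs into outdir/1, outdir/2, ... with at most
    max_per_dir files per directory. Returns the list of written paths.

    Hypothetical helper: reading the pairs out of the AVS is out of scope.
    """
    written = []
    for index, (key, body) in enumerate(objects):
        # Directory numbers start at 1 and advance every max_per_dir objects.
        subdir = os.path.join(outdir, str(index // max_per_dir + 1))
        os.makedirs(subdir, exist_ok=True)
        # safe="" also encodes "/", so the whole key becomes a single filename.
        path = os.path.join(subdir, quote(key, safe=""))
        with open(path, "wb") as f:
            f.write(body)
        written.append(path)
    return written
```

Two things a real tool would probably need to decide: what to do when the same key appears more than once (in this sketch the later body overwrites the earlier one), and how to handle encoded names that exceed the filesystem's filename length limit.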
This structure lets users running in REST mode recover files with a command like:
find outdir -type f -print0 | xargs -0 -I % sh -c 'curl -X PUT "http://leo_gateway:8080/bucket/$(basename "$1")" --data-binary "@$1"' _ %
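The same recovery loop can be sketched in Python, which makes it easier to do a dry run before uploading anything. The helper names `plan_uploads` and `upload` are hypothetical, and the sketch assumes the gateway accepts the URL-encoded filename as the object path, which has not been verified here.

```python
import os
import urllib.request


def plan_uploads(outdir, base_url):
    """Yield (url, path) for every dumped file under outdir."""
    for root, _dirs, files in os.walk(outdir):
        for name in sorted(files):
            # The filename is already URL-encoded, so it is appended as-is.
            yield base_url.rstrip("/") + "/" + name, os.path.join(root, name)


def upload(url, path):
    """PUT one file to the gateway and return the HTTP status code.

    Reads the whole file into memory, which is fine for a sketch only.
    """
    with open(path, "rb") as f:
        request = urllib.request.Request(url, data=f.read(), method="PUT")
    with urllib.request.urlopen(request) as response:
        return response.status
```

Printing the pairs from `plan_uploads("outdir", "http://leo_gateway:8080/bucket")` first, and only then calling `upload(url, path)` on each, gives a chance to check the target URLs before writing to the cluster.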
Suggestion from @windkit, quoted from https://groups.google.com/d/msg/leoproject_leofs/tLgNlvK7Eps/N-c7a2XdDwAJ
Another solution for recovering files from an AVS stored on a detached node.
prerequisites
- RINGs are versioned (this more or less exists already, via the RING hash)
- Each node commits its RING version to persistent storage
- Central storage of RING versions
procedure
1. The crashed node comes back up
2. The crashed node notices the change of RING version
3. The crashed node computes the difference between its version and the cluster version
4. The crashed node pushes the changed objects (steps 3 and 4 could also be done as a pull)
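The version check and diff in the procedure above can be illustrated with a toy consistent-hash ring. This is only a sketch under loud assumptions: the ring is modelled as a sorted list of `(position, node)` pairs with a single owner per key, SHA-256 stands in for whatever hash the RING really uses, and none of the names below (`key_hash`, `ring_version`, `owner`, `objects_to_push`) come from LeoFS.

```python
import bisect
import hashlib


def key_hash(key):
    """Map a string to an integer ring position (SHA-256 is illustrative)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")


def ring_version(ring):
    """Stand-in for the RING hash: a digest over the ring's contents."""
    return hashlib.sha256(repr(ring).encode("utf-8")).hexdigest()


def owner(ring, key):
    """Return the node of the first entry at or after the key's position,
    wrapping around to the start of the ring."""
    positions = [position for position, _node in ring]
    index = bisect.bisect_left(positions, key_hash(key)) % len(ring)
    return ring[index][1]


def objects_to_push(local_keys, local_ring, cluster_ring):
    """Return {target_node: [keys]} for keys whose owner differs between
    the node's stored ring and the cluster's current ring."""
    if ring_version(local_ring) == ring_version(cluster_ring):
        return {}  # same version: nothing changed while the node was down
    plan = {}
    for key in local_keys:
        new_owner = owner(cluster_ring, key)
        if owner(local_ring, key) != new_owner:
            plan.setdefault(new_owner, []).append(key)
    return plan
```

A pull-based variant would run the same comparison on the new owner instead, asking the returning node for the keys listed under its own name.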
A brief comparison of pros and cons:
| Impl | Complexity | Maintainability | Performance | Disk Space | Network |
|---|---|---|---|---|---|
| As an external tool derived from diagnose | Less | Great | Poor | 2x temporarily | Consumes bandwidth not only st <-> st but also gw <-> st |
| As a new feature of manager/storage | More | Low | Great | No extra space | Consumes bandwidth only st <-> st |
So it's kind of a trade-off between runtime efficiency and development efficiency.
The external tool is a must for me, and the latter should be an extra.