API for localstore scanning

Open nikipapadatou opened this issue 1 year ago • 3 comments

A new tool has been implemented that scans a localstore folder and reports, for each file belonging to pinned content, the number of corrupted and invalid chunks against the total number of chunks.

The tool is here: https://github.com/ethersphere/bee/tree/feat-integrity-cmd

We need an API to serve this tool to the users, along with its relevant documentation:

  • go run $(pwd)/cmd/bee db validate-pin --data-dir /path/to/localstore
    This will produce a csv file with all the pins and their stats.

To select the ones that are problematic run:

  • awk '$1 > 0 || $2 > 0' address.csv > invalid.csv
    The file will contain the addresses of pins that the user might want to unpin/re-upload. The original files can be found by their hash.
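
For anyone who would rather do the filtering outside of awk, roughly the same selection can be sketched in Python. This is only an illustration: the function name select_problem_pins is made up, and it assumes, like the awk command, that the first two whitespace-separated fields of each line are numeric counters. The actual layout of address.csv is not described here, so the split would need adjusting if the file turns out to be comma-separated.

```python
def select_problem_pins(lines):
    """Return lines whose first or second field is greater than zero.

    Approximates awk '$1 > 0 || $2 > 0' under the ASSUMPTION that fields
    are whitespace-separated and the first two are numeric counters.
    """
    selected = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            first, second = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # skip a header or any other non-numeric row
        if first > 0 or second > 0:
            selected.append(line)
    return selected
```

Unlike awk, this sketch drops rows it cannot parse as numbers instead of comparing them as strings.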

nikipapadatou, Feb 05 '24 08:02

I simply expanded the /pins/{reference} endpoint to include detailed information on the chunks involved in the pin set.

curl http://192.168.10.36:11633/pins/0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc | jq
{
  "reference": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
  "chunkCount": 5,
  "chunks": [
    {
      "address": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "4bd95ce93becc5e9fc1bf3f2304c7b2863bbbc9479fe248f319a809faf166a60",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "57b2c67a935478cbd5b72332a942751a9536931e51e73a00a90e009b9ea959c1",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "9b7f230858aabf6cf3e7b24effe56f821fcbe4989d3b12652865a83e5ad65d85",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "ec38e1764ec149a409a5bd1d4cd4e8c63d6148b00a899f481e954f0c230e94ac",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    }
  ]
}

My hacked API also accepts a ?repair=true option that attempts to repair the refCnt for local chunks. See https://github.com/ldeffenb/bee/blob/a3437b45cf0dfcad18a6817c22d352a17557a217/pkg/api/pin.go#L188 and the structures just before that line. It also requires various hacks in the database layers to get the information, but they're all in my 1.18.2-cumulative-hacks branch.

ldeffenb, Feb 05 '24 13:02

Here's an example of a multi-chunk pinned reference with a chunk missing in the middle. Notice the err: true and local: false on the next-to-last chunk. If it's pinned, it should be local, obviously.

curl http://192.168.10.36:11633/pins/79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e | jq
{
  "reference": "79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e",
  "chunkCount": 4,
  "chunks": [
    {
      "address": "29e7f4854c447ea7a86fa161e6a59c06f82f2fa1a9b75cdf375c26c7689de3f3",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "da99078b366a86b5845e82388cdbc4fb31003155a6cc74c8d3f3afb8984583f4",
      "err": true,
      "refCnt": 0,
      "local": false,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "eccaf183984bb9b4f75367048c39de4716da62c68f4ca491aea650ae8ba53486",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    }
  ]
}
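
A client could flag broken pins from this kind of response with a few lines of code. The sketch below is only an illustration and not part of the branch: the function name find_suspect_chunks is hypothetical, and it relies solely on the err and local fields visible in the output above, treating a chunk as suspect when err is true or local is false.

```python
import json


def find_suspect_chunks(pin_json):
    """Return addresses of chunks reported with err=true or local=false.

    pin_json is the JSON text returned by the expanded /pins/{reference}
    call shown above; only fields visible in that output are used.
    """
    pin = json.loads(pin_json)
    return [
        chunk["address"]
        for chunk in pin.get("chunks", [])
        if chunk.get("err") or not chunk.get("local", False)
    ]
```

Fed the response above, this would return only the da99078b... address.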

ldeffenb, Feb 05 '24 14:02

Oh, and I have also hacked the /pins API itself to handle ?limit=L&offset=O, similar to what /tags has. Otherwise I can't even query all of my pins: the node runs out of memory and panics.
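
Walking through all pins with that kind of paging could look roughly like this. It is a sketch under assumptions: fetch_page is a hypothetical caller-supplied function that performs the GET /pins?limit=L&offset=O request and returns the list of references on that page. The response layout of the paged endpoint isn't shown in this thread, so the HTTP and JSON handling is deliberately left out.

```python
def iter_all_pins(fetch_page, limit=1000):
    """Yield pin references page by page instead of loading them all at once.

    fetch_page(limit, offset) is a HYPOTHETICAL callable that wraps
    GET /pins?limit=L&offset=O and returns a list of references.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # a short (or empty) page means there is nothing left
        offset += limit
```

Because it is a generator, the caller only ever holds one page of references in memory at a time.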

ldeffenb, Feb 05 '24 14:02