HDDS-10634. Recon - listKeys API for listing of OBS, FSO and Legacy bucket keys with filters.

Open devmadhuu opened this issue 10 months ago • 6 comments

What changes were proposed in this pull request?

This PR adds a new Recon API that lists keys in OBS and Legacy buckets with filters, and lists keys in FSO buckets recursively in a flat structure.

New API:

api/v1/namespace/listKeys?startPrefix=/volume1/obs-bucket/&count=105

Default values of API parameters when not provided:

1. replicationType - empty string; no filter is applied, so all keys are listed irrespective of replication type.
2. creationTime - empty string; no filter is applied, so keys are listed irrespective of age. Otherwise, only keys created on or after the provided creationTime are listed.
3. keySize - 0 bytes, which means all keys larger than zero bytes are listed, effectively all keys.
4. startPrefix - /
5. count - 1000
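
The defaults above can be read as "a filter that is not supplied does nothing". The following is a minimal in-memory sketch of that behavior, not the Recon implementation: the function name `apply_filters`, its argument names, and the sample data are hypothetical, and treating keySize as an inclusive lower bound is an assumption.

```python
def apply_filters(keys, replication_type="", creation_time_ms=None,
                  key_size=0, count=1000):
    """Return at most `count` keys that pass every supplied filter.

    Hypothetical helper: an empty replication_type or a missing
    creation_time_ms means that filter is skipped entirely.
    """
    selected = []
    for key in keys:
        if len(selected) >= count:
            break  # page is full
        if replication_type and key["replicationType"] != replication_type:
            continue  # replication filter supplied and not matched
        if creation_time_ms is not None and key["creationTime"] < creation_time_ms:
            continue  # keep only keys created on or after the threshold
        if key["size"] < key_size:
            continue  # assumption: keySize is an inclusive lower bound
        selected.append(key)
    return selected


# Two entries shaped like the response entries in this description.
sample_keys = [
    {"path": "key6", "size": 10485760, "replicationType": "RATIS",
     "creationTime": 1712680870182},
    {"path": "key7", "size": 10485760, "replicationType": "EC",
     "creationTime": 1713262187049},
]

all_keys = apply_filters(sample_keys)
ec_only = apply_filters(sample_keys, replication_type="EC")
recent = apply_filters(sample_keys, creation_time_ms=1713000000000)
first_one = apply_filters(sample_keys, count=1)
```

With no arguments every key is returned, which matches the "effectively all" wording of the defaults.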

Behavior of API: For an OBS bucket, the API lists up to count keys under the provided path. Pagination is supported through the count parameter.

Get List of All Keys: GET /api/v1/namespace/listKeys

 API params:
  1. replicationType - filter for RATIS or EC replicated keys.
  2. creationDate - string in "MM-dd-yyyy HH:mm:ss" format.
  3. startPrefix
  4. count
  5. keySize
  6. recursive - lists keys recursively for FSO buckets.
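
As an illustration of how these parameters combine into a request, here is a small sketch that builds the query string. The base address `http://localhost:9888` and the helper name `list_keys_url` are assumptions for the example; the parameter names and the date format are taken from the list above.

```python
from datetime import datetime
from urllib.parse import urlencode

# Assumption: Recon HTTP address; adjust to the actual deployment.
RECON_BASE = "http://localhost:9888"


def list_keys_url(start_prefix="/", count=1000, replication_type=None,
                  creation_date=None, key_size=None, recursive=None,
                  base=RECON_BASE):
    """Build a listKeys URL, omitting every filter that was not supplied."""
    params = {"startPrefix": start_prefix, "count": count}
    if replication_type:
        params["replicationType"] = replication_type
    if creation_date is not None:
        # "MM-dd-yyyy HH:mm:ss" expressed as a strftime pattern.
        params["creationDate"] = creation_date.strftime("%m-%d-%Y %H:%M:%S")
    if key_size is not None:
        params["keySize"] = key_size
    if recursive is not None:
        params["recursive"] = str(bool(recursive)).lower()
    # safe="/" keeps the path separators in startPrefix readable.
    return f"{base}/api/v1/namespace/listKeys?{urlencode(params, safe='/')}"


obs_url = list_keys_url("/volume1/obs-bucket", count=105)
fso_url = list_keys_url("/volume1/fso-bucket", recursive=True)
filtered_url = list_keys_url(
    "/volume1/obs-bucket",
    replication_type="EC",
    creation_date=datetime(2024, 4, 9, 16, 4, 0),
)
print(obs_url)
print(fso_url)
print(filtered_url)
```

Note that `urlencode` encodes the space in the date as `+` and the colons as `%3A`, which is the usual query-string encoding.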

**Input Request for OBS bucket:**

`api/v1/namespace/listKeys?startPrefix=/volume1/obs-bucket&count=105`

**Output Response:**

```
    {
        "status": "OK",
        "path": "/volume1/obs-bucket",
        "size": 73400320,
        "sizeWithReplica": 81788928,
        "subPathCount": 7,
        "totalKeyCount": 7,
        "lastKey": "/volume1/obs-bucket/key7",
        "subPaths": [
            {
                "key": true,
                "path": "key1",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680854675,
                "modificationTime": 1712680855695
            },
            {
                "key": true,
                "path": "key1/key2",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680857753,
                "modificationTime": 1712680858666
            },
            {
                "key": true,
                "path": "key1/key2/key3",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680860801,
                "modificationTime": 1712680861870
            },
            {
                "key": true,
                "path": "key4",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680863937,
                "modificationTime": 1712680864899
            },
            {
                "key": true,
                "path": "key5",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680866996,
                "modificationTime": 1712680868187
            },
            {
                "key": true,
                "path": "key6",
                "size": 10485760,
                "sizeWithReplica": 10485760,
                "isKey": true,
                "replicationType": "RATIS",
                "creationTime": 1712680870182,
                "modificationTime": 1712680871044
            },
            {
                "key": true,
                "path": "key7",
                "size": 10485760,
                "sizeWithReplica": 18874368,
                "isKey": true,
                "replicationType": "EC",
                "creationTime": 1713262187049,
                "modificationTime": 1713262188135
            }
        ],
        "sizeDirectKey": 73400320
    }
```

**Input Request for FSO bucket:**

`api/v1/namespace/listKeys?startPrefix=/volume1/fso-bucket&recursive=true`

**Output Response:**

```
      {
                "status": "OK",
                "path": "/volume1/fso-bucket",
                "size": 62914560,
                "sizeWithReplica": 188743680,
                "subPathCount": 6,
                "totalKeyCount": 6,
                "lastKey": "/-9223372036854775552/-9223372036854775040/-9223372036854774525/testfile",
                "subPaths": [
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/file1",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680835581,
                        "modificationTime": 1712680836508
                    },
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/testfile",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680832118,
                        "modificationTime": 1712680833528
                    },
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/dir2/file1",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680841158,
                        "modificationTime": 1712680842040
                    },
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/dir2/testfile",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680838434,
                        "modificationTime": 1712680839254
                    },
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/dir2/dir3/file1",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680847287,
                        "modificationTime": 1712680850660
                    },
                    {
                        "key": true,
                        "path": "volume1/fso-bucket/dir1/dir2/dir3/testfile",
                        "size": 10485760,
                        "sizeWithReplica": 31457280,
                        "isKey": true,
                        "replicationType": "RATIS",
                        "creationTime": 1712680843959,
                        "modificationTime": 1712680844890
                    }
                ],
                "sizeDirectKey": 0
            }
```

**Input Request for LEGACY bucket:**

`api/v1/namespace/listKeys?startPrefix=/volume1/legacy-bucket`

**Output Response:**

        {
            "status": "OK",
            "path": "/volume1/legacy-bucket",
            "size": 157286400,
            "sizeWithReplica": 157286400,
            "subPathCount": 6,
            "totalKeyCount": 6,
            "lastKey": "/volume1/legacy-bucket/key6",
            "subPaths": [
                {
                    "key": true,
                    "path": "key1",
                    "size": 10485760,
                    "sizeWithReplica": 10485760,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680878239,
                    "modificationTime": 1712680879179
                },
                {
                    "key": true,
                    "path": "key1/key2",
                    "size": 41943040,
                    "sizeWithReplica": 41943040,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680881331,
                    "modificationTime": 1712680882611
                },
                {
                    "key": true,
                    "path": "key1/key2/key3",
                    "size": 10485760,
                    "sizeWithReplica": 10485760,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680884664,
                    "modificationTime": 1712680885522
                },
                {
                    "key": true,
                    "path": "key4",
                    "size": 41943040,
                    "sizeWithReplica": 41943040,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680887558,
                    "modificationTime": 1712680888590
                },
                {
                    "key": true,
                    "path": "key5",
                    "size": 10485760,
                    "sizeWithReplica": 10485760,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680890644,
                    "modificationTime": 1712680891447
                },
                {
                    "key": true,
                    "path": "key6",
                    "size": 41943040,
                    "sizeWithReplica": 41943040,
                    "isKey": true,
                    "replicationType": "RATIS",
                    "creationTime": 1712680907002,
                    "modificationTime": 1712680908210
                }
            ],
            "sizeDirectKey": 157286400
        }
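
The three sample responses share one shape, and in each of them the top-level `size` and `sizeWithReplica` equal the sums over `subPaths`. The sketch below shows how a client might check that; it is an observation about these samples, not a documented guarantee of the API. The embedded JSON is a trimmed copy of the OBS response (two of the seven entries, with totals recomputed for just those two), and the function names are hypothetical.

```python
import json
from collections import Counter

# Trimmed, illustrative copy of the OBS response: only key6 and key7,
# with top-level totals recomputed for these two entries.
RESPONSE_TEXT = """
{
  "status": "OK",
  "path": "/volume1/obs-bucket",
  "size": 20971520,
  "sizeWithReplica": 29360128,
  "subPathCount": 2,
  "totalKeyCount": 2,
  "lastKey": "/volume1/obs-bucket/key7",
  "subPaths": [
    {"path": "key6", "size": 10485760, "sizeWithReplica": 10485760,
     "isKey": true, "replicationType": "RATIS"},
    {"path": "key7", "size": 10485760, "sizeWithReplica": 18874368,
     "isKey": true, "replicationType": "EC"}
  ]
}
"""


def summarize(response):
    """Recompute totals from the individual subPaths entries."""
    sub_paths = response["subPaths"]
    return {
        "count": len(sub_paths),
        "size": sum(p["size"] for p in sub_paths),
        "sizeWithReplica": sum(p["sizeWithReplica"] for p in sub_paths),
        "byReplicationType": dict(Counter(p["replicationType"] for p in sub_paths)),
    }


def is_consistent(response):
    """True when the top-level totals match the per-key sums."""
    s = summarize(response)
    return (s["count"] == response["subPathCount"]
            and s["size"] == response["size"]
            and s["sizeWithReplica"] == response["sizeWithReplica"])


response = json.loads(RESPONSE_TEXT)
summary = summarize(response)
print(summary)
print("consistent:", is_consistent(response))
```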

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10634

How was this patch tested?

Added JUnit test cases covering various assertions.

devmadhuu avatar Apr 09 '24 16:04 devmadhuu

> 1. replicationType - RATIS
> 2. creationTime - empty string and filter will not be applied, so list out keys irrespective of age, else list out keys which got created on or after provided creationTime
> 3. keySize - 0 bytes, which means all keys greater than zero bytes will be listed, effectively all.
> 4. startPrefix - /

I think the default should cover all replication types. This is consistent with the key size and creation time filters, which have no effect if no value is provided.

> 1. count - 1000 ... This API will implement pagination support using count params.

  • How can we make sure the user knows the result is truncated and that the prefix holds more than 1000 keys?
  • How is pagination implemented to tell the server where the next 1000 keys should start?

errose28 avatar Apr 09 '24 18:04 errose28

> > 1. replicationType - RATIS
> > 2. creationTime - empty string and filter will not be applied, so list out keys irrespective of age, else list out keys which got created on or after provided creationTime
> > 3. keySize - 0 bytes, which means all keys greater than zero bytes will be listed, effectively all.
> > 4. startPrefix - /
>
> I think the default should cover all replication types. This is consistent with the key size and creation time filters, which have no effect if no value is provided.
>
> > 1. count - 1000 ... This API will implement pagination support using count params.
>
>   • How can we make sure the user knows the result is truncated and that the prefix holds more than 1000 keys?
>   • How is pagination implemented to tell the server where the next 1000 keys should start?

  1. Agreed, the replicationType default can be empty, so effectively all keys are listed.
  2. I am thinking of returning the total count of keys matching the provided filters in the response, which tells the user how many keys are under the prefix.
  3. The solution is not decided yet, but I was thinking of adding one more param: offset. For example, if the client provides offset 0 and count 10, the server returns the first 10 records. If the client provides offset 100 and count 50, the server skips the first 100 records and returns the next 50.

devmadhuu avatar Apr 10 '24 02:04 devmadhuu

I think passing the last key of the previous page as the start key might make it easier to resume pagination than adding a count-based offset.
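
To illustrate the trade-off between the two pagination styles discussed here, the following is a minimal in-memory sketch, not code from this PR. It assumes keys are returned in sorted order; the function names `page_by_offset` and `page_after_key` are hypothetical, and the key names are borrowed from the OBS example.

```python
from bisect import bisect_right


def page_by_offset(sorted_keys, offset, count):
    """Offset style: skip `offset` records, then return `count` records."""
    return sorted_keys[offset:offset + count]


def page_after_key(sorted_keys, last_key, count):
    """Start-key style: return `count` records strictly after `last_key`."""
    start = 0 if last_key is None else bisect_right(sorted_keys, last_key)
    return sorted_keys[start:start + count]


keys = ["key1", "key1/key2", "key1/key2/key3", "key4", "key5", "key6", "key7"]

# First page of three keys is the same for both styles.
first_page = page_after_key(keys, None, 3)

# A new key is created before the client asks for the second page.
keys_after_insert = sorted(keys + ["key0"])

# Offset style: the inserted key shifts everything, so one key repeats.
second_by_offset = page_by_offset(keys_after_insert, 3, 3)

# Start-key style: resumes right after the last key the client saw.
second_by_key = page_after_key(keys_after_insert, first_page[-1], 3)

print(first_page)
print(second_by_offset)
print(second_by_key)
```

In this sketch the offset-based second page starts with `key1/key2/key3` again, while the start-key page continues with `key4`, because the resume point does not depend on how many records precede it.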

errose28 avatar Apr 11 '24 20:04 errose28

@dombizita @ArafatKhan2198 @sodonnel @sumitagrawl Pls review.

devmadhuu avatar Apr 17 '24 09:04 devmadhuu

Since the /listKeys endpoint is one of the NSSummaryEndpoint endpoints, can we introduce an integration test for it as well? We currently do not have one, and in our discussions we had decided to add one in the future. For now, we can test just the listKeys feature: various bucket types, various hierarchies, and the pagination part as well. The other methods of NSSummaryEndpoint could be covered in subsequent Jiras.

ArafatKhan2198 avatar Apr 28 '24 20:04 ArafatKhan2198

> Since the /listKeys endpoint is one of the NSSummaryEndpoint endpoints, can we introduce an integration test for it as well? We currently do not have one, and in our discussions we had decided to add one in the future. For now, we can test just the listKeys feature: various bucket types, various hierarchies, and the pagination part as well. The other methods of NSSummaryEndpoint could be covered in subsequent Jiras.

OK, sure, I will add an integration test.

devmadhuu avatar Apr 29 '24 14:04 devmadhuu