aws-sdk-js icon indicating copy to clipboard operation
aws-sdk-js copied to clipboard

Query JSON content from S3 not working with where clause

Open IIslam opened this issue 2 years ago • 4 comments

Describe the bug

I'm trying to query some json content from s3 bucket it works fine when I select * from the bucket but once I specify LIMIT id do nothing returns whole data and when specify WHERE it returns empty response.

  • data [ { "id": 1, "name": "abcd", "dataIds": [ 2, 1, 2 ], "attributes": [ "name", "picUrl", "email", "linkedinUrl", "job" ] }, { "id": 2, "name": "zxy", "dataIds": [ 2, 1, 1 ] } ]

  • aws s3api select-object-content --bucket e-commerce-uploads --key Content.json --expression "select * from s3object data where data.id = 1" --expression-type 'SQL' --input-serialization '{"JSON": {"Type":"DOCUMENT"}, "CompressionType": "NONE"}' --output-serialization '{"JSON": {}}' "json_output.json"

Expected Behavior

When the query is like

SELECT * from s3object data where data.id = 1 I should get

{ "id": 1, "name": "abcd", "dataIds": [ 2, 1, 2 ], "attributes": [ "name", "picUrl", "email", "linkedinUrl", "job" ] }

or even when specify limit like this

select * from s3object limit 1 I should get one object

{ "id": 1, "name": "abcd", "dataIds": [ 2, 1, 2 ], "attributes": [ "name", "picUrl", "email", "linkedinUrl", "job" ] }

Current Behavior

returns empty file.

Reproduction Steps

aws s3api select-object-content --bucket e-commerce-uploads --key Content.json --expression "select * from s3object data where data.id = 1" --expression-type 'SQL' --input-serialization '{"JSON": {"Type":"DOCUMENT"}, "CompressionType": "NONE"}' --output-serialization '{"JSON": {}}' "json_output.json"

aws s3api select-object-content --bucket e-commerce-uploads --key Content.json --expression "select * from s3object data LIMIT 1" --expression-type 'SQL' --input-serialization '{"JSON": {"Type":"DOCUMENT"}, "CompressionType": "NONE"}' --output-serialization '{"JSON": {}}' "json_output.json"

** DB.js**

import { S3 } from '@aws-sdk/client-s3';

class S3DB {
    constructor(s3Region) {
        this.region = s3Region;
        this.client = new S3({ region: s3Region });
    }
    query(queryString, bucketName, fileName) {
        console.log('query string', queryString);
        return new Promise((resolve, reject) => {
            this.client.selectObjectContent({
                Bucket: bucketName,
                Key: fileName,
                ExpressionType: 'SQL',
                Expression: queryString,
                InputSerialization: {
                    JSON: {
                        Type: "DOCUMENT"
                    },
                    CompressionType: 'NONE'
                },
                OutputSerialization: {
                    JSON: {
                        RecordDelimiter: ' '
                    }
                }
            }).then(async ({ $metadata, Payload }) => {
                console.log("meta data===>", $metadata);
                console.log("Payload", Payload);
                let array = [];
                for await (const data of Payload) {
                    console.log(data);
                    if (data.Records)
                        array.push(Buffer.from(data.Records.Payload).toString('utf8'));
                }
                console.log('result array', array);
            });

        });

    }
}
export default S3DB;

handler

const bucketName = "e-commerce-uploads";
const fileName = "Content.json";
const region = 'eu-west-1';

  const db = new DB(region);
  db.query(`SELECT * FROM S3Object[*] AS data WHERE id = 1`, bucketName, fileName)
    .then(data => {
      console.log('query data', data);
    }).catch(err => {
      console.log('query error', err);
    });

Possible Solution

No response

Additional Information/Context

No response

SDK version used

"@aws-sdk/client-s3": "^3.110.0" aws-cli/2.7.3 Python/3.9.11 Windows/10 exe/AMD64 prompt/off

Environment details (OS name and version, etc.)

Windows 10

IIslam avatar Jun 16 '22 08:06 IIslam

@IIslam thanks for opening this issue, I see you are using V3 of the SDK, can you share what error you get when you run selectobjectcontent operation.

ajredniwja avatar Jun 17 '22 13:06 ajredniwja

Hi @ajredniwja I didn't get any errors but I get no data when I add the where clause

IIslam avatar Jun 17 '22 19:06 IIslam

Hi @ajredniwja I'm waiting for your reply.

IIslam avatar Jun 27 '22 08:06 IIslam