
Return Postgres Arrays as Type 'arrayValue'

Open ormu5 opened this issue 3 years ago • 6 comments

I know there's some other discussion regarding Postgres arrays but this seemed to warrant a separate ticket.

Currently if I retrieve a column from Postgres of type _text (text array), the returned metadata for the column using local-data-api looks like:

{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'container_type', 'name': 'container_type', 'nullable': 0, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'required_attributes', 'type': 12, 'typeName': 'text'}

Notice 'typeName' returning text instead of _text. When deployed on AWS, Data API returns for the same column:

{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'attributes', 'name': 'attributes', 'nullable': 1, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'required_attributes', 'type': 2003, 'typeName': '_text'}

Notice 'typeName' indeed reflects the Postgres type _text.

This behavior continues with respect to the payload returned from the database, where local-data-api will return the value of the column as

{'stringValue': '{Volume}'}

where 'Volume' is the single item in the Postgres array. Meanwhile on AWS, Data API will return the following as value for the column:

{'arrayValue': {'stringValues': ['Volume']}}

Data API's awareness of Postgres arrays seems to be better during read than during write!

I'm using:

  • boto3
  • Postgres 10.x
  • Python 3.7

In code terms, the difference is captured in the two functions I maintain for local versus deployed Data API:

from typing import Dict, List

# Local Data API emulation currently returns arrays from the database differently than
# AWS Data API, so let's establish at instantiation which function is doing our parsing.
def parse_array_local(field: Dict) -> List[str]:
    """When running locally, parse the string representation of the array/set."""
    # Local emulation returns e.g. {'stringValue': '{Volume}'}; strip the
    # braces and split on commas to recover the elements as a list.
    return next(iter(field.values())).lstrip('{').rstrip('}').split(',')


def parse_array_deployed(field: Dict) -> List[str]:
    """When running on AWS, return the nicely formatted - albeit nested - list."""
    # AWS Data API returns e.g. {'arrayValue': {'stringValues': ['Volume']}};
    # the inner dict's single value is already a proper list.
    return next(iter(field.values()))['stringValues']
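A hypothetical dispatch at instantiation might then look like the following sketch (the STAGE environment variable is an assumed convention, not part of either API):

import os

# Choose the parser once at startup, based on an assumed STAGE variable that
# distinguishes local emulation from a deployed run.
if os.getenv('STAGE') == 'local':
    parse_array = parse_array_local
else:
    parse_array = parse_array_deployed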

Thank you, again. -jeff

ormu5 avatar Apr 21 '21 22:04 ormu5

@jeffster23 Thank you for explaining the array behavior. I investigated arrays in Data API yesterday. It's difficult to insert array values via aws-sdk or boto3, so I stopped. :( How did you insert the data into Postgres?

I can implement returning array values. If the feature is worth it, I will do it.

But inserting array values is too hard. Data API casts some data in an array... For example, 123 is int4, while 12356789012345 is int8.

JDBC requires the DB type when creating an array object, so I have to implement logic to detect the DB type for array values. I don't know the best way 🤔
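For illustration, the kind of detection needed might look like this rough Python sketch (hypothetical; local-data-api itself would need the equivalent on the JDBC side):

def detect_pg_array_element_type(values):
    """Rough sketch: guess the Postgres element type for an array value.
    int4 covers -2147483648..2147483647; anything wider must be int8."""
    first = values[0]
    if isinstance(first, bool):  # bool is a subclass of int, so check it first
        return 'bool'
    if isinstance(first, int):
        fits_int4 = all(-2147483648 <= v <= 2147483647 for v in values)
        return 'int4' if fits_int4 else 'int8'
    if isinstance(first, float):
        return 'float8'
    return 'text'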

koxudaxi avatar Apr 22 '21 02:04 koxudaxi

From what I've seen, I'm not sure it makes sense to implement the insert functionality for arrays in local-data-api since Data API does not itself support this, to my knowledge. Providing more functionality than the actual AWS Data API would be as bad - actually, maybe worse! - as not providing as much.

For reading arrays, though, Data API seems to have more awareness, and it would make sense to match it. The response payload structure currently differs between local-data-api and AWS Data API for Postgres arrays; in Postgres, a leading '_' in the type name indicates an array type (e.g., _text, _int4, etc.).
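As a minimal sketch, a client could detect array columns from the metadata shown above (type code 2003 is java.sql.Types.ARRAY):

def is_array_column(column_metadata):
    """Detect a Postgres array column from Data API column metadata.
    2003 is java.sql.Types.ARRAY; '_' is Postgres's array-type name prefix."""
    return column_metadata['type'] == 2003 or column_metadata['typeName'].startswith('_')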

Since you are likely to get inquiries related to inserting arrays, though, you could always include some guidance in your documentation, even if local-data-api itself does not address it. As for me, I accomplish this insertion (with both local-data-api and AWS Data API) by constructing SQL as, e.g.:

insert into cfg.attributes(id, type, attributes)
values(:id, :type, cast(:attributes as text[]));

where the value for the :attributes parameter is a string properly formatted as a Postgres array, e.g., '{"Volume","Material"}'.
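For completeness, a hedged sketch of how that call might look with boto3 (the ARNs, database name, and values here are placeholders):

import boto3

client = boto3.client('rds-data')

# Pass the array as a Postgres-formatted string; the cast(:attributes as text[])
# in the SQL performs the actual conversion on the database side.
client.execute_statement(
    resourceArn='arn:aws:rds:...:cluster:my-cluster',         # placeholder
    secretArn='arn:aws:secretsmanager:...:secret:my-secret',  # placeholder
    database='mydb',                                          # placeholder
    sql='insert into cfg.attributes(id, type, attributes) '
        'values(:id, :type, cast(:attributes as text[]))',
    parameters=[
        {'name': 'id', 'value': {'longValue': 1}},
        {'name': 'type', 'value': {'stringValue': 'container'}},
        {'name': 'attributes', 'value': {'stringValue': '{"Volume","Material"}'}},
    ],
)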

ormu5 avatar Apr 22 '21 02:04 ormu5

@jeffster23

> From what I've seen, I'm not sure it makes sense to implement the insert functionality for arrays in local-data-api since Data API does not itself support this, to my knowledge. Providing more functionality than the actual AWS Data API would be as bad - actually, maybe worse! - as not providing as much.

I agree.

> Since you are likely to get inquiries related to inserting arrays, though, you could always include some guidance in your documentation, even if local-data-api itself does not address it. As for me, I accomplish this insertion (with both local-data-api and AWS Data API) by constructing SQL as, e.g.:

It's a great suggestion. The first milestone is to support arrays in return values, and I will copy your sample into the documentation. As a second milestone, if Data API ever supports inserting array values, then local-data-api will support it too.

Thank you very much 😄

koxudaxi avatar Apr 22 '21 02:04 koxudaxi

@ormu5 I have released a new version as 0.6.10. Would you please check it?

koxudaxi avatar Apr 24 '21 15:04 koxudaxi

Busy week this week; happy to look at this soon, though.

ormu5 avatar Apr 28 '21 13:04 ormu5

This looks good! I tried with text/int/float arrays in Postgres and received the following correct column metadata:

'columnMetadata': [
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'name', 'name': 'name', 'nullable': 0, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 12, 'typeName': 'text'}, 
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'attributes', 'name': 'attributes', 'nullable': 1, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_text'}, 
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': False, 'isCurrency': False, 'isSigned': True, 'label': 'testing_int_array', 'name': 'testing_int_array', 'nullable': 1, 'precision': 10, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_int4'}, 
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': False, 'isCurrency': False, 'isSigned': True, 'label': 'testing_float_array', 'name': 'testing_float_array', 'nullable': 1, 'precision': 17, 'scale': 17, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_float8'}
]

Note 'typeName' carrying the '_' prefix notation.

And the corresponding payload returned looks good, too; a couple of my test records:

[{'stringValue': 'Thing'}, {'arrayValue': {'stringValues': ['Attribute']}}, {'arrayValue': {'longValues': [1, 3]}}, {'arrayValue': {'doubleValues': [1.231, 3.231]}}], 
[{'stringValue': 'Another Thing'}, {'arrayValue': {'stringValues': ['Attribute']}}, {'arrayValue': {'longValues': [1, 3]}}, {'arrayValue': {'doubleValues': [1.231, 3.231]}}]

Note that the arrays are correctly wrapped in 'arrayValue' objects, each of whose nested objects correctly uses the Data API type as its key and the properly parsed values as a list.
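With this shape, both scalar and array fields can be handled by one parser; a minimal sketch keyed off the field dicts above:

def parse_field(field):
    """Return a plain Python value for a Data API field dict. Arrays arrive
    as {'arrayValue': {'stringValues': [...]}} (or longValues, doubleValues,
    etc.); scalars as e.g. {'stringValue': ...}."""
    if 'arrayValue' in field:
        return next(iter(field['arrayValue'].values()))
    return next(iter(field.values()))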

ormu5 avatar May 02 '21 19:05 ormu5