local-data-api
Return Postgres Arrays as Type 'arrayValue'
I know there's some other discussion regarding Postgres arrays, but this seemed to warrant a separate ticket.
Currently, if I retrieve a column from Postgres of type _text (text array), the returned metadata for the column using local-data-api looks like:
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'container_type', 'name': 'container_type', 'nullable': 0, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'required_attributes', 'type': 12, 'typeName': 'text'}
Notice 'typeName' returning text instead of _text. When deployed on AWS, Data API returns the following for the same column:
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'attributes', 'name': 'attributes', 'nullable': 1, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'required_attributes', 'type': 2003, 'typeName': '_text'}
Notice 'typeName' indeed reflects the Postgres type _text.
This behavior continues with respect to the payload returned from the database, where local-data-api will return the value of the column as
{'stringValue': '{Volume}'}
where 'Volume' is the single item in the Postgres array. Meanwhile, on AWS, Data API will return the following as the value for the column:
{'arrayValue': {'stringValues': ['Volume']}}
Data API's awareness of Postgres arrays seems to be better during read than during write!
I'm using:
- boto3
- Postgres 10.x
- Python 3.7
In code terms, the difference is captured in the two functions I maintain for local versus deployed Data API:
from typing import Dict

# Local Data API emulation currently returns arrays from the database differently than
# AWS Data API, so let's establish at instantiation which function is doing our parsing.
def parse_array_local(field: Dict):
    """When running locally, parse the string representation of the array/set."""
    return next(iter(field.values())).lstrip('{').rstrip('}').split(',')

def parse_array_deployed(field: Dict):
    """When running on AWS, return the nicely formatted - albeit nested - list."""
    return next(iter(field.values()))['stringValues']
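For context, a minimal sketch of the dispatch I do at instantiation (the is_local flag here is hypothetical, standing in for however you distinguish local from deployed):

# Hypothetical dispatch: pick the parser once, based on where we're running.
is_local = True  # assumption: derived from your own config/env detection

parse_array = parse_array_local if is_local else parse_array_deployed

# local-data-api shape:              {'stringValue': '{Volume}'}
# AWS Data API shape:  {'arrayValue': {'stringValues': ['Volume']}}
print(parse_array({'stringValue': '{Volume}'}))  # -> ['Volume']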
Thank you, again. -jeff
@jeffster23 Thank you for explaining the arrays. I investigated arrays in Data API yesterday. It's difficult to insert array values via aws-sdk or boto3, so I stopped. :( How did you insert the data into Postgres?
I can implement returning array values. If the feature is worth it, I will do it.
But inserting array values is too hard. Data API casts some data in an array... For example, 123 is int4 while 12356789012345 is int8.
JDBC requires the DB type when creating an array object, so I would have to implement logic to detect the DB type for array values. I don't know the best way 🤔
From what I've seen, I'm not sure it makes sense to implement the insert functionality for arrays in local-data-api since Data API does not itself support this, to my knowledge. Providing more functionality than the actual AWS Data API would be as bad - actually, maybe worse! - as not providing as much.
For reading arrays, though, Data API seems to have more awareness, and it would make sense to match it, especially since the response payload structure currently differs between local-data-api and AWS Data API for Postgres arrays. A first character of '_' in a Postgres type name indicates an array (e.g., _text, _int4, etc.), as sketched below.
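To make the '_' convention concrete, a minimal sketch (my own naming, not local-data-api internals) of detecting array types and mapping their base type to the Data API value key, based on the payloads observed in this thread:

def is_pg_array(type_name: str) -> bool:
    """Postgres prefixes array type names with '_' (e.g., '_text', '_int4')."""
    return type_name.startswith('_')

# Assumed mapping from base Postgres type to the Data API arrayValue key:
ARRAY_VALUE_KEYS = {
    'text': 'stringValues',
    'int4': 'longValues',
    'int8': 'longValues',
    'float8': 'doubleValues',
}

print(is_pg_array('_text'))  # -> True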
Since you are likely to get inquiries related to inserting arrays, though, you could always include some guidance in your documentation, even if local-data-api itself does not address it. As for me, I accomplish this insertion (with both local-data-api and AWS Data API) by constructing SQL as, e.g.:
insert into cfg.attributes(id, type, attributes)
values(:id, :type, cast(:attributes as text[]));
where the value for the attributes parameter is a string properly formatted as a Postgres array, e.g., '{"Volume","Material"}'.
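In boto3 terms, that call looks roughly like the sketch below (the ARNs, database name, and parameter values are placeholders, not real resources):

import boto3

client = boto3.client('rds-data')
client.execute_statement(
    resourceArn='arn:aws:rds:us-east-1:123456789012:cluster:my-cluster',  # placeholder
    secretArn='arn:aws:secretsmanager:us-east-1:123456789012:secret:my-secret',  # placeholder
    database='mydb',  # placeholder
    sql='insert into cfg.attributes(id, type, attributes) '
        'values(:id, :type, cast(:attributes as text[]))',
    parameters=[
        {'name': 'id', 'value': {'longValue': 1}},
        {'name': 'type', 'value': {'stringValue': 'container'}},
        # String formatted as a Postgres array literal, cast server-side:
        {'name': 'attributes', 'value': {'stringValue': '{"Volume","Material"}'}},
    ],
)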
@jeffster23
From what I've seen, I'm not sure it makes sense to implement the insert functionality for arrays in local-data-api since Data API does not itself support this, to my knowledge. Providing more functionality than the actual AWS Data API would be as bad - actually, maybe worse! - as not providing as much.
I agree.
Since you are likely to get inquiries related to inserting arrays, though, you could always include some guidance in your documentation, even if local-data-api itself does not address it. As for me, I accomplish this insertion (with both local-data-api and AWS Data API) by constructing SQL as, e.g.:
It's a great suggestion. The first milestone is to support arrays in the return value, and I will copy your sample into the documentation. Also, if Data API supports inserting array values in the future, then local-data-api will support it as the second milestone.
Thank you very much 😄
@ormu5
I have released a new version as 0.6.10. Would you please check it?
Busy week this week, happy to look at this soon, though.
This looks good! I tried with text/int/float arrays in Postgres and received the following correct column metadata:
'columnMetadata': [
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'name', 'name': 'name', 'nullable': 0, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 12, 'typeName': 'text'},
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': True, 'isCurrency': False, 'isSigned': False, 'label': 'attributes', 'name': 'attributes', 'nullable': 1, 'precision': 2147483647, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_text'},
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': False, 'isCurrency': False, 'isSigned': True, 'label': 'testing_int_array', 'name': 'testing_int_array', 'nullable': 1, 'precision': 10, 'scale': 0, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_int4'},
{'arrayBaseColumnType': 0, 'isAutoIncrement': False, 'isCaseSensitive': False, 'isCurrency': False, 'isSigned': True, 'label': 'testing_float_array', 'name': 'testing_float_array', 'nullable': 1, 'precision': 17, 'scale': 17, 'schemaName': '', 'tableName': 'attributes', 'type': 2003, 'typeName': '_float8'}
]
Note 'typeName' having the '_' array notation (e.g., '_text', '_int4', '_float8').
And the corresponding payload returned looks good, too, a couple of my test records being:
[{'stringValue': 'Thing'}, {'arrayValue': {'stringValues': ['Attribute']}}, {'arrayValue': {'longValues': [1, 3]}}, {'arrayValue': {'doubleValues': [1.231, 3.231]}}],
[{'stringValue': 'Another Thing'}, {'arrayValue': {'stringValues': ['Attribute']}}, {'arrayValue': {'longValues': [1, 3]}}, {'arrayValue': {'doubleValues': [1.231, 3.231]}}]
Note the arrays correctly wrapped in an 'arrayValue' object, whose nested object correctly uses the Data API type as the key and the properly parsed values as a list.
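For anyone consuming these records, a minimal sketch (my own helper, not part of local-data-api) that unwraps both scalar and array fields from the shapes shown above:

from typing import Any, Dict

def unwrap_field(field: Dict[str, Any]) -> Any:
    """Return the plain Python value for a single Data API field dict."""
    key, value = next(iter(field.items()))
    if key == 'arrayValue':
        # e.g., {'arrayValue': {'stringValues': ['Attribute']}} -> ['Attribute']
        return next(iter(value.values()))
    return value  # e.g., {'stringValue': 'Thing'} -> 'Thing'

record = [{'stringValue': 'Thing'}, {'arrayValue': {'longValues': [1, 3]}}]
print([unwrap_field(f) for f in record])  # -> ['Thing', [1, 3]]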