Create AWS Glue table from table JSON
Question
In the AWS web console, you can create an AWS Glue table and view its details. After the table has been created, the console displays generated JSON that represents the table.
AWS Glue JSON Example:
[
    {
        "Name": "animal_id",
        "Type": "bigint",
        "Parameters": {
            "iceberg.field.current": "true",
            "iceberg.field.id": "1",
            "iceberg.field.optional": "true"
        }
    },
    {
        "Name": "animal_name",
        "Type": "string",
        "Parameters": {
            "iceberg.field.current": "true",
            "iceberg.field.id": "2",
            "iceberg.field.optional": "true"
        }
    }
]
Can I use this JSON to re-create pyiceberg objects in a Python script? Basically, parsing the JSON above would result in the following:
from pyiceberg.schema import Schema
from pyiceberg.types import LongType, NestedField, StringType

# Glue "bigint" is a 64-bit integer, which corresponds to Iceberg's LongType.
# "iceberg.field.optional": "true" translates to required=False.
schema = Schema(
    NestedField(field_id=1, name="animal_id", field_type=LongType(), required=False),
    NestedField(field_id=2, name="animal_name", field_type=StringType(), required=False),
)
Is it possible to use such JSON to create an Iceberg schema in AWS? That would make the table definition easy to replicate.
You can certainly write a Python function that translates the JSON into an Iceberg Schema. You might need to hard-code some of the type mapping, though. For example, Iceberg has no type named bigint; its 64-bit integer type is long (LongType in pyiceberg).
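A minimal sketch of such a translation function, under a few assumptions: the mapping table covers only common Glue primitive types (decimal and nested types are not handled), a missing "iceberg.field.id" falls back to the column's position, and the names GLUE_TO_ICEBERG, FieldSpec, parse_glue_columns and to_iceberg_schema are made up for illustration and are not part of pyiceberg. Parsing is kept separate from schema construction so it can be checked without pyiceberg installed.

```python
import json
from dataclasses import dataclass

# Assumed mapping from Glue primitive type names to class names in
# pyiceberg.types. Hand-written for this sketch; extend as needed.
GLUE_TO_ICEBERG = {
    "boolean": "BooleanType",
    "int": "IntegerType",
    "bigint": "LongType",
    "float": "FloatType",
    "double": "DoubleType",
    "string": "StringType",
    "date": "DateType",
    "timestamp": "TimestampType",
    "binary": "BinaryType",
}


@dataclass(frozen=True)
class FieldSpec:
    """Plain-Python description of one Iceberg field (hypothetical helper)."""
    field_id: int
    name: str
    iceberg_type: str  # name of a class in pyiceberg.types
    required: bool


def parse_glue_columns(text):
    """Parse the Glue column JSON into a list of FieldSpec objects."""
    specs = []
    for position, column in enumerate(json.loads(text), start=1):
        params = column.get("Parameters", {})
        glue_type = column["Type"].lower()
        if glue_type not in GLUE_TO_ICEBERG:
            raise ValueError(f"unsupported Glue type: {column['Type']}")
        specs.append(FieldSpec(
            # Assumption: fall back to the column position if no id is stored.
            field_id=int(params.get("iceberg.field.id", position)),
            name=column["Name"],
            iceberg_type=GLUE_TO_ICEBERG[glue_type],
            # "optional": "true" means the field is not required.
            required=params.get("iceberg.field.optional", "true").lower() != "true",
        ))
    return specs


def to_iceberg_schema(specs):
    """Build a pyiceberg Schema from FieldSpec objects (needs pyiceberg)."""
    from pyiceberg import types
    from pyiceberg.schema import Schema

    return Schema(*(
        types.NestedField(
            field_id=s.field_id,
            name=s.name,
            field_type=getattr(types, s.iceberg_type)(),
            required=s.required,
        )
        for s in specs
    ))


GLUE_JSON = """
[
  {"Name": "animal_id", "Type": "bigint",
   "Parameters": {"iceberg.field.id": "1", "iceberg.field.optional": "true"}},
  {"Name": "animal_name", "Type": "string",
   "Parameters": {"iceberg.field.id": "2", "iceberg.field.optional": "true"}}
]
"""

specs = parse_glue_columns(GLUE_JSON)
try:
    print(to_iceberg_schema(specs))
except ImportError:
    print("pyiceberg is not installed; parsed fields:", specs)
```

Raising on an unknown Glue type, rather than guessing, keeps the hard-coded mapping honest: anything not listed has to be added deliberately.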
Another option is to translate the JSON to a pyarrow schema and use pyarrow_to_schema to create the Iceberg Schema.
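The pyarrow route could look roughly like the sketch below. It assumes that pyarrow_to_schema (from pyiceberg.io.pyarrow) picks up field IDs from the "PARQUET:field_id" key in each pyarrow field's metadata; this is worth verifying against the installed pyiceberg version. The mapping table and the helper names arrow_field_args and glue_json_to_iceberg_schema are hypothetical, and only a handful of primitive types are covered.

```python
import json

# Assumed mapping from Glue primitive type names to the names of pyarrow
# type factory functions (pa.int64, pa.string, ...). Illustrative only.
GLUE_TO_ARROW = {
    "boolean": "bool_",
    "int": "int32",
    "bigint": "int64",
    "float": "float32",
    "double": "float64",
    "string": "string",
    "date": "date32",
    "binary": "binary",
}


def arrow_field_args(column):
    """Return (name, pyarrow factory name, nullable, metadata) for one Glue column."""
    params = column.get("Parameters", {})
    nullable = params.get("iceberg.field.optional", "true").lower() == "true"
    # Assumption: pyiceberg reads the Iceberg field id from this metadata key.
    metadata = {"PARQUET:field_id": params["iceberg.field.id"]}
    return column["Name"], GLUE_TO_ARROW[column["Type"].lower()], nullable, metadata


def glue_json_to_iceberg_schema(text):
    """Glue column JSON -> pyarrow schema -> Iceberg Schema (needs both libraries)."""
    import pyarrow as pa
    from pyiceberg.io.pyarrow import pyarrow_to_schema

    fields = []
    for column in json.loads(text):
        name, factory, nullable, metadata = arrow_field_args(column)
        arrow_type = getattr(pa, factory)()
        fields.append(pa.field(name, arrow_type, nullable=nullable, metadata=metadata))
    return pyarrow_to_schema(pa.schema(fields))


example_column = {
    "Name": "animal_id",
    "Type": "bigint",
    "Parameters": {"iceberg.field.id": "1", "iceberg.field.optional": "true"},
}
example_args = arrow_field_args(example_column)
```

Compared with a direct mapping, this leaves the pyarrow-to-Iceberg type conversion to pyiceberg, at the cost of still needing a Glue-to-pyarrow table.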
@kevinjqliu would you mind if such functionality became part of pyiceberg? I think it would be handy if we could put table schema JSON into a config file and simply load it when needed.
What would that look like? I'm not too familiar with "AWS Glue JSON". It would be handy, but I'm not sure where we would put such functionality in the repo. It's possible that this is already implemented in catalog/glue.py, since we already do some translation between Glue and Iceberg schemas there.
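One possible shape for the config-file idea, as a sketch only: a single JSON file keyed by table name, where each entry holds the column list exactly as the Glue console shows it. The file layout (a top-level "tables" object), the file name schemas.json and the function load_table_columns are all hypothetical and do not exist in pyiceberg.

```python
import json
import tempfile
from pathlib import Path


def load_table_columns(config_path, table_name):
    """Return the Glue column list stored for table_name in the config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    try:
        return config["tables"][table_name]
    except KeyError:
        raise KeyError(f"table {table_name!r} not found in {config_path}") from None


# Hypothetical config layout: table name -> Glue column JSON.
config = {
    "tables": {
        "animals": [
            {"Name": "animal_id", "Type": "bigint",
             "Parameters": {"iceberg.field.id": "1", "iceberg.field.optional": "true"}},
            {"Name": "animal_name", "Type": "string",
             "Parameters": {"iceberg.field.id": "2", "iceberg.field.optional": "true"}},
        ]
    }
}

# Write the config to a temporary file and read one table back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "schemas.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    columns = load_table_columns(path, "animals")

print([column["Name"] for column in columns])
```

The returned column list could then be handed to whatever Glue-to-Iceberg translation is used to build the schema.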
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'