Issue performing geometry operation in GeoJSON file "Error: Referenced column "geometry" not found in FROM clause"
Hi,
I am exploring Duckdb to work with spatial files.
I am trying to import GeoJSON files and perform ST_UNION. I can read the files without any issue, but when performing ST_UNION operation, it result in the following error:
An error occurred while performing ST_UNION: Binder Error: Referenced column "geometry" not found in FROM clause! Candidate bindings: "geo_data_0.geom", "geo_data_0.Reference_Geography", "geo_data_0.ObjectId" LINE 5: SELECT geometry FROM geo_data_0
- Can someone help me explain the issue?
- Looks like there is no compatibility in reading feature class of ESRI Geodatabase. Please confirm.
FYI: Somebody raised similar error a year ago. However, I have no idea whether it is updated or not. Please kindly share your comments
Hello!
-
What is the actual SQL query you're trying to run? As the error message explains, you're trying to pass a column named "geometry" to the ST_Union function, but there is no column with that name. Supposedly you got a table with a column named "geom" when you imported the geojson file.
-
Im not too familiar with ESRI Geodatabases, but what have you tried? I assume you've tried to pass a gdb file to
st_read? Can you share the GDB file? -
If there are other related issues it's generally a good idea to link them so that we know what you're talking about.
Below is the code I used.
- Yes, there is a geometry column which I can see.
- It is one of the geodatabase that I have downloaded and it opens correctly. What I encountered was error with respect to path. I am trying to fix this. If I have issues, I will share my code.
Thank you for your kind time.
` import duckdb import requests
geojson_urls = [ 'https://services-eu1.arcgis.com/BuS9rtTsYEV5C0xh/arcgis/rest/services/CensusHub2022_T1_2_LEA/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson', 'https://services-eu1.arcgis.com/BuS9rtTsYEV5C0xh/ArcGIS/rest/services/grid_population_2022_view/FeatureServer/0/query?outFields=&where=1%3D1&f=geojson' # Replace with your second URL ]
con = duckdb.connect()
con.execute("INSTALL spatial;") con.execute("LOAD spatial;")
for index, geojson_url in enumerate(geojson_urls):
response = requests.get(geojson_url)
response.raise_for_status() # Raise an error for bad responses
if 'application/json' not in response.headers.get('Content-Type', ''):
raise ValueError(f"The URL {geojson_url} did not return a valid GeoJSON file.")
local_geojson_path = f'data_{index}.geojson'
with open(local_geojson_path, 'w', encoding='utf-8') as f:
f.write(response.text)
try:
query = f'''
CREATE TABLE geo_data_{index} AS SELECT * FROM ST_Read('{local_geojson_path}');
'''
con.execute(query)
except Exception as e:
print(f"An error occurred while reading the GeoJSON file from {local_geojson_path}: {e}")
try: union_query = ''' CREATE TABLE geo_data_union AS SELECT ST_Union(geometry) AS geometry FROM ( SELECT geometry FROM geo_data_0 UNION ALL SELECT geometry FROM geo_data_1 ); '''
con.execute(union_query)
union_result = con.execute("SELECT * FROM geo_data_union").fetchdf()
print("Union of data_0 and data_1:")
print(union_result)
except Exception as e: print(f"An error occurred while performing ST_UNION: {e}")
con.close()`
Geometry is the name of the type not the column. The geometry column you get from ST_Read is called "geom", not "geometry"
Thank you for your reply.
I have issue using that url of GeoJSON. I have downloaded two GeoJSON files for a country and tried to do the same operations. It works and creates output.
Columns in GeoJSON_data1: [('OBJECTID', 'INTEGER', 'YES', None, None, None), ('FCsubtype', 'INTEGER', 'YES', None, None, None), ('geom', 'GEOMETRY', 'YES', None, None, None)]
`import duckdb
Connect to DuckDB
conn = duckdb.connect()
Install and load the spatial extension
conn.execute("INSTALL spatial;") conn.execute("LOAD spatial;")
Load the first GeoJSON file
conn.execute(""" CREATE TABLE geo_data1 AS SELECT * FROM ST_Read('M:\R\DuckDb\gEOjson\Built_Up_Areas.geojson') """)
Load the second GeoJSON file
conn.execute(""" CREATE TABLE geo_data2 AS SELECT * FROM ST_Read('M:\R\DuckDb\gEOjson\Vegetation_Areas.geojson') """)
Check the column names in geo_data1
columns_geo_data1 = conn.execute("DESCRIBE geo_data1").fetchall() print("Columns in geo_data1:", columns_geo_data1)
Check the column names in geo_data2
columns_geo_data2 = conn.execute("DESCRIBE geo_data2").fetchall() print("Columns in geo_data2:", columns_geo_data2) for row in columns_geo_data2: print(row)
Example of performing a geometry operation: calculating the area of geometries in the second GeoJSON
area_result = conn.execute(""" SELECT ST_Area(geom) AS area FROM geo_data1 """).fetchall()
Print the area results
print("Areas of geometries in the second GeoJSON file:") for area in area_result: print(area)
Example of a spatial join or union between the two datasets
union_result = conn.execute(""" SELECT a.OBJECTID, b.OBJECTID, ST_Union(a.geom, b.geom) AS union_geom FROM geo_data1 AS a JOIN geo_data2 AS b ON ST_Intersects_Extent(a.geom, b.geom) """).fetchall()
Create a new table for the union results
conn.execute(""" CREATE TABLE union_table AS SELECT a.OBJECTID AS id_a, b.OBJECTID AS id_b, ST_AsText(ST_Union(a.geom, b.geom)) AS union_geom FROM geo_data1 AS a JOIN geo_data2 AS b ON ST_Intersects_Extent(a.geom, b.geom) """)
Export the union results to a CSV file
output_directory = 'M:\R\DuckDb\union_results.csv' conn.execute(f"COPY union_table TO '{output_directory}' (FORMAT CSV, HEADER)")
Close the connection
conn.close()
print(f"Union results exported to {output_directory}") `