initialization of GeoArrowPolygonLayer({id: 'geoarrow-polygons'}): geometryColumn not Polygon or MultiPolygon

Open bghjmn32 opened this issue 1 year ago • 1 comments

Hi,

I am trying to render a Parquet file using parquet-wasm. I already have code that manages to render a .feather file, and I think I only need a small change to adapt it to a .parquet file.

import React, { useState, useEffect } from 'react';
import { createRoot } from 'react-dom/client';  
import { DeckGL } from '@deck.gl/react';
import { GeoArrowPolygonLayer } from '@geoarrow/deck.gl-layers';
import { Map } from 'react-map-gl';
import mapboxgl from 'mapbox-gl';
import { tableFromIPC } from 'apache-arrow';
import initWasm, {readParquet} from 'parquet-wasm';

mapboxgl.accessToken = 'pk.eyJ1IjoiZ2FkaWhhcmlrcmlzaG5hIiwiYSI6ImNseHZuZDltajBuYjEya3NjNGNsazMwemIifQ.OMD0v9Ww2-4zeQS5lOH9DQ';

const GEOARROW_POLYGON_DATA = 'http://localhost:4000/parquet';

const App = () => {
  const [table, setTable] = useState(null);
  
  useEffect(() => {
    const fetchData = async () => {
      await initWasm();  // Initialize WebAssembly for parquet-wasm if necessary

      const response = await fetch(GEOARROW_POLYGON_DATA);
      if (!response.ok) {
        throw new Error(`Failed to fetch Feather file: ${response.statusText}`);
      }

      const buffer = await response.arrayBuffer();

      // const parquetUint8Array = new Uint8Array(buffer);
      // const arrowWasmTable = readParquet(parquetUint8Array);
      // const arrowTable = tableFromIPC(arrowWasmTable.intoIPCStream());

      const arrowTable = tableFromIPC(buffer);
      setTable(arrowTable);
    };
  // Data fetching operations when the component is mounted.

    fetchData().catch(console.error);
  }, []);

  const layers = [];
  if (table) {
    layers.push(
      new GeoArrowPolygonLayer({
        id: "geoarrow-polygons",
        stroked: true,
        filled: true,
        data: table,
        getFillColor: [0, 100, 60, 160],
        getLineColor: [255, 0, 0],
        lineWidthMinPixels: 1,
        extruded: false,
        wireframe: true,
        pickable: true,
        positionFormat: "XY",
        _normalize: false,
        autoHighlight: false,
        earcutWorkerUrl: new URL("https://cdn.jsdelivr.net/npm/@geoarrow/[email protected]/dist/earcut-worker.min.js"),
      })
    );
  }
  //the layer for rendering and adding GeoArrowPolygonLayer

  const initialViewState = {
    longitude: 11.53974,
    latitude: 48.14394,
    zoom: 7,
    pitch: 0,
    bearing: 0,
  };

  return (
    <DeckGL
      initialViewState={initialViewState}
      controller={true}
      layers={layers}
      style={{ position: 'absolute', top: 0, bottom: 0, width: '100%' }}
    >
      <Map
        mapboxAccessToken={mapboxgl.accessToken}
        mapStyle="mapbox://styles/mapbox/streets-v11"
      />
    </DeckGL>
  );
  //DeckGL: Used to render the map view of Deck.gl.
  //Map: used to render Mapbox maps.
};

// Render the React application
const rootElement = document.getElementById('root');
if (rootElement) {
  const root = createRoot(rootElement);
  root.render(<App />);
} else {
  console.error('Root element not found');
}

However, I get this error in the browser console:

deck: initialization of GeoArrowPolygonLayer({id: 'geoarrow-polygons'}): geometryColumn not Polygon or MultiPolygon Error: geometryColumn not Polygon or MultiPolygon
    at GeoArrowPolygonLayer.renderLayers (polygon-layer.js:106:15)
    at GeoArrowPolygonLayer._postUpdate (composite-layer.js:202:40)
    at GeoArrowPolygonLayer._update (layer.js:815:18)
    at GeoArrowPolygonLayer._initialize (layer.js:755:14)
    at LayerManager._initializeLayer (layer-manager.js:300:19)
    at LayerManager._updateSublayersRecursively (layer-manager.js:267:26)
    at LayerManager._updateLayers (layer-manager.js:234:14)
    at LayerManager.setLayers (layer-manager.js:174:14)
    at LayerManager.updateLayers (layer-manager.js:185:18)
    at Deck._onRenderFrame (deck.js:743:27)

I tried it exactly as documented:

import { tableFromIPC } from "apache-arrow";
import initWasm, {readParquet} from "parquet-wasm";

// Instantiate the WebAssembly context
await initWasm();

const resp = await fetch("https://example.com/file.parquet");
const parquetUint8Array = new Uint8Array(await resp.arrayBuffer());
const arrowWasmTable = readParquet(parquetUint8Array);
const arrowTable = tableFromIPC(arrowWasmTable.intoIPCStream());

This code works on the .feather file if I keep it in a simple form:

      const buffer = await response.arrayBuffer();
      // const parquetUint8Array = new Uint8Array(buffer);
      // const arrowWasmTable = readParquet(parquetUint8Array);
      // const arrowTable = tableFromIPC(arrowWasmTable.intoIPCStream());
      const arrowTable = tableFromIPC(buffer);
      setTable(arrowTable);

This is what the Parquet file looks like when opened in QGIS; it looks fine.

image

I would like to know where I should look for my mistake. I started on this two weeks ago and am still learning.

bghjmn32 avatar Jul 05 '24 15:07 bghjmn32

The Parquet file was written with

https://docs.rs/geoarrow/latest/geoarrow/io/parquet/fn.write_geoparquet.html

With Python, I can see the structure and the first rows:

Columns: Index(['ogc_fid', 'geometry'], dtype='object')
First 5 rows:
   ogc_fid                                           geometry
0        1  b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x05\x00...
1        2  b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x0b\x00...
2        3  b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x05\x00...
3        4  b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x06\x00...
4        5  b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x06\x00...
Parsed Geometry column:
0    POLYGON ((9.513218699999998 47.23636180011158,...
1    POLYGON ((9.519631199999997 47.23754530011133,...
2    POLYGON ((9.520442899999999 47.23746610011135,...
3    POLYGON ((9.520369099999998 47.236682900111525...
4    POLYGON ((9.5201902 47.236778800111495, 9.5203...
Name: geometry, dtype: object
<class 'shapely.geometry.polygon.Polygon'> POLYGON ((9.513218699999998 47.23636180011158, 9.5132844 47.23633810011159, 9.5133167 47.236379500111596, 9.513250999999999 47.23640320011159, 9.513218699999998 47.23636180011158))
<class 'shapely.geometry.polygon.Polygon'> POLYGON ((9.519631199999997 47.23754530011133, 9.519741799999998 47.23749980011136, 9.5197765 47.237538600111336, 9.519899399999998 47.23748800011135, 9.519873 47.23745840011134, 9.5201127 47.23735980011138, 9.5201974 47.23745470011136, 9.519668000000001 47.2376725001113, 9.519634999999997 47.23763550011131, 9.519691 47.23761240011134, 9.519631199999997 47.23754530011133))
<class 'shapely.geometry.polygon.Polygon'> POLYGON ((9.520442899999999 47.23746610011135, 9.5205466 47.23742130011136, 9.5206411 47.23752200011133, 9.5205374 47.23756680011134, 9.520442899999999 47.23746610011135))
<class 'shapely.geometry.polygon.Polygon'> POLYGON ((9.520369099999998 47.236682900111525, 9.520467699999998 47.23664160011154, 9.5205466 47.23672860011152, 9.520448 47.2367698001115, 9.5203776 47.23669220011152, 9.520369099999998 47.236682900111525))
<class 'shapely.geometry.polygon.Polygon'> POLYGON ((9.5201902 47.236778800111495, 9.5203 47.2367333001115, 9.520373399999999 47.23681440011149, 9.520381299999999 47.23682360011151, 9.5202714 47.236869100111484, 9.5201902 47.236778800111495))

It seems they are indeed valid polygons.

bghjmn32 avatar Jul 05 '24 15:07 bghjmn32

This isn't a parquet-wasm issue.

GeoArrow distinguishes between a native Polygon type and a WKB type. The WKB type can store polygons, but in a serialized form; the coordinates of the polygons are not directly accessible. deck.gl-layers only works with the native form, so you need to convert your data to the native type in advance.
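To make "serialized form" concrete, here is a minimal Python sketch that decodes a 2D WKB polygon by hand using only the standard library. The function name `parse_wkb_polygon` and the sample square are made up for illustration; this is not how deck.gl-layers or geoarrow read data, just a way to see that the coordinates only become available after parsing the bytes.

```python
import struct


def parse_wkb_polygon(wkb: bytes):
    """Decode a 2D WKB Polygon into a list of rings of (x, y) tuples.

    Hypothetical helper for illustration only; handles plain 2D polygons.
    """
    # Byte 0 is the byte-order flag: 1 = little endian, 0 = big endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type code; 3 means Polygon.
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 3:
        raise ValueError(f"expected WKB Polygon (type 3), got type {geom_type}")
    # Bytes 5-8 hold the number of rings.
    (num_rings,) = struct.unpack_from(endian + "I", wkb, 5)
    offset = 9
    rings = []
    for _ in range(num_rings):
        # Each ring starts with its point count, followed by x/y doubles.
        (num_points,) = struct.unpack_from(endian + "I", wkb, offset)
        offset += 4
        ring = []
        for _ in range(num_points):
            x, y = struct.unpack_from(endian + "dd", wkb, offset)
            offset += 16
            ring.append((x, y))
        rings.append(ring)
    return rings


# Build a tiny WKB polygon by hand (made-up coordinates).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)]
wkb = struct.pack("<BII", 1, 3, 1) + struct.pack("<I", len(square))
for x, y in square:
    wkb += struct.pack("<dd", x, y)

rings = parse_wkb_polygon(wkb)
```

The first nine bytes of `wkb` here are `\x01\x03\x00\x00\x00\x01\x00\x00\x00`, the same prefix as the `geometry` values printed earlier in this thread: little endian, type 3 (Polygon), one ring. The column holds opaque bytes until they are parsed like this.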

kylebarron avatar Jul 09 '24 03:07 kylebarron

In general you can use shapely.to_ragged_array to convert to something geoarrow-like.
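As a rough picture of what a "geoarrow-like" ragged layout means for polygons, the sketch below flattens polygons into one coordinate list plus two offset lists (ring offsets into the coordinates, geometry offsets into the rings). It is hand-written plain Python with a hypothetical `to_ragged` helper and made-up input; it does not call shapely and is not shapely's implementation, only an illustration of the layout idea.

```python
def to_ragged(polygons):
    """Flatten polygons into (coords, ring_offsets, geom_offsets).

    `polygons` is a list of polygons; each polygon is a list of rings;
    each ring is a list of (x, y) tuples. Hypothetical helper.
    """
    coords = []          # all vertices of all rings, back to back
    ring_offsets = [0]   # ring i spans coords[ring_offsets[i]:ring_offsets[i + 1]]
    geom_offsets = [0]   # polygon j owns rings geom_offsets[j]:geom_offsets[j + 1]
    for rings in polygons:
        for ring in rings:
            coords.extend(ring)
            ring_offsets.append(len(coords))
        geom_offsets.append(len(ring_offsets) - 1)
    return coords, ring_offsets, geom_offsets


# Made-up input: one simple polygon, and one polygon with a hole.
triangle = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]]
with_hole = [
    [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (0.0, 0.0)],
    [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (1.0, 1.0)],
]

coords, ring_offsets, geom_offsets = to_ragged([triangle, with_hole])
```

With this layout a renderer can read any vertex directly from the coordinate list without parsing bytes, which is the difference from the WKB column above.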

If you want to see the specifics of how to convert to geoarrow, you can look into the source code of lonboard. Or you can use the to_arrow method of geopandas 1.0.

kylebarron avatar Jul 09 '24 03:07 kylebarron