arcgis-python-api
df.spatial.to_table error
Describe the bug This error has been present for a while. I was avoiding it by manually constructing the table and inserting the data with an arcpy cursor; however, the cursor no longer accepts the data.
As a workaround, I created a feature class with points at 0,0 and copied the data to a table, but this is temporary and I hope to find a permanent solution (see the sketch below).
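For context, a minimal sketch of what that temporary workaround might look like, assuming the DataFrame `df` from the reproduction below, a placeholder geodatabase path, and hypothetical dummy coordinate columns; the `from_xy`, `to_featureclass`, and `TableToTable` calls are standard, but the names and paths are illustrative only:

```python
# Sketch of the temporary workaround (paths and column names are hypothetical).
import os

import arcpy
import pandas as pd
from arcgis.features import GeoAccessor  # registers the .spatial accessor

gdb = r'your-gdb\path.gdb'  # placeholder geodatabase

# Add dummy coordinates so the DataFrame can be written as a point feature class.
df['x'] = 0.0
df['y'] = 0.0
sdf = pd.DataFrame.spatial.from_xy(df, x_column='x', y_column='y')

# Write the points, then copy the attributes into a standalone geodatabase table.
fc = sdf.spatial.to_featureclass(location=os.path.join(gdb, 'temp_points'),
                                 sanitize_columns=False)
arcpy.conversion.TableToTable(fc, gdb, 'test')
```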
To Reproduce Steps to reproduce the behavior:
```python
import os

import pandas as pd
from numpy import dtype
from pandas import Int64Dtype
from arcgis.features import GeoAccessor

result = '{"holeid":{"0":"test","1":"test"},"projectcode":{"0":"test","1":"test"},"geolfrom":{"0":0.0,"1":3.21},"geolto":{"0":3.21,"1":6.31},"priority":{"0":1,"1":1},"shd_column":{"0":null,"1":null},"shd_core_box":{"0":1,"1":2},"shd_line":{"0":null,"1":null},"shd_numberid":{"0":"","1":""},"shd_pallet":{"0":"","1":""},"shd_side":{"0":"","1":""},"shd_obs":{"0":null,"1":null},"shd_core_obs_big":{"0":"","1":""},"shd_street":{"0":"","1":""},"shd_floor":{"0":null,"1":null},"shd_exclude_approve":{"0":null,"1":null},"shd_exclude_date":{"0":null,"1":null},"shd_exclude_type":{"0":null,"1":null},"shd_pallet_weight_kg":{"0":null,"1":null},"shd_core_load_date":{"0":null,"1":null},"shd_core_load_resp":{"0":"","1":""},"shd_start_d":{"0":1420070400000,"1":1420070400000},"shd_depth_d":{"0":560.2,"1":560.2},"shd_interval_d":{"0":"OK","1":"OK"},"shd_prospect_d":{"0":"test","1":"test"}}'

df = pd.read_json(result, dtype={'holeid': dtype('O'),
                                 'projectcode': dtype('O'),
                                 'geolfrom': dtype('float64'),
                                 'geolto': dtype('float64'),
                                 'priority': Int64Dtype(),
                                 'shd_column': Int64Dtype(),
                                 'shd_core_box': Int64Dtype(),
                                 'shd_line': Int64Dtype(),
                                 'shd_numberid': dtype('O'),
                                 'shd_pallet': dtype('O'),
                                 'shd_side': dtype('O'),
                                 'shd_obs': dtype('O'),
                                 'shd_core_obs_big': dtype('O'),
                                 'shd_street': dtype('O'),
                                 'shd_floor': Int64Dtype(),
                                 'shd_exclude_approve': dtype('O'),
                                 'shd_exclude_date': dtype('O'),
                                 'shd_exclude_type': dtype('O'),
                                 'shd_pallet_weight_kg': dtype('float64'),
                                 'shd_core_load_date': dtype('<M8[ns]'),
                                 'shd_core_load_resp': dtype('O'),
                                 'shd_start_d': dtype('<M8[ns]'),
                                 'shd_depth_d': dtype('float64'),
                                 'shd_interval_d': dtype('O'),
                                 'shd_prospect_d': dtype('O')})

df.spatial.to_table(os.path.join(r'your-gdb\path.gdb', 'test'), sanitize_columns=False)
```
The new API returns pandas nullable Ints, so I tried to enforce the Int64 data type:

```python
to_int = ['priority', 'shd_column', 'shd_floor', 'shd_line', 'shd_core_box']
df[to_int] = df[to_int].astype('Int64')
```

Neither the default dtypes nor the explicit Int64 cast worked.
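One possible workaround (my assumption, not something confirmed in this thread) is to coerce the nullable Int64 columns back to plain NumPy dtypes before exporting, so that the arcpy/NumPy conversion never sees `pd.NA`:

```python
# Hedged sketch: replace pandas nullable Int64 columns with plain NumPy dtypes
# before export, since the arcpy/NumPy path may not understand pd.NA.
to_int = ['priority', 'shd_column', 'shd_floor', 'shd_line', 'shd_core_box']

# Option 1: cast to float64 so missing values become NaN (integer typing is lost).
df[to_int] = df[to_int].astype('float64')

# Option 2: fill missing values with a sentinel and keep a plain integer dtype.
# df[to_int] = df[to_int].fillna(-1).astype('int64')
```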
error:
```python
SystemError: <da.funcInfo object at 0x000002475FDEA330> returned NULL without setting an error
```
Screenshots

Expected behavior I suspect there is some error in arcpy.da, and because of the changes in version 2.1.0 I can't fix it on my side right now.
Platform (please complete the following information):
- OS: Windows 10
- Python API Version: 2.1.0
Part of the issue is the Object dtypes; you should convert them to the actual dtypes for the output table.
dtype('O') with all-NULL data gives no way to infer anything, so how can a strongly typed table structure, like a feature class or FGDB table, be created?
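Along those lines, a minimal sketch of a possible pre-processing step (my assumption about a reasonable approach, not documented library behavior) that gives every remaining object column an explicit text type before calling `to_table`:

```python
# Sketch: make ambiguous object columns explicitly text so the output schema
# can be created deterministically. All-null object columns become empty strings.
for col in df.columns:
    if df[col].dtype == object:
        df[col] = df[col].fillna('').astype(str)
```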
I believe that when pandas reads data from a CSV file and a column in the DataFrame ends up with data type 'O' (object), there should be an error or a warning indicating that the column will be treated as a string.
This is especially important because pandas defaults to data type 'O' when it cannot infer the dtype, and as users we may not always be diligent about specifying the correct data types. Interestingly, this issue seems to be resolved when using to_featureclass.
@achapkowski To clarify, I'm doing something like this to avoid .to_table:

I would like to suggest that the to_table and from_table functions be reviewed for a possible bug, as shown in the provided images. The issue appears to occur when arcpy.da.ExtendTable is used; when an arcpy.da cursor is used instead, the problem does not arise, as shown in the corresponding image.
The same code works with an arcpy cursor, as stated (see the sketch below).

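The screenshot is not reproduced here; a minimal sketch of that kind of cursor-based write, assuming the `df` from the reproduction above and a hypothetical output table name (the dtype-to-field-type mapping is illustrative, not part of the library):

```python
# Hedged sketch: write the DataFrame to a geodatabase table with
# arcpy.da.InsertCursor instead of df.spatial.to_table.
import arcpy
import pandas as pd
from pandas.api import types as ptypes

gdb = r'your-gdb\path.gdb'  # placeholder geodatabase
table = arcpy.management.CreateTable(gdb, 'test_cursor').getOutput(0)

def field_type(series):
    # Rough pandas-dtype-to-geodatabase-field-type mapping for this example.
    if ptypes.is_integer_dtype(series):
        return 'LONG'
    if ptypes.is_float_dtype(series):
        return 'DOUBLE'
    if ptypes.is_datetime64_any_dtype(series):
        return 'DATE'
    return 'TEXT'

for col in df.columns:
    arcpy.management.AddField(table, col, field_type(df[col]))

# Insert rows, converting pandas missing values (NaN/NaT/pd.NA) to None.
with arcpy.da.InsertCursor(table, list(df.columns)) as cursor:
    for row in df.itertuples(index=False):
        cursor.insertRow([None if pd.isna(v) else v for v in row])
```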
Furthermore, both pandas and NumPy have data type information available. If the data is of a generic type, it should be converted to a string to avoid guessing; if it cannot be cast to a string, an error should be thrown. However, if a column has a supported data type, it should be written and read properly without any issues.
The same "data" with points works, which is why I believe it should be taged as bug.
Thanks any ways.
I am seeing the same bug when using df.spatial.to_table. This worked perfectly for me before updating ArcGIS Pro to 3.2 (I was on 2.9). I can run the same script on a workstation with Pro 2.9 and it still works, but when I run it in Pro 3.2, it fails with this error: TypeError: 'field_names' must be string or non empty list or tuple of strings
#1733 and this might be related. I've been avoiding to_table because the default uses an arcpy NumPy array.