cugraph [BUG] MG Property Graph add_vertex_data crashes

Describe the bug

When I try to add data, I get the following error:

Cell In [21], line 17
     13 #ddf = gdf
     15 print(f"read recs {start_id} to {end_id} and now adding to PG")
---> 17 pG.add_vertex_data(ddf, vertex_col_name='id', type_name='paper')
     19 #print(f"PG now contains {pG.get_num_vertices()} ")
     22 rec_read = end_id

File ~/anaconda3/envs/cugraph_dev/lib/python3.9/site-packages/cugraph-22.8.0a0+166.gd98ddc69-py3.9-linux-x86_64.egg/cugraph/dask/structure/mg_property_graph.py:405, in EXPERIMENTAL__MGPropertyGraph.add_vertex_data(self, dataframe, vertex_col_name, type_name, property_columns)
    398 # Ensure that both the predetermined vertex ID column name and vertex
    399 # type column name are present for proper merging.
    400
    401 # NOTE: This copies the incoming DataFrame in order to add the new
    402 # columns. The copied DataFrame is then merged (another copy) and then
    403 # deleted when out-of-scope.
    404 tmp_df = dataframe.copy()
--> 405 tmp_df[self.vertex_col_name] = tmp_df[vertex_col_name]
    406 # FIXME: handle case of a type_name column already being in tmp_df
    407 tmp_df[self.type_col_name] = type_name

...

File ~/anaconda3/envs/cugraph_dev/lib/python3.9/site-packages/numpy/core/_methods.py:44, in _amin(a, axis, out, keepdims, initial, where)
     42 def _amin(a, axis=None, out=None, keepdims=False,
     43           initial=_NoValue, where=True):
---> 44     return umr_minimum(a, axis, None, out, keepdims, initial, where)

TypeError: '<=' not supported between instances of 'str' and 'int'

Sep 12 '22 13:09 BradReesWork

Thanks. I can reproduce. This is actually an error in dask. Here is an example that goes through a similar code path and gives the same error:

import dask.dataframe as dd
import pandas as pd

# Column names mix strings ("a", "b") with an integer (1).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], 1: [5, 6]})
ddf = dd.from_pandas(df, npartitions=2)
ddf["c"] = df["a"]  # <-- gives the error you see

df.mean(axis=0)  # pandas, for comparison
ddf.mean(axis=0)  # <-- gives similar error
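
The last frames of the traceback end in a NumPy minimum reduction, which suggests that column labels of different types are being ordered against each other. As a rough illustration in plain Python (this is not dask's actual code path, and the helper name `comparison_error` is made up for this sketch), comparing a string label with an integer label raises the same message:

```python
def comparison_error(left, right):
    """Return the TypeError message raised by `left <= right`, or None if it succeeds."""
    try:
        left <= right
    except TypeError as exc:
        return str(exc)
    return None


# Labels of the same type can be ordered without trouble.
same_type = comparison_error("a", "b")

# A string label against an integer label cannot be ordered.
mixed_type = comparison_error("a", 1)

print(same_type)
print(mixed_type)
```

Any reduction that needs an ordering over the labels (a minimum, a sort) would presumably hit this once a single label has a different type from the rest.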

A workaround is to have all column names be the same dtype:

gdf.columns = gdf.columns.astype(str)
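
Before applying the workaround it may help to check whether the labels are actually mixed, and whether converting them to strings would make two labels collide (for example `1` and `"1"`). The following is a minimal sketch on a plain list of labels; the helper names `label_type_names` and `stringify_labels` are hypothetical and not part of cugraph, cudf, or dask:

```python
from collections import Counter


def label_type_names(labels):
    """Return the set of Python type names found among the column labels."""
    return {type(label).__name__ for label in labels}


def stringify_labels(labels):
    """Convert every label to str, refusing to proceed if two labels would collide."""
    converted = [str(label) for label in labels]
    duplicates = sorted(name for name, count in Counter(converted).items() if count > 1)
    if duplicates:
        raise ValueError(f"labels collide after str conversion: {duplicates}")
    return converted


# Labels mirroring the reproducer: two strings and one integer.
labels = ["a", "b", 1]

print(sorted(label_type_names(labels)))  # more than one type name means mixed labels
fixed = stringify_labels(labels)
print(fixed)
```

On a real DataFrame the same check could be run on `list(gdf.columns)` before calling `add_vertex_data`, with the converted list assigned back to `gdf.columns`.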

Sep 12 '22 21:09 eriknw

Fixing in https://github.com/dask/dask/pull/9485

Sep 13 '22 05:09 eriknw

This issue can be closed.

This is fixed in dask version 2022.9.1, which was released on September 19.

Sep 20 '22 03:09 eriknw

We may need to pin a minimum version of dask: dask>=2022.9.1
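
Until such a pin is in place, a runtime guard is one way to detect an older dask. This is only a sketch using the standard library; `MIN_DASK`, `parse_release`, and `dask_is_new_enough` are hypothetical names, and the parser only looks at the leading numeric components, so pre-release and local suffixes are ignored:

```python
import re

# Minimum dask release that contains the fix, per this thread.
MIN_DASK = (2022, 9, 1)


def parse_release(version):
    """Extract the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def dask_is_new_enough(version, minimum=MIN_DASK):
    """Return True if `version` is at least `minimum` (tuple comparison)."""
    return parse_release(version) >= minimum


for candidate in ("2022.7.1", "2022.9.0", "2022.9.1", "2022.10.0"):
    print(candidate, dask_is_new_enough(candidate))
```

In practice the argument would be `dask.__version__`. Comparing integer tuples rather than raw strings matters here because `"2022.10.0"` sorts before `"2022.9.1"` as text. A dedicated parser such as `packaging.version.Version` would handle pre-release suffixes more carefully than this sketch does.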

The workaround is to use only strings for column names, with no mixing of strings, ints, etc.

Sep 28 '22 15:09 rlratzel

closed via https://github.com/dask/dask/pull/9485

Oct 19 '22 15:10 rlratzel
