ensmallen

G.dump_nodes doesn't respect defaults

Open redst4r opened this issue 1 year ago • 0 comments

Hi,

I just noticed that the default values of Graph.dump_nodes() given in the docs differ from what the code actually does. From the Python docs:

> G.dump_nodes?
...
verbose: bool = True
    Wether to show a loading bar while writing to file.
separator: str = '\t'
    What separator to use while writing out to file.
header: bool = True
    Wether to write out the header of the file.
nodes_column_number: int = 0
    The column number where to write the nodes.
nodes_column: str = "id"
    The name of the column of the nodes.
node_types_column_number: int = 1
    The column number where to write the node types.
node_type_column: str = "category"
    The name of the column of the node types.

At least, I assumed the values shown after the parameter names are the defaults.

Turns out that if you run

# Just using Hetionet as an example here
from grape.datasets.hetionet import Hetionet
G = Hetionet()
G.dump_nodes('/tmp/nodes.tsv')

you actually get this:

> head /tmp/nodes.tsv
node_name
Anatomy::UBERON:0000002
Anatomy::UBERON:0000004
...
...

whereas I'd expect (from the defaults):

id    category
Anatomy::UBERON:0000002   Anatomy
...

So there are two discrepancies:

  • nodes_column defaults to node_name instead of id
  • node_type_column is supposed to default to category, but the node type column is missing altogether
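
For anyone who wants to check their own output for the same mismatch, here is a minimal sketch using only the Python standard library. The helper name missing_columns is my own and is not part of grape or ensmallen; the expected column names are simply the documented defaults quoted above, and the temporary file stands in for what G.dump_nodes actually wrote for me.

```python
import csv
import os
import tempfile


def missing_columns(path, expected=("id", "category"), separator="\t"):
    """Return the expected column names that are absent from the header row.

    Hypothetical helper for illustration; not part of grape/ensmallen.
    """
    with open(path, newline="") as handle:
        # Read only the first row; an empty file yields an empty header.
        header = next(csv.reader(handle, delimiter=separator), [])
    return [name for name in expected if name not in header]


# Stand-in for the file produced by G.dump_nodes in the report above,
# so the check can be run without installing grape.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("node_name\nAnatomy::UBERON:0000002\nAnatomy::UBERON:0000004\n")

observed_missing = missing_columns(tmp.name)
os.remove(tmp.name)
print(observed_missing)  # ['id', 'category']
```

Pointing missing_columns at a real dump (e.g. /tmp/nodes.tsv) would show the same thing: neither documented default column name appears in the header.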

Not a big issue, but confusing behavior at first! Thanks for this great software package!!

redst4r, Aug 27 '23 12:08