support creating positive/negative test/train splits

Open · justaddcoffee opened this issue 3 years ago · 1 comment

Need to decide how this would work. Right now it's BYOH (bring your own holdouts), and they are supplied like this:

graph_data:
  graph:
    node_path: tests/resources/test_graphs/pos_train_nodes.tsv
    edge_path: tests/resources/test_graphs/pos_train_edges.tsv

  pos_validation:
    edge_path: tests/resources/test_graphs/pos_valid_edges.tsv
  neg_training:
    edge_path: tests/resources/test_graphs/neg_train_edges.tsv
  neg_validation:
    edge_path: tests/resources/test_graphs/neg_valid_edges.tsv

One way to support either BYOH or having NEAT make holdouts:

graph_data:
  graph:
    node_path: tests/resources/test_graphs/pos_train_nodes.tsv
    edge_path: tests/resources/test_graphs/pos_train_edges.tsv

  holdout:
    make_holdouts:
      type: connected_holdout # only option at the moment
      random_state: 42 # seed
      train_size: 0.8 # fraction
      edge_types: # optional
        - biolink:interacts_with
        - biolink:has_gene_product
      verbose: true # bool

    existing_holdouts:  # this OR make_holdouts (not both)
      pos_validation:
        edge_path: tests/resources/test_graphs/pos_valid_edges.tsv
      neg_training:
        edge_path: tests/resources/test_graphs/neg_train_edges.tsv
      neg_validation:
        edge_path: tests/resources/test_graphs/neg_valid_edges.tsv
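
The "this OR make_holdouts (not both)" rule in the proposal could be enforced when the config is loaded. Below is a minimal sketch of such a check, assuming the YAML has already been parsed into a plain dict. The function name `resolve_holdout_mode` is hypothetical and not part of NEAT, and the range check on `train_size` is an assumption based on the "fraction" comment above.

```python
from typing import Any, Dict


def resolve_holdout_mode(graph_data: Dict[str, Any]) -> str:
    """Return which holdout mode a parsed `graph_data` section requests.

    Hypothetical helper (not NEAT's API): enforces that exactly one of
    `make_holdouts` / `existing_holdouts` appears under `holdout`.
    """
    holdout = graph_data.get("holdout") or {}
    has_make = "make_holdouts" in holdout
    has_existing = "existing_holdouts" in holdout

    if has_make and has_existing:
        raise ValueError(
            "holdout: specify make_holdouts OR existing_holdouts, not both"
        )
    if not (has_make or has_existing):
        raise ValueError(
            "holdout: one of make_holdouts or existing_holdouts is required"
        )

    if has_make:
        # Assumption: train_size, when given, is a fraction strictly
        # between 0 and 1.
        train_size = (holdout["make_holdouts"] or {}).get("train_size")
        if train_size is not None and not 0 < train_size < 1:
            raise ValueError("make_holdouts.train_size must be between 0 and 1")
        return "make_holdouts"

    return "existing_holdouts"


# Usage with a dict mirroring the proposed YAML layout
config = {
    "graph": {
        "node_path": "tests/resources/test_graphs/pos_train_nodes.tsv",
        "edge_path": "tests/resources/test_graphs/pos_train_edges.tsv",
    },
    "holdout": {
        "make_holdouts": {
            "type": "connected_holdout",
            "random_state": 42,
            "train_size": 0.8,
        }
    },
}
mode = resolve_holdout_mode(config)
```

Failing fast here keeps an ambiguous config from silently preferring one mode over the other.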

justaddcoffee · Feb 10 '21 22:02

(Note to self: I've already implemented this; I essentially just need to expose the code to the YAML.)

justaddcoffee · Feb 17 '21 01:02