neat-ml
support creating positive/negative test/train splits
We need to decide how this would work. Right now it's BYOH (bring your own holdouts), and the holdouts are supplied like this:
graph_data:
  graph:
    node_path: tests/resources/test_graphs/pos_train_nodes.tsv
    edge_path: tests/resources/test_graphs/pos_train_edges.tsv
  pos_validation:
    edge_path: tests/resources/test_graphs/pos_valid_edges.tsv
  neg_training:
    edge_path: tests/resources/test_graphs/neg_train_edges.tsv
  neg_validation:
    edge_path: tests/resources/test_graphs/neg_valid_edges.tsv
One way to support both BYOH and having NEAT make the holdouts:
graph_data:
  graph:
    node_path: tests/resources/test_graphs/pos_train_nodes.tsv
    edge_path: tests/resources/test_graphs/pos_train_edges.tsv
  holdout:
    make_holdouts:
      type: connected_holdout  # only option at the moment
      random_state: 42  # seed
      train_size: 0.8  # fraction
      edge_types:  # optional
        - biolink:interacts_with
        - biolink:has_gene_product
      verbose: bool
    existing_holdouts:  # this OR make_holdouts (not both)
      pos_validation:
        edge_path: tests/resources/test_graphs/pos_valid_edges.tsv
      neg_training:
        edge_path: tests/resources/test_graphs/neg_train_edges.tsv
      neg_validation:
        edge_path: tests/resources/test_graphs/neg_valid_edges.tsv
(Note to self: I've already implemented this; essentially I just need to expose the code to the YAML.)
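The "this OR make_holdouts (not both)" rule could be enforced when the YAML is parsed. Below is a minimal sketch of such a check, not NEAT's actual code: the function name `resolve_holdout_mode` is hypothetical, it assumes `holdout` is nested under `graph_data` as in the proposal above, and it operates on an already-parsed dict so it needs nothing beyond the standard library. The `train_size` range check is likewise an assumption about what a sensible fraction looks like.

```python
from typing import Any, Dict, Tuple

# The two mutually exclusive keys from the proposed `holdout` section.
HOLDOUT_MODES = ("make_holdouts", "existing_holdouts")


def resolve_holdout_mode(graph_data: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (mode, settings) for the holdout section of a parsed config.

    Hypothetical helper: raises ValueError unless exactly one of
    `make_holdouts` / `existing_holdouts` is supplied.
    """
    holdout = graph_data.get("holdout") or {}
    present = [mode for mode in HOLDOUT_MODES if holdout.get(mode)]
    if len(present) != 1:
        raise ValueError(
            "holdout must contain exactly one of "
            f"{HOLDOUT_MODES}, found: {present or 'none'}"
        )

    mode = present[0]
    settings = holdout[mode]

    # Assumed sanity check: if given, train_size is a fraction strictly
    # between 0 and 1.
    if mode == "make_holdouts" and "train_size" in settings:
        train_size = settings["train_size"]
        if not isinstance(train_size, (int, float)) or not 0 < train_size < 1:
            raise ValueError(f"train_size must be in (0, 1), got {train_size!r}")

    return mode, settings
```

With this shape, the caller can branch on the returned mode: build the splits from `settings` when it is `make_holdouts`, or load the supplied edge files when it is `existing_holdouts`.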