gnn icon indicating copy to clipboard operation
gnn copied to clipboard

Can't consume part schemas

Open SidneyLann opened this issue 2 years ago • 2 comments

Because the node_sets and edge_sets are dynamic generated, the node types and edge types are different for different graph tensors, so the different node_sets and edge_sets should have the ability to use a part of a static schema, like: edge_sets={'A-B': tfgnn.EdgeSetSpec()} or edge_sets={'B-C': tfgnn.EdgeSetSpec()} in program to use edge_sets={'A-B': tfgnn.EdgeSetSpec(), 'B-C': tfgnn.EdgeSetSpec()} in schema file. But this will have the problem: f"generator yielded an element of {values_spec} where an element of {output_signature} was expected."

SidneyLann avatar May 07 '22 20:05 SidneyLann

Million combinations need to write million schema files?

SidneyLann avatar May 07 '22 20:05 SidneyLann

Every graph tensor instance which has different node sets or edge sets need one schema, this cause impossible to write schema for million of graph tensor instances!

SidneyLann avatar Jun 04 '22 22:06 SidneyLann

Hi @SidneyLann, thanks for reaching out!

What you describe is an inherent challenge of GNN modeling. I'll try and explain below, but with all due respect for the valid question: I'm going to close this issue, because there is no actionable feature request or defect report for the TF-GNN library here.

Background:

Machine Learning is all about generalizing from training examples to new and unseen examples. But that only goes so far: an image classification model trained on pixel data won't be able to make a useful prediction about a time series or a piece of text. Likewise, but more subtly, a Graph Neural Network on heterogeneous graphs needs to learn weights to handle the distinct node sets and edge sets in its input, and their peculiar features. That won't work well with thousands of distinct node sets and edge sets.

Maybe your problem can be reframed in terms of much fewer distinct types of objects (node sets) and relations (edge sets), and their finer distinctions are encoded as feature values. (For a dozen different types you could try a one-hot encoding; for thousands of types, a trainable embedding table.)

For starters, I'd recommend to try something with at most five node sets and ten edge sets -- or even just one! --, and then experiment with the design tradeoffs.

Even though the graph schema of the model input has to be fixed, this doesn't mean every graph has to have nodes in each node set. The schema defines the available node sets and edge sets; a particular graph can leave some of them empty (present, but with zero elements) and still conform to the schema.

arnoegw avatar Mar 24 '23 09:03 arnoegw

a particular graph can leave some of them empty (present, but with zero elements)----ok then if this work. Thanks. I want to make industry product, so complex conbined must be supported.

SidneyLann avatar Mar 24 '23 10:03 SidneyLann