
Batch mode in topkpool

Open vshlroot opened this issue 4 years ago • 8 comments

Hi,

Is there a way to use batch mode in TopKPool? Right now it only supports single and disjoint mode.

My requirement: I am trying to implement Graph U-Net using your TopKPool layer. To train, I have to do backprop on batches (of, say, 50 examples), where each example is a graph instance with the same adjacency matrix.

Issue: batch mode is not supported in TopKPool, and looping over each example is not feasible. So I was wondering if there is a way to do it.

The same is the case with the SAGPool layer.

Any suggestions?

Cheers!

vshlroot avatar May 08 '20 20:05 vshlroot

You can use spektral.utils.numpy_to_disjoint to convert your batches into disjoint graphs and then use each such graph for your gradient calculations.
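A minimal sketch of what that could look like, assuming numpy_to_disjoint takes lists of node-feature and adjacency matrices and returns the stacked node features, the block-diagonal adjacency, and the graph-membership indices in that order (check the docs for the exact signature):

import numpy as np
from spektral.utils import numpy_to_disjoint

# Toy batch: 50 graphs, each with 10 nodes and 3 node features.
X_list = [np.random.rand(10, 3) for _ in range(50)]
A_list = [np.ones((10, 10)) for _ in range(50)]  # same adjacency for every graph

# X: stacked node features, A: block-diagonal adjacency,
# I: vector mapping each node to the graph it belongs to.
X, A, I = numpy_to_disjoint(X_list, A_list)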

LeviBorodenko avatar May 08 '20 20:05 LeviBorodenko

Hi

Thanks for a quick response. Much appreciated. I have some further questions:

  1. The function expects NumPy inputs: following the example at https://graphneural.network/data/#disjoint-mode, I tried passing tensors as input, but it returns NumPy arrays. Issue: my X_list and A_list come as tensors from other DL modules, and converting tensors to NumPy and back to tensors will be too slow. For example, doing this conversion after each layer of a Graph U-Net would be too slow. Any workaround?

  2. What to do when the adjacency matrix is not the same for all examples?

  3. I think the obstacle to a batch mode is computing the top-k elements for each example in the batch. If so, note that tf.math.top_k also computes top-k elements for a 2D tensor (hence it can be applied in batch mode with some reshaping); see the sketch after this list. If that is not the issue, can you explain what is? Maybe I can implement it on my own if you give me some pointers.
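A small illustration of the batched top-k idea (the shapes and score tensor here are made up for the example):

import tensorflow as tf

# tf.math.top_k works on the last axis, so a [batch, num_nodes] score tensor
# yields per-graph top-k values and indices in one call.
scores = tf.random.uniform((50, 10))          # 50 graphs, 10 nodes each
values, idx = tf.math.top_k(scores, k=4)      # shapes: [50, 4] and [50, 4]

# The indices can then be used with tf.gather(..., batch_dims=1) to select
# the corresponding node features per graph.
X = tf.random.uniform((50, 10, 3))
X_pool = tf.gather(X, idx, batch_dims=1)      # shape: [50, 4, 3]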

Cheers!

vshlroot avatar May 09 '20 09:05 vshlroot

I think it would be a good idea to develop a tensor-based disjoint-to-batch converting layer. It would not be very efficient and it would discard sparsity, but at least it would allow interaction between layers that would otherwise be fully incompatible.

I am thinking of a layer Disjoint2Batch that takes as input a disjoint node-feature signal X, its adjacency A, and segment IDs I, and returns X and A in batch mode, where the node dimension is padded to keep it constant.
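A rough, hypothetical sketch of the node-feature half of such a layer (the function name and approach are just illustrative):

import tensorflow as tf

def disjoint_nodes_to_batch(X, I):
    # X: [total_nodes, F] disjoint node features, I: [total_nodes] graph indices.
    # Group the stacked nodes by their segment IDs and zero-pad to the largest graph.
    ragged = tf.RaggedTensor.from_value_rowids(X, I)
    return ragged.to_tensor()  # [num_graphs, max_nodes, F], zero-padded

The adjacency would need analogous treatment: its sparse (row, col) indices would have to be shifted back to per-graph coordinates and scattered into a [num_graphs, max_nodes, max_nodes] dense tensor.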

LeviBorodenko avatar May 09 '20 20:05 LeviBorodenko

Hi,

I think it should be possible to have a batch-mode TopK. I've taken a look at tf.math.top_k, and the only issue I see is finding a way to use the top-k indices to reduce the adjacency matrices and features. I'll have a look at it in the following days and see how to implement it.
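For the dense case, one way to do that reduction is two batched gathers (a sketch with made-up shapes, not the library's implementation):

import tensorflow as tf

# Given per-graph top-k indices, reduce a dense batched adjacency
# [batch, N, N] to [batch, k, k] by gathering rows and then columns.
A = tf.random.uniform((50, 10, 10))
idx = tf.math.top_k(tf.random.uniform((50, 10)), k=4).indices   # [50, 4]

A_rows = tf.gather(A, idx, batch_dims=1)                 # [50, 4, 10]
A_pool = tf.gather(A_rows, idx, axis=2, batch_dims=1)    # [50, 4, 4]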

Note that if you have a single adjacency matrix and a batch of node features, then things may be more complicated. All pooling layers implemented in Spektral depend on the node features, so for each sample of X you would get a different reduced adjacency matrix. DiffPool and MinCut currently support this, but their performance is abysmal in this setting.

@LeviBorodenko the Disjoint2Batch layer is not a bad idea, again the only issue I see is converting the sparse indices of A to indices for a 3D tensor.

Cheers, Daniele

danielegrattarola avatar May 11 '20 06:05 danielegrattarola

@danielegrattarola I will submit a PR with a prototype Disjoint2Batch layer later today.

Best, Levi

LeviBorodenko avatar May 11 '20 11:05 LeviBorodenko

Hi

I have implemented a TopK layer that works for my needs (dense inputs only). It was straightforward: a bit of reshaping, tiling, etc.

On a side note, do you know of an alternative to tf.scatter_nd in batch mode? It turns out tf.gather_nd supports batch mode, but scatter_nd does not. Probably not the right place to ask this...

vshlroot avatar May 17 '20 06:05 vshlroot

@vshlroot good to know! I am currently in the middle of a rewrite of the pooling layers, I hope to push an updated version soon. If you want to share your code here I'm happy to merge it into the library.

Not sure what problems you're having with scatter_nd, shouldn't it be just a matter of defining the correct indices?

Best, Daniele

danielegrattarola avatar May 17 '20 14:05 danielegrattarola

Hi,

Unfortunately, a lot of people are involved in my project and I can't share the code as it is, but I will be more than happy to answer any questions (though I doubt you'll need my input).

I used tf.math.top_k to get indices, and one has to tile the indices to align with gather_nd's requirements. The pool layer was straightforward compared to the unpool layer, where scatter_nd has to be used. So my understanding was (and basic programming intuition also suggests) that the following should work as is:

idx = tf.math.top_k(...)
pool using gather_nd(idx, ...)
unpool using scatter_nd(idx, ...)

I mean, there should be consistency between the two functions gather_nd and scatter_nd, considering they perform exactly opposite operations. But because scatter_nd does not honor any batch_dims argument like gather_nd does, it was a bit difficult and took me some time to understand the indices (I am new to TF). The above sequence of statements would usually run right away in NumPy or PyTorch.
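One common workaround, since scatter_nd has no batch_dims argument, is to prepend an explicit batch index to every top-k index so the scatter targets the full [batch, N, F] output directly (a sketch with illustrative shapes, not the exact code from the project):

import tensorflow as tf

batch, N, k, F = 50, 10, 4, 3
X_pool = tf.random.uniform((batch, k, F))
idx = tf.math.top_k(tf.random.uniform((batch, N)), k=k).indices    # [batch, k]

batch_idx = tf.tile(tf.range(batch)[:, None], [1, k])              # [batch, k]
full_idx = tf.stack([batch_idx, idx], axis=-1)                     # [batch, k, 2]

# Unpool: place each pooled node back at its original position, zeros elsewhere.
X_unpool = tf.scatter_nd(full_idx, X_pool, shape=(batch, N, F))    # [batch, N, F]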

I also don't get why a basic assignment to certain locations of a tensor is not trivial in TF.

vshlroot avatar May 18 '20 18:05 vshlroot