NeuralOperators.jl

Fix GNO example

Open yuehhua opened this issue 3 years ago • 25 comments

yuehhua avatar Jul 11 '22 03:07 yuehhua

Codecov Report

Merging #79 (c51ef55) into main (54602e6) will decrease coverage by 5.58%. The diff coverage is 81.81%.

```diff
@@            Coverage Diff             @@
##             main      #79      +/-   ##
==========================================
- Coverage   95.70%   90.11%   -5.59%
==========================================
  Files          10       10
  Lines         163      172       +9
==========================================
- Hits          156      155       -1
- Misses          7       17      +10
```

| Impacted Files | Coverage Δ |
| --- | --- |
| src/graph_kernel.jl | 67.74% <81.81%> (-32.26%) :arrow_down: |

:mega: Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

codecov[bot] avatar Jul 11 '22 03:07 codecov[bot]

Still needs some revision; discussed F2F.

foldfelis avatar Jul 15 '22 04:07 foldfelis

What is the loss after training?

YichengDWu avatar Jul 15 '22 05:07 YichengDWu

@MilkshakeForReal Please take a look at this; it is the relative L2 loss.

Update:

Sorry, I misunderstood the question. If you are asking about the value of the loss after training, I'll get back to you later, since the example is not implemented correctly yet.

foldfelis avatar Jul 15 '22 05:07 foldfelis
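
For concreteness, a relative L² loss can be written in a few lines. This is a generic sketch of "relative L2", not necessarily the exact reduction the example uses:

```julia
using LinearAlgebra

# Generic relative L² loss: ‖ŷ − y‖₂ / ‖y‖₂ (any batch reduction is left out).
rel_l2(ŷ, y) = norm(ŷ .- y) / norm(y)

rel_l2(fill(1.1f0, 10), ones(Float32, 10))   # ≈ 0.1
```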

Yes, I'm asking about the value. The reason I'm asking is that I don't see any magical power in GraphKernel here. It is just an NNConv and could really be almost any GNN conv layer, and there is no justification in the paper for using it. The power of GNO is likely due to the encoder and sampling (or maybe something else?). Please let me know if the loss value is available, even if it's a large one.

YichengDWu avatar Jul 15 '22 06:07 YichengDWu

I am not sure what kind of magic you expect to see in GNO. Just like the equations you mentioned in #74, isn't it nothing but a message-passing NN?

foldfelis avatar Jul 15 '22 07:07 foldfelis

The main idea of a neural operator is to learn the mapping in spectral space. In GNO, the implementation is done via a graph-signal Laplace transform and a message-passing neural network. @yuehhua is an expert on GNNs and may want to explain this in more detail.

foldfelis avatar Jul 15 '22 07:07 foldfelis
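
For readers following along, here is a minimal, hedged sketch of the kind of kernel message-passing update being discussed (one layer of the form $v(x) \leftarrow \sigma\big(W v(x) + \frac{1}{|N(x)|}\sum_{y \in N(x)} \kappa(x, y, a(x), a(y))\, v(y)\big)$). It assumes a 2-D grid with a scalar coefficient $a$ and uses only plain Flux; all names and layouts are illustrative, not the package's implementation:

```julia
using Flux

# Minimal GNO-style kernel layer sketch (not NeuralOperators.jl's code).
struct KernelLayer
    κ::Chain    # edge network producing a flattened d×d kernel matrix
    W::Dense    # pointwise linear term
    d::Int
end

# Edge input is [x_i; x_j; a_i; a_j] ∈ ℝ⁶ for a 2-D grid with scalar a.
KernelLayer(d::Int; hidden = 64) =
    KernelLayer(Chain(Dense(6 => hidden, relu), Dense(hidden => d * d)),
                Dense(d => d), d)

# v: d×n latent features, coords: 2×n coordinates, a: length-n coefficients,
# edges: vector of (i, j) index pairs.
function (l::KernelLayer)(v, coords, a, edges)
    d, n = size(v)
    msg = zeros(Float32, d, n)
    deg = zeros(Float32, n)
    for (i, j) in edges
        e = vcat(coords[:, i], coords[:, j], Float32[a[i], a[j]])
        K = reshape(l.κ(e), d, d)            # κ(x_i, x_j, a(x_i), a(x_j))
        msg[:, i] .+= K * v[:, j]
        deg[i] += 1
    end
    relu.(l.W(v) .+ msg ./ max.(deg, 1)')    # mean-aggregated message passing
end

layer  = KernelLayer(16)
coords = rand(Float32, 2, 50); a = rand(Float32, 50)
edges  = [(i, j) for i in 1:50 for j in 1:50 if i != j && rand() < 0.1]
v      = layer(randn(Float32, 16, 50), coords, a, edges)   # 16×50
```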

I think we are all doing science; there should be no magic. The only advantage GNO has over FNO is sampling: GNO doesn't strictly require grid sampling of the input functions, but FNO does. GNO should achieve at most the same performance as FNO.

yuehhua avatar Jul 15 '22 07:07 yuehhua

If you expect to get GNO by just implementing the convolutional layer, you are expecting magic. As I already said, you must also implement the encoder and the Nyström approximation.

YichengDWu avatar Jul 15 '22 07:07 YichengDWu

The Nyström approximation is already there, and the encoder is the GraphKernel. As for the graph convolutional layer, it is just a generalization of a regular convolutional layer. The only "magic" lies in the non-linearity.

yuehhua avatar Jul 15 '22 07:07 yuehhua

Please read Equation 7 carefully and make sure you understand what each term means.

YichengDWu avatar Jul 15 '22 07:07 YichengDWu

So, you mean the projection $v_0(x) = P(x, a(x), a_{\epsilon}(x), \nabla a_{\epsilon}(x)) + p$?

yuehhua avatar Jul 15 '22 08:07 yuehhua
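
For illustration, that projection/lifting amounts to a pointwise affine map over the stacked node features. A minimal sketch, assuming a 2-D grid and scalar $a(x)$ (the sizes and names below are illustrative only):

```julia
using Flux

# Lifting map P from Eq. 7: v₀(x) = P·[x; a(x); a_ε(x); ∇a_ε(x)] + p,
# applied independently at every sampled grid point.
d_in = 2 + 1 + 1 + 2            # x ∈ ℝ², a(x), a_ε(x), ∇a_ε(x) ∈ ℝ²
d    = 64                       # latent channel width
P    = Dense(d_in => d)         # affine map; its bias plays the role of p

features = randn(Float32, d_in, 100)   # one column per sampled grid point
v0 = P(features)                        # size (d, 100)
```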

@MilkshakeForReal About the encoder, you mean the GaussianNormalizer?

foldfelis avatar Jul 15 '22 08:07 foldfelis

Of course the Nyström approximation is the more important thing. It is not clear where you have implemented it. I'll check back, but now I really have to go to bed.

YichengDWu avatar Jul 15 '22 08:07 YichengDWu

@MilkshakeForReal Take your time, and please feel free to open a PR if you still think there is anything wrong 😄

foldfelis avatar Jul 15 '22 08:07 foldfelis

The $a_{\epsilon}(x)$ is the encoded $a(x)$ and the encoder is just a linear transform...

foldfelis avatar Jul 15 '22 08:07 foldfelis

OK, I'm back. In the spirit of scientific research, I don't want to discourage you from trying different things out. I do want to see the loss(es) first. But for now, allow me to comment on Equation 7.

No, it's not the GaussianNormalizer. That normalizes the data, it does not smooth it. The smoothed functions are already generated in the data, see here. Unfortunately, there is no code showing how they are generated (or I missed it). The linear map P is less important; I was not talking about that. Please note the motivation behind it:

> Due to the smoothing effect of the inverse elliptic operator in (3) with respect to the input data a (and indeed f when we consider this as input)

So the authors know how the solution operator acts on a(x), and encode that information into the input. The model they actually use is restricted to the elliptic PDE (4), and in the original paper they only test their model on this particular PDE :sweat_smile:. You can try removing $a_{\epsilon}(x)$ from the input and see how it affects the performance. It will show how general GNO actually is.

YichengDWu avatar Jul 15 '22 14:07 YichengDWu
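
The suggested ablation amounts to dropping the smoothed channel (and its gradient) from the node features before lifting. A tiny sketch with dummy arrays; the layouts (one column per node) are assumptions:

```julia
n          = 100
coords     = rand(Float32, 2, n)      # x
a          = rand(Float32, 1, n)      # a(x)
a_eps      = rand(Float32, 1, n)      # a_ε(x), the smoothed coefficient
grad_a_eps = rand(Float32, 2, n)      # ∇a_ε(x)

feats_full    = vcat(coords, a, a_eps, grad_a_eps)   # full Eq. 7 input
feats_ablated = vcat(coords, a)                       # a_ε removed for the test
```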

> GNO doesn't strictly require grid sampling of the input functions, but FNO does.

No, FNO does not require that; only the FFT does. A general DFT can be performed on a nonuniform grid.

YichengDWu avatar Jul 15 '22 14:07 YichengDWu

> No, FNO does not require that; only the FFT does. A general DFT can be performed on a nonuniform grid.

Oh, yeah, that's true. But could you show me a nonuniform general DFT in practice?

yuehhua avatar Jul 15 '22 14:07 yuehhua
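
For reference, a direct (slow) nonuniform DFT is only a few lines. In practice one would reach for a NUFFT library such as FINUFFT.jl or NFFT.jl; treat this purely as an illustration that the transform itself does not require uniform sampling:

```julia
# Naive O(N·K) type-1 nonuniform DFT: Fourier coefficients of modes -K..K
# from samples f_j taken at arbitrary points x_j ∈ [0, 1).
function nudft(x::AbstractVector, f::AbstractVector, K::Integer)
    [sum(f[j] * cis(-2π * k * x[j]) for j in eachindex(x)) for k in -K:K]
end

x = sort(rand(128))              # nonuniform sample locations in [0, 1)
f = sin.(2π .* x)                # sampled function values
f̂ = nudft(x, f, 8)               # coefficients for modes -8..8
```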

@yuehhua Is this ready to merge?

foldfelis avatar Jul 17 '22 18:07 foldfelis

@MilkshakeForReal For operating on an irregular domain, you may want to check Geo-FNO.

@foldfelis CPU works, but not GPU. So, currently, not yet.

yuehhua avatar Jul 19 '22 21:07 yuehhua

Thanks for the info. Is the Nyström approximation implemented in GraphSignals.generate_grid?

YichengDWu avatar Jul 21 '22 19:07 YichengDWu

@MilkshakeForReal To my understanding, there is no separate implementation of the Nyström approximation; the Nyström approximation is just a way to approximate the kernel in an RKHS. It is what makes the model actually computable; otherwise it is just an abstract mathematical concept. Please check Section 3 of the paper, p. 8.

yuehhua avatar Jul 21 '22 23:07 yuehhua

I wouldn't need to, unless a new version was just published. If you haven't implemented it, that's what I needed to know.

YichengDWu avatar Jul 22 '22 00:07 YichengDWu

In other words, GraphKernel, which is the GNN approximator itself, is the result of the Nyström approximation of the kernel for the PDE.

yuehhua avatar Jul 22 '22 02:07 yuehhua
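
To make that connection explicit, here is a hedged sketch of the Nyström/Monte-Carlo view: the kernel integral $(Kv)(x) = \int \kappa(x, y)\, v(y)\, dy$ is approximated by an average over a random subsample of nodes, which is exactly a mean-aggregated message-passing step. The kernel, features, and sampling below are illustrative, not the package's API:

```julia
# Approximate (Kv)(x_i) ≈ (1/m) Σ_{j ∈ subsample} κ(x_i, x_j) v_j for each node.
function kernel_integral(κ, v::AbstractVector, coords::AbstractMatrix; nsample = 32)
    n = size(coords, 2)
    map(1:n) do i
        idx = rand(1:n, nsample)                               # Nyström subsample
        sum(κ(coords[:, i], coords[:, j]) * v[j] for j in idx) / nsample
    end
end

κ(x, y) = exp(-sum(abs2, x .- y))             # toy kernel for illustration
coords  = rand(2, 200)
v       = randn(200)
Kv      = kernel_integral(κ, v, coords)        # length-200 approximation of (Kv)(xᵢ)
```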

@yuehhua Great work, Tks

foldfelis avatar Aug 20 '22 10:08 foldfelis
