Pangu-Weather
The shape of EarthSpecificBias
Hi, thanks for your great work.
I'm trying to reproduce your code, and one thing has been holding me back for a long time.
self.type_of_windows = (input_shape[0]//window_size[0])*(input_shape[1]//window_size[1])
self.earth_specific_bias = ConstructTensor(shape=((2 * window_size[2] - 1) * window_size[1] * window_size[1] * window_size[0] * window_size[0], self.type_of_windows, heads))
From the article, it seems that type_of_windows should be Mlat * Mpl (* Batch Size). With that value, the dimensions of earth_specific_bias match the article.
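To make the shape arithmetic concrete, here is a minimal sketch that evaluates the two expressions above. The sizes (input_shape, window_size, heads) are illustrative assumptions, not values given in this thread, and window_size is taken to be ordered (Wpl, Wlat, Wlon) as the indexing in the pseudocode suggests.

```python
# Illustrative sizes only (assumptions, not taken from this thread).
input_shape = (8, 186, 360)   # (pressure levels, latitude, longitude) after patch embedding
window_size = (2, 6, 12)      # (Wpl, Wlat, Wlon)
heads = 6

# Number of distinct window "types": one per (pressure-level, latitude) window position.
type_of_windows = (input_shape[0] // window_size[0]) * (input_shape[1] // window_size[1])

# Number of relative-position entries per window type and head, as in the pseudocode.
num_entries = ((2 * window_size[2] - 1)
               * window_size[1] * window_size[1]
               * window_size[0] * window_size[0])

bias_table_shape = (num_entries, type_of_windows, heads)
print(type_of_windows, bias_table_shape)
```

With these assumed sizes, type_of_windows is Mpl * Mlat = 4 * 31, with no factor for longitude or batch size.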
However, in the subsequent code
EarthSpecificBias = self.earth_specific_bias[self.position_index]
EarthSpecificBias = reshape(EarthSpecificBias, target_shape=(self.window_size[0]*self.window_size[1]*self.window_size[2], self.window_size[0]*self.window_size[1]*self.window_size[2], self.type_of_windows, self.head_number))
EarthSpecificBias = TransposeDimensions(EarthSpecificBias, (2, 3, 0, 1))
EarthSpecificBias = reshape(EarthSpecificBias, target_shape = [1]+EarthSpecificBias.shape)
attention = attention + EarthSpecificBias
My understanding is that EarthSpecificBias = reshape(EarthSpecificBias, target_shape = [1]+EarthSpecificBias.shape) adds a leading dimension of size 1 to EarthSpecificBias. Is that right?
But then the dimensionality of attention doesn't match that of EarthSpecificBias: the former is a four-dimensional tensor of shape (B * Mlat * Mlon * Mpl, heads, Wlat * Wlon * Wpl, Wlat * Wlon * Wpl), while the latter is a five-dimensional tensor of shape (1, Mlat * Mpl, heads, Wlat * Wlon * Wpl, Wlat * Wlon * Wpl), or with B * Mlat * Mpl in the second position if the batch size is included in type_of_windows.
Hi,
There might be some inconsistency in the pseudocode. What I can say is:
- The quantity of type_of_windows is indeed M_lat * M_pl.
- The dimensionality of attention is correct.
- You can duplicate along the direction of W_lon to match the dimensionality.