陈晓龙（Xiaolong Chen） issues

Results: 3 issues of 陈晓龙（Xiaolong Chen）

The decoder in Informer and Autoformer does not do the job it is supposed to do

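For `Transformer`-based models, the decoder can use `teacher-forcing` during training to speed training up, and in later epochs the ratio of ground-truth values to predicted values can be controlled per epoch. In the evaluation and inference/prediction phases, however, the decoder input should be updated iteratively with the decoder's own output from the previous time step and then fed back in as input, so that decoding is actually autoregressive.

```python
# dec_in  : [batch, pred_len, d_model]; for simplicity, assume batch and d_model are both 1
# dec_out : [batch, pred_len, d_model]
dec_in = [[[0], [0], ..., [0]]]  # initial decoder input, used to predict the first time step; the model output is dec_out_1
...
dec_in = [[dec_out_1[0, 0], [0], ..., [0]]]  #...
```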
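The iterative update described above can be sketched in plain Python. This is only an illustration under stated assumptions: `toy_decoder` and `autoregressive_decode` are hypothetical names, batch and d_model are both 1 so the decoder input is a flat list, and the toy decoder simply returns "previous input plus one" in place of a real Informer or Autoformer decoder.

```python
from typing import Callable, List


def toy_decoder(dec_in: List[float]) -> List[float]:
    """Hypothetical stand-in for a decoder: output[t] depends on dec_in[t - 1]."""
    return [(dec_in[t - 1] if t > 0 else 0.0) + 1.0 for t in range(len(dec_in))]


def autoregressive_decode(
    decoder: Callable[[List[float]], List[float]], pred_len: int
) -> List[float]:
    """Run the decoder pred_len times, writing step t's output back into dec_in."""
    dec_in = [0.0] * pred_len      # zero placeholders, as in the issue text
    for t in range(pred_len):
        dec_out = decoder(dec_in)  # full-length output of this pass
        dec_in[t] = dec_out[t]     # keep only step t and feed it back as input
    return dec_in


one_shot = toy_decoder([0.0] * 4)                  # single pass over zero placeholders
iterative = autoregressive_decode(toy_decoder, 4)  # feedback loop
print(one_shot, iterative)
```

With this toy decoder the single pass never sees its own earlier predictions, while the loop does, which is the difference the issue is pointing at. The loop costs `pred_len` forward passes instead of one.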

graphcast calls a library function that has been removed upstream

In the `train` method of train_graphcast.py there are the following two lines.

```python
param_groups = timm.optim.optim_factory.add_weight_decay(model, args.weight_decay)
optimizer = torch.optim.AdamW(param_groups, lr=args.lr, betas=(0.9, 0.95))
```

I checked the `timm` documentation and the source code of `timm.optim.optim_factory`, and found no `add_weight_decay` method in either. In the code, the `args.weight_decay` argument is a float, `0.05`. Is this equivalent to the code below? I ask because I am not sure whether passing `model` in the first line has any other effect.

```python
optimizer = torch.optim.AdamW(model.parameters(), lr=args.lr, weight_decay=args.weight_decay, betas=(0.9, 0.95))
```
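As far as I recall, older `timm` releases did ship an `add_weight_decay` helper that returned two parameter groups, and newer releases appear to expose similar logic as `param_groups_weight_decay`; treat both points as assumptions to check against the installed version. The sketch below shows the grouping such a helper is usually described as doing. `split_weight_decay_groups` is a hypothetical name, not a `timm` function, and the rule "biases and 1-D tensors get no decay" is likewise an assumption.

```python
def split_weight_decay_groups(model, weight_decay=1e-5, skip_list=()):
    """Split trainable parameters into a no-decay group and a decay group.

    Hypothetical helper, not a verified copy of any timm source.
    `model` only needs a `named_parameters()` method, e.g. a torch.nn.Module.
    """
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue  # frozen parameters are left out of both groups
        # Assumption: biases and other 1-D tensors (e.g. norm scales) are not decayed.
        if len(param.shape) == 1 or name.endswith(".bias") or name in skip_list:
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": no_decay, "weight_decay": 0.0},
        {"params": decay, "weight_decay": weight_decay},
    ]


# Usage with PyTorch (not executed here):
# param_groups = split_weight_decay_groups(model, args.weight_decay)
# optimizer = torch.optim.AdamW(param_groups, lr=args.lr, betas=(0.9, 0.95))
```

If the helper does behave this way, the two snippets in the question are not strictly equivalent: passing `weight_decay=args.weight_decay` directly to `torch.optim.AdamW` also decays biases and normalization parameters, whereas per-group settings override the optimizer-level value and leave those parameters undecayed.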

AttMoE code issue: the modified MoE model code is not provided

![image](https://github.com/XiuzeZhou/RUL/assets/60612507/0f4c2a47-c6b1-48ef-9eaf-7a24180f740f)

Is the MoE here the code from `https://github.com/XiuzeZhou/mixture-of-experts`, that is, `https://github.com/davidmrau/mixture-of-experts`? In `AttMoE-NASA.ipynb` and `AttMoE-CALCE.ipynb`, however, you define the model below. The MoE arguments there are not those of the original library, so I cannot tell which parts were modified afterwards, and your repository provides neither the modified code nor the source of the original MoE code.

```python
from mixture_of_experts import MoE

class AttMoE(nn.Module):
    def __init__(self, feature_size=16, hidden_dim=8, num_layers=1, nhead=4, dropout=0.,
                 dropout_rate=0.2, num_experts=8, device='cpu'):
        super(AttMoE, self).__init__()
        self.feature_size, self.hidden_dim = feature_size, hidden_dim
        self.dropout = nn.Dropout(dropout_rate)
        self.cell...
```