陈国昊
As the title says, it would be a great help if the entire OPT model family were supported.
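Hello, I noticed that the code supports inference in int8. Is there an example of how to use the _quantize_int_activation_ function?
For the 8-bit quantized model produced by this method, how much GPU memory can be saved at runtime under an ideal implementation?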
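As a rough back-of-the-envelope sketch for the memory question above: if only the weights are considered, storage scales with bytes per parameter, so int8 needs about half the memory of fp16 and a quarter of fp32. The parameter count below is a hypothetical example, and the estimate assumes activations, quantization scales, and framework overhead are ignored, so real savings will be smaller.

```python
# Weight-only memory estimate. Assumption: activations, per-channel
# scales/zero-points, and framework overhead are all ignored.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Return the memory needed to store the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Hypothetical model size, chosen only for illustration.
num_params = 1_300_000_000

fp16_gib = weight_memory_gib(num_params, "fp16")
int8_gib = weight_memory_gib(num_params, "int8")

# Fraction of weight memory saved when going from fp16 to int8.
saving_vs_fp16 = 1 - int8_gib / fp16_gib

print(f"fp16: {fp16_gib:.2f} GiB, int8: {int8_gib:.2f} GiB")
print(f"saving vs fp16: {saving_vs_fp16:.0%}")
```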
I am verifying the effect of pairwise cross entropy and implemented it with the following code:

```python
class SoftCrossEntropyLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, y_hat, y):
        p = F.log_softmax(y_hat, 0)...
```
I'm confused about how the covariance is calculated in the code: the result of the covariance function differs from what np.cov gives. I apologize if I have misunderstood.
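Two conventions commonly explain a mismatch with np.cov: by default np.cov treats each row as a variable (`rowvar=True`) and divides by N - 1 (`bias=False`). The pure-Python sketch below is not the repository's covariance function; `covariance_matrix` is a hypothetical helper that only illustrates how the choice of normalizer changes the result.

```python
def covariance_matrix(variables, ddof=1):
    """Covariance of a list of variables, each a list of observations.

    ddof=1 divides by N - 1 (the np.cov default); ddof=0 divides by N.
    """
    n = len(variables[0])
    means = [sum(v) / n for v in variables]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(va, vb)) / (n - ddof)
            for vb, mb in zip(variables, means)
        ]
        for va, ma in zip(variables, means)
    ]


# Illustrative data: two variables with four observations each.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

unbiased = covariance_matrix([x, y], ddof=1)  # divides by N - 1
biased = covariance_matrix([x, y], ddof=0)    # divides by N

print(unbiased[0][1], biased[0][1])
```

If the code under discussion divides by N, or expects observations in rows rather than columns, its output will differ from np.cov by exactly these conventions.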
Hi, thank you for your outstanding contribution. I'd like to bring a potential concern in the implementation to your attention. It might inadvertently enable gradients during adaptation (see...
What environment configuration is needed to run the demo?
Hi, thank you for sharing such amazing work. To make MeZO easier to use, could you provide a minimal demo showing how to use MeZO as an optimizer...
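For readers wondering what such a demo might look like, here is a minimal pure-Python sketch of a MeZO-style zeroth-order step on a toy quadratic. It is not the authors' implementation: `mezo_step`, the toy loss, and all hyperparameters are hypothetical. It only illustrates the general idea of estimating a projected gradient from two forward passes and regenerating the random perturbation from a seed instead of storing it.

```python
import random


def mezo_step(params, loss_fn, lr=0.01, eps=1e-3, seed=0):
    """One zeroth-order step (sketch): two forward passes, no backprop."""

    def perturb(scale):
        # Re-seeding regenerates the same random direction z each time,
        # so z never has to be kept in memory.
        rng = random.Random(seed)
        for i in range(len(params)):
            params[i] += scale * eps * rng.gauss(0.0, 1.0)

    perturb(+1.0)                  # theta + eps * z
    loss_plus = loss_fn(params)
    perturb(-2.0)                  # theta - eps * z
    loss_minus = loss_fn(params)
    perturb(+1.0)                  # restore theta

    # Finite-difference estimate of the gradient projected onto z.
    projected_grad = (loss_plus - loss_minus) / (2.0 * eps)

    # SGD-style update along the same direction z.
    rng = random.Random(seed)
    for i in range(len(params)):
        params[i] -= lr * projected_grad * rng.gauss(0.0, 1.0)


def toy_loss(p):
    """Toy objective for illustration: sum of squares."""
    return sum(v * v for v in p)


params = [3.0, -2.0]
initial_loss = toy_loss(params)
for step in range(500):
    mezo_step(params, toy_loss, seed=step)  # fresh direction per step
final_loss = toy_loss(params)

print(initial_loss, final_loss)
```

In a real model the parameter list would be replaced by the model's tensors and `loss_fn` by a forward pass on a batch; the two-forward-pass structure stays the same.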
For example, when the input is [0.9, 0.1], the label is [0], and n_bins=1, SCELoss yields a large value because the acc_matrix computed by the code is [True,...
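For comparison, the sketch below works through the same input by hand, assuming the usual classwise definition of static calibration error: for each class and bin, take the absolute gap between mean confidence and accuracy, weight it by bin size, and average over classes. The function `static_calibration_error` is a hypothetical reference and may not match the repository's SCELoss.

```python
def static_calibration_error(probs, labels, n_bins):
    """Classwise calibration error (sketch, assumed definition)."""
    n = len(probs)
    num_classes = len(probs[0])
    total = 0.0
    for c in range(num_classes):
        bins = [[] for _ in range(n_bins)]
        for p, y in zip(probs, labels):
            # Equal-width bins on [0, 1]; p == 1.0 goes to the last bin.
            b = min(int(p[c] * n_bins), n_bins - 1)
            # Per-class "accuracy" is whether the true label equals c.
            bins[b].append((p[c], 1.0 if y == c else 0.0))
        for members in bins:
            if members:
                conf = sum(m[0] for m in members) / len(members)
                acc = sum(m[1] for m in members) / len(members)
                total += len(members) / n * abs(acc - conf)
    return total / num_classes


# The case from the report: one sample, two classes, one bin.
sce = static_calibration_error([[0.9, 0.1]], [0], n_bins=1)
print(sce)  # approximately 0.1
```

Under this definition class 0 contributes |1 - 0.9| and class 1 contributes |0 - 0.1|, so the result is about 0.1 rather than a large value.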
To conduct further research, it would be a great help if the PyTorch code were available.