Ali F
I think that if you run the disable command, the symlink created by the `ln` command is removed and the service is completely removed.
This should be in the README
does this look like a correct implementation?

```python
@implements([aten.linear.default])
def mx_mm(aten_op, args, kwargs=None):
    a = args[0]
    b = args[1]
    if len(args) > 2:
        c = args[2]
    else:
        c = None
    ...
```
@vkuzo I did see the `aten.mm` and `aten.addmm` implementations. But for some reason in my case, when `F.linear()` is called in MXLinear, `aten.linear.default` is used instead of `aten.addmm`. I don't...
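To see which aten op `F.linear` actually reaches, one option is to log ops with `TorchDispatchMode`. A minimal sketch (not from this thread; the `OpLogger` class name is made up for illustration):

```python
import torch
import torch.nn.functional as F
from torch.utils._python_dispatch import TorchDispatchMode

class OpLogger(TorchDispatchMode):
    """Record every aten op that reaches the Python dispatcher."""
    def __init__(self):
        super().__init__()
        self.ops = []

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        self.ops.append(str(func))
        return func(*args, **(kwargs or {}))

x, w, b = torch.randn(2, 4), torch.randn(3, 4), torch.randn(3)

with OpLogger() as logger:
    with torch.no_grad():
        F.linear(x, w, b)

# The logged names show whether a linear or mm/addmm variant was dispatched.
print(logger.ops)
```

Running the same snippet without the `torch.no_grad()` block lets you compare which ops are dispatched in the two contexts; the exact decomposition can differ between torch builds.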
@vkuzo I was using MXLinear to quantize the inference of Llama3.1-8B. Maybe the reason is that I was calling `F.linear` in a `torch.no_grad()` context? I finally decided to avoid...
@vkuzo Created the PR :+1:
The fact that they only mention accuracy for up to 2 or 3B parameter models in their papers and technical reports tells me that Microsoft tried to train larger models...
@vkuzo I tried different ways to trigger the use of `aten.linear` but couldn't.
@sandorex I see your point, but what harm could an optional setting cause? You could add a warning saying that the filter might not capture your orphans properly, so use...
Got the same issue. I installed torch using `conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia`, then cloned the autoawq repo and installed it from source.