Less Wright
Hi @IssamLaradji - Monday works great. FastAI does not have lbfgs...I've had some discussions with Jeremy about how FastAI v2 can support optimizers like SLS, AliG, etc. that require passing...
> What is "IMA" short for?

Illegal Memory Access: the generic CUDA error raised when something has accessed memory outside its valid index range.
> Thanks @lessw2020. Do you think the IMA relates to the triton kernel? Can you help fix it? PP needs this fix to land. Would appreciate your help.

Hi...
Hi @kwen2501 - sure, here's the specific line that has the issue. https://github.com/pytorch/torchtitan/blob/f72a2a0da0bdfc394faaab9b3c0f35d0b6f5be50/torchtitan/models/norms.py#L198 That is loading the inputs and masking off any values past the known col length and should...
Happy to work on this (actually had started a doc this weekend). Thanks for adding the tracking issue @rohan-varma and for the feedback @stas00!
@lxuechen - thanks for the reminder here. We have been using AnyPrecision, so let's see about getting it into TorchMM for a home and then can also add documentation. Will...
> Is 2 saying that in order to have "full" compile you need to set both compile=true and compile_rmsnorm = true?

I updated the text to be more specific, but...
Hit the same issue (627 nightlies) and talked with @kwen2501 about it. This is becoming more urgent. This issue has been open for 3 weeks...would anyone be able to address...
Updating - the same error blocks tracing of Llama3-8B, so it's continuing to block on more models. Tested with 2.5.0.dev20240630+cu121
```
[rank0]:[rank0]: Traceback (most recent call last):
[rank0]:[rank0]:   File "/home/less/local/miniconda3/envs/inference/lib/python3.10/site-packages/torch/distributed/pipelining/_IR.py",...
```
> @lessw2020 thanks so much for making these changes! I think our logging to console can definitely be better and these metrics generally make a lot of sense to me....