Liger-Kernel
Lenient Test
🐛 Describe the bug
https://github.com/linkedin/Liger-Kernel/blob/e249eee723978bf8610ff1ea2297d048a2417e20/test/transformers/test_swiglu.py#L46
https://github.com/linkedin/Liger-Kernel/blob/e249eee723978bf8610ff1ea2297d048a2417e20/test/transformers/test_geglu.py#L38
The tests linked above use tolerances of 1e0 for fp32 and 1e4 for bf16.
That seems a little excessive.
If the kernels don't cause models to diverge, these tests ought to pass at much lower atols.
Reproduce
No response
Versions
main
Thanks! I think this is because of the scale of the inputs. Can you take a stab at it and see if we can lower it?
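To illustrate the input-scale point, here is a minimal sketch that uses only the Python standard library. It is not the Liger-Kernel test code: it emulates bf16 rounding by hand, and the helper names (`to_bf16`, `silu`, `required_atol`) are made up for this example. It measures the smallest `atol` an allclose-style check `|out - ref| <= atol + rtol * |ref|` would need for an elementwise SwiGLU, `silu(a) * b`, at two input scales.

```python
import math
import random
import struct


def to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias, then drop the low 16 bits of the float32 pattern.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def silu(x: float) -> float:
    """Numerically stable SiLU: x * sigmoid(x)."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def required_atol(scale: float, rtol: float, n: int = 2000, seed: int = 0) -> float:
    """Smallest atol for which |out - ref| <= atol + rtol * |ref| holds on all samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n):
        # Inputs are drawn at the given scale and stored in (emulated) bf16.
        a = to_bf16(rng.gauss(0.0, 1.0) * scale)
        b = to_bf16(rng.gauss(0.0, 1.0) * scale)
        ref = silu(a) * b                      # float64 reference
        out = to_bf16(to_bf16(silu(a)) * b)    # bf16 rounding after each step
        worst = max(worst, abs(out - ref) - rtol * abs(ref))
    return max(worst, 0.0)


if __name__ == "__main__":
    for scale in (1.0, 100.0):
        print(
            f"scale={scale:g}  "
            f"atol needed with rtol=0: {required_atol(scale, 0.0):.3g}  "
            f"atol needed with rtol=1e-2: {required_atol(scale, 1e-2):.3g}"
        )
```

In this emulation the absolute error grows with the magnitude of the inputs, while the error relative to the reference stays near bf16 precision. If the same pattern holds for the real kernels, drawing smaller test inputs or comparing with a relative tolerance might let the tests pass with a much lower atol.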