Vitaliy Chiley

Results 64 comments of Vitaliy Chiley

Added lion8b to [this](https://github.com/mosaicml/llm-foundry/pull/271#issuecomment-1595337924) ![Screenshot 2023-06-16 at 4 46 54 PM](https://github.com/mosaicml/llm-foundry/assets/6439018/6b87e35c-e1f8-4119-886a-cbbbe0c1d8cc) ![Screenshot 2023-06-16 at 4 47 11 PM](https://github.com/mosaicml/llm-foundry/assets/6439018/e99ac3a0-8654-4c8d-b7ec-5e6c237c1f26) Lion8B does not hurt convergence at all. The current implementation is slightly slower.
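For readers unfamiliar with Lion: the update direction is just the sign of an interpolated momentum, which is why its optimizer state is cheap to keep in 8 bits. A minimal scalar sketch of the update rule, assuming the standard Lion formulation (the 8-bit state quantization that Lion8B adds between steps is omitted; the function name and defaults here are illustrative, not llm-foundry's API):

```python
def sign(x):
    # -1, 0, or +1
    return (x > 0) - (x < 0)

def lion_step(p, g, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Lion update: take the sign of an interpolation between the
    # momentum and the current gradient, apply decoupled weight decay,
    # then decay the momentum toward the gradient.
    u = sign(beta1 * m + (1 - beta1) * g)
    p = p * (1 - lr * wd) - lr * u
    m = beta2 * m + (1 - beta2) * g
    return p, m
```

Because the parameter update only uses the sign of the momentum, quantizing the stored momentum to int8 perturbs the update far less than it would for an Adam-style optimizer, which is consistent with the "does not hurt convergence" result above.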

Can you show more of the error printout? I'm trying to figure out which file throws this error. Note: for Triton, you should install this version of it: `triton-pre-mlir@git+https://github.com/vchiley/triton.git@triton_pre_mlir_sm90#subdirectory=python`...

https://github.com/mosaicml/llm-foundry/issues/59 and https://github.com/mosaicml/llm-foundry/issues/88 discuss eval results. Let us know if that answers your questions.

The ``` for name, non_tensor_value in object_state.non_tensors.items(): AttributeError: 'int' object has no attribute 'items' ``` error is a known issue when using torch 2, and it is fixed in Composer's...

@tginart I've been able to run the triton impl of flash attn (`attn_impl: triton`) on A100s and H100s since [this](https://github.com/mosaicml/llm-foundry/pull/260) was merged. I think people have run it on A10s...
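For reference, switching attention implementations is just a config change. A minimal sketch of the relevant YAML fragment, assuming the llm-foundry MPT config layout at the time (exact keys may differ across versions):

```yaml
model:
  name: mpt_causal_lm
  attn_config:
    attn_impl: triton  # alternatives: torch, flash
```

Everything else in the run config can stay identical, which is what makes apples-to-apples comparisons between attention implementations straightforward.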

> Triton Test Failing I've messed about with Triton, but am no expert. Not sure why the tests are or are not passing. We only use it for attn, which...

All tests pass as of the last PR merge.

With baseline usage, main and this branch have similar performance. MFU is also effectively identical; the only diff is that the update uses slightly less system memory: https://wandb.ai/mosaic-ml/streaming051_updt
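For context, MFU (model FLOPs utilization) is the ratio of achieved to peak hardware FLOPs. A rough sketch using the common ~6 FLOPs-per-parameter-per-token approximation for a forward+backward pass (the attention term is ignored, and the function name and example numbers are illustrative, not from the linked run):

```python
def mfu(n_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
    # Achieved FLOPs: ~6 FLOPs per parameter per token covers the
    # forward (2) plus backward (4) matmul work of a dense transformer.
    achieved = 6 * n_params * tokens_per_sec
    # Normalize by the aggregate peak throughput of the cluster.
    return achieved / (n_gpus * peak_flops_per_gpu)
```

Because the same formula is applied to both branches, equal throughput implies equal MFU, so matching MFU here is just a sanity check that the streaming update didn't slow training down.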

> It looks really strange, cause it's works fine with torch attention, can you help please? Can you clarify what you mean by "it's works fine with `torch` attention"? You...

I just want to verify that you used the exact same configuration for all 3 runs, and the only diff was the `attn_impl`.