Albert Zeyer

Results: 963 comments by Albert Zeyer

Note, for the TF layers backend, we had some partial support for this, but it was also quite problematic. It was only intended for the batch dim, i.e. a batch...

Note, FlashAttention has `flash_attn_varlen_qkvpacked_func` ([API](https://github.com/Dao-AILab/flash-attention/blob/641db759ab7168e472909bc9ff1eda4a329de34f/flash_attn/flash_attn_interface.py#L1178C5-L1178C37)), with args:
```
qkv: (total, 3, nheads, headdim), where total = total number of tokens in the batch.
cu_seqlens: (batch_size + 1,), dtype torch.int32. The...
```
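To illustrate how such a packed varlen call would look (a minimal sketch, not from the original comment; the sequence lengths, head dims and `causal=True` here are just made-up example values):
```python
import torch
from flash_attn import flash_attn_varlen_qkvpacked_func

nheads, headdim = 8, 64
seq_lens = [5, 3, 7]  # three sequences of different lengths, packed into one "batch"
total = sum(seq_lens)  # total number of tokens across the batch

# Packed QKV: (total, 3, nheads, headdim), no padding between sequences.
qkv = torch.randn(total, 3, nheads, headdim, device="cuda", dtype=torch.float16)

# Cumulative sequence lengths: (batch_size + 1,), int32, starting at 0.
cu_seqlens = torch.tensor([0, 5, 8, 15], device="cuda", dtype=torch.int32)
max_seqlen = max(seq_lens)

out = flash_attn_varlen_qkvpacked_func(qkv, cu_seqlens, max_seqlen, causal=True)
# out: (total, nheads, headdim), still packed; split via cu_seqlens to get per-sequence outputs.
```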

Also note, [FlexAttention](https://pytorch.org/blog/flexattention/) also seems to support this use case (check for "Document Masking"), and in a way that doesn't need recompilation when the seq lengths change (as they would...
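To illustrate the Document Masking pattern mentioned above (a hedged sketch following the FlexAttention blog post; the packed lengths and tensor shapes are invented for illustration, and the API is `torch.nn.attention.flex_attention` from recent PyTorch):
```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

# All sequences packed into one long sequence; document_id marks which packed
# position belongs to which original sequence.
seq_lens = [512, 256, 256]
total = sum(seq_lens)
document_id = torch.repeat_interleave(
    torch.arange(len(seq_lens), device="cuda"),
    torch.tensor(seq_lens, device="cuda"),
)

def document_mask(b, h, q_idx, kv_idx):
    # Only allow attention within the same document/sequence.
    return document_id[q_idx] == document_id[kv_idx]

block_mask = create_block_mask(document_mask, B=None, H=None, Q_LEN=total, KV_LEN=total)

B, H, HEAD_DIM = 1, 8, 64
q = torch.randn(B, H, total, HEAD_DIM, device="cuda")
k = torch.randn(B, H, total, HEAD_DIM, device="cuda")
v = torch.randn(B, H, total, HEAD_DIM, device="cuda")
out = flex_attention(q, k, v, block_mask=block_mask)  # (B, H, total, HEAD_DIM)
# When the seq lengths change, only document_id / block_mask need to be rebuilt;
# per the blog post, the attention kernel itself does not need to be recompiled.
```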

I did a quick search for this error, but it's not really clear. E.g.: https://github.com/ipython/ipython/issues/14643 But this seems fixed/outdated, or only relevant for Python 3.9 and older IPython 8? Maybe...

Note, Gemini was helpful for a suggestion on a workaround:
```shell
pip install nest_asyncio
```
And then in IPython:
```python
import nest_asyncio
nest_asyncio.apply()

import sisyphus  # Or import i6_experiments
```
...

> So do we still need to take action here? I think it should be possible to import `sisyphus` within IPython (and maybe other places which would have similar conditions...

Btw, I think I saw some similar problems before, where the native RF helpers would always behave like `allow_broadcast_all_sources=True`, but the pure Python logic does not. I thought I filed...

Note, the reason g++ did not work here was that Python was not found. One solution was `module load Python/3.12.3`, or probably also putting the right Python into the `$PATH`. But the...

One small problem might be the `rng` arg to `map_seq`, and how this should behave. I don't see a good way that this would be consistent with how it behaved...

Just to clarify: Which is the inner and which is the outer dataset? (Just edit your post.)