
Can the MHANet run in real time?

Open hopkin-ghp opened this issue 2 years ago • 4 comments

Hi,

I am unsure whether the MHANet can work in real time. From my understanding, the masked attention only matches the causal scenario, so it may not be applicable to real time.

Best regards, and looking forward to your reply.

hopkin-ghp avatar Feb 01 '23 09:02 hopkin-ghp

I did not get the chance to develop the model to run on a real-time system.

It would need some more development, but I assume it's possible. You could do things like reuse the past keys and values of the attention mechanism to speed up processing, and choose a window of time-steps for the model that allows it to run fast enough on the target device to be real time. So a few compromises would need to be made, I assume. Also, a device with a GPU would make things much easier.
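For what it's worth, here is a minimal NumPy sketch of what reusing past keys and values looks like. This is not DeepXi's actual code; `CachedSelfAttention` and its weights are hypothetical stand-ins for a single attention head. Each new frame only computes its own projections and attends over the cached history, rather than re-running attention over the whole utterance:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class CachedSelfAttention:
    """Hypothetical single-head causal self-attention with a key/value cache.

    Keys and values of past frames are stored, so each incoming frame
    only requires computing one new query, key, and value projection.
    """

    def __init__(self, d_model, seed=0):
        rng = np.random.default_rng(seed)
        self.d = d_model
        # Stand-in projection weights; in practice these are learned.
        self.Wq = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.Wk = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.Wv = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.k_cache = []  # one (d,) key per past frame
        self.v_cache = []  # one (d,) value per past frame

    def step(self, x_t):
        """Process one incoming frame x_t of shape (d,)."""
        q = x_t @ self.Wq
        self.k_cache.append(x_t @ self.Wk)
        self.v_cache.append(x_t @ self.Wv)
        K = np.stack(self.k_cache)                  # (t, d): past + current keys
        V = np.stack(self.v_cache)                  # (t, d)
        attn = softmax(q @ K.T / np.sqrt(self.d))   # attends to history only
        return attn @ V                             # (d,) output for this frame
```

Note that the per-frame cost still grows with the number of cached frames; bounding the history with the window of time-steps mentioned above would keep it constant.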

Maybe a paper like this could give you some ideas: https://arxiv.org/abs/2010.11395

I could be wrong, but I believe it is very possible with some modifications.

Aaron.

anicolson avatar Feb 01 '23 21:02 anicolson

Yes, I also think it's possible for the model to run on a real-time system.

a) For a masked attention matrix (full history, 0 lookahead), such as

    1 0 0 0 0 0
    1 1 0 0 0 0
    1 1 1 0 0 0
    1 1 1 1 0 0
    1 1 1 1 1 0
    1 1 1 1 1 1

I think it behaves differently at training time and at inference time.

b) For a masked attention matrix (N history frames, 0 lookahead), where N is the window size, if N=3 we get

    1 0 0 0 0 0
    1 1 0 0 0 0
    1 1 1 0 0 0
    0 1 1 1 0 0
    0 0 1 1 1 0
    0 0 0 1 1 1

But I am not sure whether it is suitable for real-time systems. Specifically, can a model trained with mask (b) be applied to inference on streaming audio?
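To make (a) and (b) concrete, here is a small NumPy sketch (not from the DeepXi codebase; `causal_mask` and `banded_mask` are illustrative helpers) that constructs both masks:

```python
import numpy as np

def causal_mask(T):
    """Mask (a): full history, 0 lookahead (lower triangular)."""
    return np.tril(np.ones((T, T), dtype=int))

def banded_mask(T, N):
    """Mask (b): at most N history frames (incl. current), 0 lookahead."""
    m = np.tril(np.ones((T, T), dtype=int))
    return m - np.tril(m, -N)  # drop entries more than N-1 steps in the past

print(causal_mask(6))     # matches matrix (a) above
print(banded_mask(6, 3))  # matches matrix (b) above
```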

Thanks.

hopkin-ghp avatar Feb 02 '23 02:02 hopkin-ghp

Sounds like an interesting problem to investigate :) I am sure it could work with some constraints. Consider things like reusing previously computed keys to speed up processing; for example, this is done with language models to speed up decoding when generating text: https://github.com/huggingface/transformers/blob/820c46a707ddd033975bc3b0549eea200e64c7da/src/transformers/models/gpt2/modeling_gpt2.py#L984
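As a sketch of how caching could combine with the windowed mask (b) above (again, hypothetical code, not DeepXi's): once the history is capped at N frames, the key/value cache becomes a fixed-size rolling buffer, so the per-frame cost is constant, which is what a streaming real-time system needs:

```python
from collections import deque
import numpy as np

# Hypothetical streaming loop: under mask (b) (N history frames, 0
# lookahead), each frame attends over at most the last N frames, so the
# key/value cache can be a fixed-size rolling buffer (deque drops the
# oldest entry automatically once maxlen is reached).
N, d = 3, 8
rng = np.random.default_rng(0)
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
k_cache, v_cache = deque(maxlen=N), deque(maxlen=N)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for x_t in rng.standard_normal((100, d)):         # stand-in for streamed frames
    q = x_t @ Wq
    k_cache.append(x_t @ Wk)
    v_cache.append(x_t @ Wv)
    K, V = np.stack(k_cache), np.stack(v_cache)   # at most N entries each
    out_t = softmax(q @ K.T / np.sqrt(d)) @ V     # constant cost per frame
```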

anicolson avatar Feb 02 '23 02:02 anicolson

Thanks, I will read up on the relevant material.

hopkin-ghp avatar Feb 02 '23 06:02 hopkin-ghp