tft-torch
A Python library that implements "Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting"
Change log:
1. Allow usage of torch's built-in attention implementation when attention scores are not required (see the sketch after this list).
2. Removed class arguments in TFT building blocks that took the count of categorical features when cardinalities are available. Instead, the count is inferred from the number of cardinalities.
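A minimal sketch of what such a dispatch could look like, assuming a flag such as `require_scores` (the function and flag names here are illustrative, not tft-torch's actual API):

```python
import torch
import torch.nn.functional as F

def attend(q, k, v, require_scores: bool = False):
    # q, k, v: (batch, heads, seq_len, head_dim)
    if not require_scores:
        # Delegate to PyTorch's fused scaled-dot-product attention.
        # Faster and more memory-efficient, but it does not expose
        # the attention weight matrix.
        return F.scaled_dot_product_attention(q, k, v), None
    # Manual path: materialize the scores so they can be returned
    # for interpretability.
    scores = torch.matmul(q, k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
    weights = torch.softmax(scores, dim=-1)
    return torch.matmul(weights, v), weights
```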
The parameter `num_inputs` is redundant and can be inferred directly from the `cardinalities` parameter.
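A sketch of the inference, assuming a simplified categorical-embedding building block (the class name and signature are illustrative, not the library's exact code):

```python
from typing import List
import torch.nn as nn

class CategoricalInputTransformationSketch(nn.Module):
    def __init__(self, cardinalities: List[int], embedding_dim: int):
        super().__init__()
        # One embedding table per categorical feature, so the number of
        # inputs is simply the number of cardinalities; no separate
        # `num_inputs` argument is needed.
        self.num_inputs = len(cardinalities)
        self.embeddings = nn.ModuleList(
            [nn.Embedding(card, embedding_dim) for card in cardinalities]
        )
```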
Consider wrapping the call to `self.attention` in `InterpretableMultiHeadAttention` with `with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=True):` in order to improve speed and memory efficiency.
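A minimal sketch of the suggested wrapping (the surrounding function is illustrative; the context manager is the one named above, and note it has since been deprecated in newer PyTorch releases in favor of `torch.nn.attention.sdpa_kernel`):

```python
import torch
import torch.nn.functional as F

def fast_attention(q, k, v):
    # Restrict the scaled-dot-product-attention dispatcher to the
    # flash and memory-efficient kernels; the plain math fallback,
    # which materializes the full score matrix, is disabled.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=True
    ):
        return F.scaled_dot_product_attention(q, k, v)
```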
You mentioned in www.playtika-blog.com/playtika-ai/multi-horizon-forecasting-using-temporal-fusion-transformers-a-comprehensive-overview-part-2/ that "The different heads simply take care of the interactions between the Queries and the Keys, and the outputs of the heads are aggregated and...
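For context, a sketch of the idea behind that quote, assuming the shared-value design described in the TFT paper: each head gets its own query/key projections, all heads share a single value projection, and the head outputs are averaged rather than concatenated (the class here is an illustrative reduction, not the library's implementation):

```python
import torch
import torch.nn as nn

class InterpretableMultiHeadAttentionSketch(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        # Per-head query/key projections: the heads differ only in how
        # they relate queries to keys.
        self.q_proj = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(num_heads)])
        self.k_proj = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(num_heads)])
        # A single value projection shared across all heads.
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, q, k, v):
        shared_v = self.v_proj(v)
        scale = q.size(-1) ** 0.5
        head_outputs = []
        for wq, wk in zip(self.q_proj, self.k_proj):
            scores = torch.softmax(wq(q) @ wk(k).transpose(-2, -1) / scale, dim=-1)
            head_outputs.append(scores @ shared_v)
        # Averaging (instead of concatenating) over heads keeps a single
        # attention pattern that can be inspected for interpretability.
        return self.out_proj(torch.stack(head_outputs).mean(dim=0))
```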