Daniël de Kok

73 issues authored by Daniël de Kok

Most of these optimizations should be easy to add: https://pytorch.org/blog/accelerating-generative-ai-2/

feat/model
feat/generation
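
The post linked above covers torch.compile with CUDA graphs, weight-only int8/int4 quantization, a static KV cache, speculative decoding, and tensor parallelism. As a rough illustration of the first of these, a minimal sketch assuming a plain `torch.nn.Module` decoder (the function name is hypothetical, not curated-transformers API):

```python
import torch


def compile_for_decoding(decoder: torch.nn.Module) -> torch.nn.Module:
    # "reduce-overhead" enables CUDA graphs, which help the small,
    # memory-bound kernels of per-token decoding steps.
    return torch.compile(decoder, mode="reduce-overhead", fullgraph=True)
```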

Work-in-progress branch: https://github.com/shadeMe/curated-transformers/tree/staging/feature/deberta-encoder Related to #347.

type/feature
feat/model

Add support for the Mistral architecture. Work-in-progress branch: https://github.com/danieldk/curated-transformers/tree/feature/mistral

type/feature
feat/model

See https://arxiv.org/abs/2309.17453 (Efficient Streaming Language Models with Attention Sinks).

type/feature
feat/model
feat/layers
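
The paper linked above (StreamingLLM) keeps the keys/values of a few initial "attention sink" tokens plus a sliding window of the most recent tokens, which keeps decoding stable well beyond the cache budget. A minimal sketch of that eviction scheme, assuming a `(batch, heads, seq, head_dim)` cache layout; the function and parameter names are illustrative, not curated-transformers API:

```python
import torch


def evict_kv_cache(
    keys: torch.Tensor,
    values: torch.Tensor,
    n_sink: int = 4,
    window: int = 1024,
) -> tuple[torch.Tensor, torch.Tensor]:
    # Keep the first `n_sink` positions (attention sinks) plus the most
    # recent `window` positions; drop everything in between.
    seq_len = keys.size(2)
    if seq_len <= n_sink + window:
        return keys, values
    keep = torch.cat(
        (
            torch.arange(n_sink, device=keys.device),
            torch.arange(seq_len - window, seq_len, device=keys.device),
        )
    )
    return keys[:, :, keep], values[:, :, keep]
```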

In many applications we only need the output of the last layer, and letting go of references to the intermediate layers' outputs can save some memory during inference.

type/feature
feat/model
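
As an illustration of the idea (not the actual curated-transformers code path): rebinding a single activation tensor in the layer loop, instead of appending every layer's output to a list, lets earlier activations be freed as soon as the next layer has run.

```python
import torch


def encode_last_only(layers: torch.nn.ModuleList, hidden: torch.Tensor) -> torch.Tensor:
    for layer in layers:
        # Rebinding `hidden` drops the reference to the previous layer's
        # output, so its memory can be reclaimed during inference.
        hidden = layer(hidden)
    return hidden
```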

Some models deviate so much from standard transformer encoders/decoders (e.g. DeBERTa and the old Falcon architecture) that we probably should not support them in mainline Curated Transformers to avoid cluttering the...

type/maintenance

Expose more useful outputs, such as logits, through the `Generator` interface. Also fixes #311.

type/feature
feat/generation
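
Purely as an illustration, one possible shape for a richer generation output; the class and field names below are assumptions, not the actual `Generator` API:

```python
from dataclasses import dataclass

import torch


@dataclass
class GeneratorOutput:
    # Generated token identifiers, one row per sequence in the batch.
    ids: torch.Tensor
    # Per-step logits, e.g. for scoring or confidence estimates.
    logits: torch.Tensor
```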