
Unify LogitsProcessors and `outlines.generate` Dispatchers

Open lapp0 opened this issue 1 year ago • 0 comments

What behavior of the library made you think about the improvement?

Currently we implement the same code in multiple places in the repo.

  • For each inference engine / model there are distinct Outlines integrations (good).
  • For each model integration there is a distinct set of logits processors (addressed here)
  • For each model integration there is a distinct outlines.generate.* dispatch function (addressed here)

Having a distinct set of logits processors has resulted in some models lacking features they would otherwise have for free, and bugs due to discrepancies in implementation.

How would you like it to behave?

To avoid bugs and make development easier, we should encapsulate the quirks of specific model implementations within outlines.models and keep the rest of the code base model-agnostic.

https://github.com/outlines-dev/outlines/pull/956 reintroduces generic logits processors. They are designed so that logits of any type (mx.array for mlx, np.ndarray for llama-cpp, and torch.Tensor for everything else) are efficiently cast to a torch.Tensor, allowing a single torch-based logits processor to handle all logits processing work.
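To illustrate the idea only (this is not the code from the PR), here is a minimal sketch of the cast-in, process-once, cast-back pattern. It is dependency-free by assumption: a plain list of floats stands in for the torch tensor, and `array.array` and `tuple` stand in for backend-specific logits types. `GenericLogitsProcessor` and `_adapter_for` are hypothetical names chosen for this example.

```python
from array import array
from typing import Callable, List, Sequence, Tuple

# Assumption: a plain list of floats plays the role of the common tensor type
# (torch in the proposal) so that this sketch runs without third-party packages.
Common = List[float]


def _adapter_for(logits) -> Tuple[Callable, Callable]:
    """Return (to_common, from_common) converters for a backend's logits type.

    Hypothetical helper: the stand-in types below mimic backends that each
    hand over logits in their own container.
    """
    if isinstance(logits, array):
        return list, lambda xs: array(logits.typecode, xs)
    if isinstance(logits, tuple):
        return list, tuple
    if isinstance(logits, list):
        return list, list
    raise TypeError(f"unsupported logits type: {type(logits).__name__}")


class GenericLogitsProcessor:
    """Masks disallowed tokens; the logic is written once, for the common type."""

    def __init__(self, allowed_token_ids: Sequence[int]):
        self.allowed = set(allowed_token_ids)

    def process(self, logits: Common) -> Common:
        # Disallowed tokens get -inf so they can never be sampled.
        return [
            value if index in self.allowed else float("-inf")
            for index, value in enumerate(logits)
        ]

    def __call__(self, logits):
        # Cast in, process once, then cast back to the caller's own type.
        to_common, from_common = _adapter_for(logits)
        return from_common(self.process(to_common(logits)))
```

With this shape, the only backend-specific code is the pair of converters; the masking logic itself is never duplicated per model.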

Resolving this issue involves updating outlines.generate so that all other models use these generic logits processors. Per https://github.com/outlines-dev/outlines/blob/main/docs/community/versioning.md, this change should result in a major version update, as the old logits processors will be removed.

Related

  • https://github.com/outlines-dev/outlines/issues/936
  • https://github.com/outlines-dev/outlines/issues/806

lapp0 avatar Jun 11 '24 19:06 lapp0