wasi-nn
Add a `prompt` interface
Recent conversations (especially in the machine learning working group) suggest a need for a more specialized version of wasi-nn targeting LLMs. This change proposes such an interface, allowing users to access an ML graph directly using prompt strings instead of through the original tensor-based interface (i.e., inference).
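A minimal sketch of what such a prompt-oriented interface might look like in WIT, alongside the existing tensor-based one. All names and signatures below are illustrative assumptions for discussion, not the proposal's actual definitions:

```wit
// Hypothetical sketch — interface, resource, and function names here
// are assumptions, not the wording of the actual proposal.
interface prompt {
    // An opaque session bound to a loaded graph; implementations could
    // keep stateful items (e.g., a kv-cache) behind this handle.
    resource session {
        // Send a prompt string and receive the model's completion,
        // bypassing the tensor-based inference API.
        compute: func(prompt: string) -> result<string, error>;
    }
}
```

The key design question such a sketch surfaces is exactly the one raised below: whether tokenization and session state live behind the opaque handle or remain the caller's responsibility.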
Are you assuming that all LLMs have intrinsic tokenization? Not all foundation models are string-to-string transforms.
Is the implication that the kv-cache and other stateful items will be kept opaquely by this context rather than maintained by the caller?