API endpoint to return llamafile version + other metadata

Open • k8si opened this issue 1 year ago • 1 comment

It would be useful to have an API endpoint such as /info that returns a JSON dict containing (1) the llamafile version and (2) model metadata/configuration.

Knowing the llamafile version would let me adapt client code to each release. For example, I think the output of the /embedding endpoint changed between release v0.6.2 and what's currently on master. This will be a breaking change for my LlamaIndex integration: v0.6.2 llamafiles will work but v0.6.3 llamafiles will not (in certain cases).
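To illustrate the kind of client-side check this would enable, here is a minimal sketch. It assumes the proposed /info endpoint exists and returns a JSON dict with a `version` key; neither the endpoint nor that key name exists today, and the helper names are hypothetical. The v0.6.2 cutoff comes only from my guess above about when the /embedding output changed.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the first dotted numeric version, e.g. 'v0.6.2' -> (0, 6, 2)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


# Assumption: last release believed to use the older /embedding output shape.
LAST_OLD_EMBEDDING_FORMAT = (0, 6, 2)


def uses_old_embedding_format(info: dict) -> bool:
    """Decide which /embedding response shape to expect.

    `info` stands in for the JSON dict a hypothetical /info endpoint
    would return; the "version" key is an assumed field name.
    """
    return parse_version(info["version"]) <= LAST_OLD_EMBEDDING_FORMAT


# Hypothetical payload, not real llamafile output.
example_info = {"version": "v0.6.2"}
print(uses_old_embedding_format(example_info))
```

A client could then branch on the result when parsing the /embedding response instead of failing on an unexpected shape.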

It would also be useful to have a way to get model metadata via an endpoint. For example, client systems need to know the model's max sequence length in order to truncate/batch prompts appropriately. Currently there's no way to get this info through the API.
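As a rough sketch of how a client might use that metadata, the snippet below truncates or splits an already-tokenized prompt to fit a model limit. The `max_sequence_length` key and the payload are assumptions about what a future /info response could contain, and the function names are hypothetical.

```python
from typing import List, Sequence


def truncate_tokens(tokens: Sequence[int], max_seq_len: int) -> List[int]:
    """Keep only the first max_seq_len tokens."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return list(tokens[:max_seq_len])


def chunk_tokens(tokens: Sequence[int], max_seq_len: int) -> List[List[int]]:
    """Split tokens into consecutive chunks of at most max_seq_len tokens."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return [
        list(tokens[start:start + max_seq_len])
        for start in range(0, len(tokens), max_seq_len)
    ]


# Hypothetical /info payload; the key name is an assumption.
example_info = {"max_sequence_length": 4}
token_ids = list(range(10))  # stand-in for real token ids

limit = example_info["max_sequence_length"]
print(truncate_tokens(token_ids, limit))
print(chunk_tokens(token_ids, limit))
```

Without the endpoint, the limit has to be hard-coded per model on the client side.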

k8si avatar Mar 07 '24 18:03 k8si

I rejigged your request to get a better sense of what you are asking for. I hope it matches.

Feature Request: API Endpoint for Metadata and Configuration

Request: Add an API endpoint /info that returns a JSON dictionary with:

  1. Llamafile Version: Allows client code to adapt to different releases, preventing integration issues (e.g., changes in /embedding endpoint output between v0.6.2 and later versions).
  2. Model Metadata: Provides details like the model's max sequence length, enabling user systems to truncate/batch prompts appropriately.

Use Case:

  • Version Information: Ensures compatibility across different releases.
  • Model Metadata: Helps manage input prompt length and optimize processing.

Currently, there is no way to retrieve this information via the API, so clients cannot detect output changes between releases or size prompts to the model's max sequence length. An /info endpoint would address both.

mofosyne avatar May 21 '24 16:05 mofosyne