
How can I show the output results on a web service? Or how can I get the inference results in another application?

xujiangyu opened this issue 1 year ago · 3 comments

Prerequisites

Before submitting your question, please ensure the following:

  • [x] I am running the latest version of PowerInfer. Development is rapid, and as of now, there are no tagged versions.
  • [x] I have carefully read and followed the instructions in the README.md.
  • [ ] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).

Question Details

Please provide a clear and concise description of your question. If applicable, include steps to reproduce the issue or behaviors you've observed.

Additional Context

Please provide any additional information that may be relevant to your question, such as specific system configurations, environment details, or any other context that could be helpful in addressing your inquiry.

xujiangyu avatar Jan 22 '24 03:01 xujiangyu

Hi @xujiangyu ! If you are referring to the examples/server application, you can access it by entering the server address (e.g., 127.0.0.1:8080) in your browser. This allows you to interact with the model via a simple UI and see the outputs. For more details, please refer to the server documentation. Additionally, all inference outputs from the server are also printed to stdout.
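
For programmatic access from another application, here is a minimal Python sketch that queries the server over HTTP. It assumes the endpoint follows the upstream llama.cpp server API (`/completion` with a JSON body); the model path and port are placeholders, so please verify the details against `examples/server/README.md` in your checkout:

```python
# Minimal sketch: query the PowerInfer server from another application.
# Assumes the server was started with something like:
#   ./build/bin/server -m /path/to/model.gguf --host 127.0.0.1 --port 8080
# The /completion endpoint and JSON fields follow the upstream llama.cpp
# server API; verify against examples/server/README.md.
import json
import urllib.request

def complete(prompt: str, n_predict: int = 128) -> str:
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    req = urllib.request.Request(
        "http://127.0.0.1:8080/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

if __name__ == "__main__":
    print(complete("Explain activation sparsity in one sentence."))
```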

Most of the other example applications print their inference results to the command line. You can find usage instructions in the examples/[application] directories, where each application's README and source code are available.
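
If you want to consume that command-line output from your own program, a simple option is to run the example binary as a subprocess and capture stdout. A sketch, assuming a llama.cpp-style CLI (`-m` for the model, `-p` for the prompt, `-n` for token count) and a CMake build layout; the binary path and flags may differ in your build, so check `examples/main/README.md`:

```python
# Minimal sketch: capture the output of the main example from another program.
# The binary path and flags mirror typical llama.cpp-style usage and are
# assumptions; adjust them to match your build and the main example's README.
import subprocess

result = subprocess.run(
    ["./build/bin/main",
     "-m", "/path/to/model.gguf",   # placeholder model path
     "-p", "Hello, PowerInfer!",    # prompt
     "-n", "64"],                   # number of tokens to generate
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # the generated text is printed to stdout
```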

hodlen avatar Jan 22 '24 18:01 hodlen

Thank you for your reply. I wonder how to add background knowledge via the parameters, for example for a RAG flow. I checked the parameters of the main function and didn't find such a parameter.

xujiangyu avatar Jan 23 '24 02:01 xujiangyu

Adding background knowledge is an application-layer concern and amounts to nothing more than injecting information into the prompt. This project focuses on LLM inference and doesn't provide built-in support for that.

I suggest using a wrapper such as the llama-cpp-python library (you can use our forked version here) or the server endpoint. You can then use any mainstream orchestration framework, such as LangChain, to build the RAG workflow.
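
To make the prompt-injection idea concrete, here is a minimal sketch via llama-cpp-python: retrieve relevant snippets yourself, prepend them to the prompt, and call the model. The `Llama` usage follows the llama-cpp-python API; the model path and the `retrieve()` helper are hypothetical placeholders (in a real RAG flow, retrieval would come from a vector store, e.g. through LangChain):

```python
# Minimal RAG sketch: "background knowledge" is just text injected into the
# prompt before inference. The model path and retrieve() are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="/path/to/powerinfer-model.gguf")  # placeholder path

def retrieve(question: str) -> list[str]:
    # Placeholder: a real implementation would query a vector store
    # for passages relevant to the question.
    return ["PowerInfer exploits activation sparsity to speed up inference."]

def rag_answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Use the following background knowledge to answer.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    out = llm(prompt, max_tokens=128)
    return out["choices"][0]["text"]

print(rag_answer("What makes PowerInfer fast?"))
```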

hodlen avatar Jan 24 '24 15:01 hodlen