AI Apprentice issues

Results: 6 issues of AI Apprentice

Prompt caching

I saw that other folks have proposed caching overlapping prompts for reuse. For example, when the system prompt includes long few-shot examples, encoding it on every request is inefficient...
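To make the request concrete, here is a minimal sketch of the idea, not the API of any particular library. `PrefixCache` and `toy_encode` are hypothetical names, and the toy encoder (one integer per character) stands in for an expensive encoding step. A real LLM server would more likely cache the model's internal state for the shared prefix, and a real tokenizer may not split cleanly at the prefix boundary, so treat this only as an illustration of "encode the long shared part once, reuse it afterwards".

```python
import hashlib
from typing import Callable, Dict, List


class PrefixCache:
    """Caches the encoded form of a shared prompt prefix (hypothetical helper)."""

    def __init__(self, encode: Callable[[str], List[int]]) -> None:
        self._encode = encode
        self._cache: Dict[str, List[int]] = {}
        self.misses = 0  # how many times the prefix actually had to be encoded

    def encode_prompt(self, prefix: str, suffix: str) -> List[int]:
        # Key the cache on a hash of the prefix so long prompts are cheap to look up.
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._encode(prefix)
        # Only the per-request suffix is encoded on every call.
        return self._cache[key] + self._encode(suffix)


def toy_encode(text: str) -> List[int]:
    # Assumption: stand-in for an expensive encoder, one integer per character.
    return [ord(ch) for ch in text]


cache = PrefixCache(toy_encode)
first = cache.encode_prompt("SYSTEM + few-shot examples\n", "question 1")
second = cache.encode_prompt("SYSTEM + few-shot examples\n", "question 2")
```

After both calls, `cache.misses` is 1: the shared prefix was encoded once, and only the short suffixes were encoded per request.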

Machine learning models are forced to use single thread

Thank you for the great tool. I have an issue when implementing RQ in my project, where I use a Transformer model (PyTorch backend). Without RQ, the model is using...

Memory increase

I used a CTranslate2-quantized version of fastchat-t5 (https://huggingface.co/limcheekin/fastchat-t5-3b-ct2) as the LLM for a question answering system. The QA system is wrapped in a REST API. The model works really well. But an...

multiple __init__, version 1.16

Hi, after upgrading to version 1.16, I noticed that __init__ is called multiple times. For example, here's the log: ``` SampleWorker.__init__: 845e506f091b47a5ae50584a45c2ec4f ([], '845e506f091b47a5ae50584a45c2ec4f') {'connection': Redis, 'job_class': None, 'queue_class': None,...

Queue-Worker System

Thank you for the great package. I'm interested in hosting an LLM on GKE. For our existing ML applications, we usually implement a queue-worker system (e.g. redis-queue or redis-celery) to...

Weave gives Validation Error using Custom Chat Template

Thank you for the package. Love it. I use LangChain's ChatOpenAI function with `stream=True`. When I add Weave to trace the chains by simply calling `weave.init()`, it throws the following error:...
