ModelCache
An LLM semantic caching system that improves user experience by reducing response time through cached query-result pairs.
Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding,...
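For context, a minimal sketch of what consuming TEI embeddings might look like: fetching vectors over HTTP from a locally running TEI deployment. The URL, model, and wiring into ModelCache are assumptions here, not existing ModelCache code.

```python
# Sketch: query a local TEI server's /embed endpoint for embeddings.
# The host/port are assumptions; adapt to your deployment.
import requests

TEI_URL = "http://localhost:8080/embed"  # default TEI embed route

def tei_embed(texts):
    """Return one embedding vector per input text from a TEI deployment."""
    resp = requests.post(TEI_URL, json={"inputs": texts}, timeout=10)
    resp.raise_for_status()
    return resp.json()  # list of float vectors, one per input text

if __name__ == "__main__":
    vectors = tei_embed(["hello world", "semantic caching"])
    print(len(vectors), len(vectors[0]))
```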
https://github.com/codefuse-ai/CodeFuse-ModelCache/blob/00fac9a61ac57ef90ad44c51e8e495e17dc893f3/modelcache/embedding/data2vec.py#L17C2-L22C1 — the `model` parameter is not used; it is overridden inside the code.
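A hypothetical illustration of the reported pattern (names simplified, not the actual data2vec.py code): the constructor accepts a `model` argument but then replaces it with a hard-coded value, so callers cannot actually choose the embedding model.

```python
# Hypothetical sketch of the bug pattern being reported, not real ModelCache code.
class Data2VecEmbedding:
    def __init__(self, model: str = "some/default-model"):
        # BUG pattern: the caller's argument is ignored and overwritten.
        model = "hard-coded/other-model"
        self.model_name = model
```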
Is the project still being maintained, and are there plans for future updates?
**Abstract:** 1. Implemented soft delete and hard delete in MySQL. 2. Implemented a cache eviction strategy using MySQL and Milvus. **Problems Solved:** 1. Multiple methods were not implemented, causing issues...
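A rough sketch of the soft-delete vs. hard-delete distinction in MySQL; the table and column names here are illustrative, not ModelCache's actual schema, and the matching vector in Milvus would still need to be removed separately.

```python
# Sketch of soft vs. hard delete against a MySQL cache table (hypothetical schema).
import pymysql

conn = pymysql.connect(host="localhost", user="root", password="", database="modelcache")

def soft_delete(entry_id: int) -> None:
    """Mark a cache row as deleted so lookups skip it, but it can still be restored."""
    with conn.cursor() as cur:
        cur.execute("UPDATE cache_entries SET is_deleted = 1 WHERE id = %s", (entry_id,))
    conn.commit()

def hard_delete(entry_id: int) -> None:
    """Physically remove the row; the corresponding vector should also be dropped from Milvus."""
    with conn.cursor() as cur:
        cur.execute("DELETE FROM cache_entries WHERE id = %s", (entry_id,))
    conn.commit()
```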
I am looking for a way to use ModelCache in FastChat to speed up the LLM processes. Any pointers?
This issue is created to better track my PRs for the Todo List item [Rank ability]. Background: efficiently retrieving relevant results from large-scale datasets plays a crucial role in software development...
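One way a rank step over recalled cache candidates could work is a cross-encoder rerank; the model name, threshold, and flow below are assumptions, not ModelCache's implementation.

```python
# Sketch: rerank recalled cache candidates against the query with a cross-encoder.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, candidates: list[str], threshold: float = 0.5):
    """Score each recalled candidate against the query and keep the best-scoring ones."""
    scores = reranker.predict([(query, c) for c in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return [(c, s) for c, s in ranked if s >= threshold]
```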
Hi, I've discovered a critical vulnerability in the MapDataManager class where pickle.load is used to deserialize cached data from a file. The use of pickle is inherently unsafe as it...
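To illustrate the concern, a minimal sketch of the unsafe pattern alongside a safer JSON-based alternative; file names and structure are illustrative only, not the MapDataManager code itself.

```python
# Why pickle.load on cached files is risky, and a safer alternative (illustrative only).
import json
import pickle

def load_cache_unsafe(path: str):
    # UNSAFE: unpickling can execute arbitrary code embedded in a crafted file.
    with open(path, "rb") as f:
        return pickle.load(f)

def load_cache_safe(path: str):
    # Safer: JSON deserialization only yields plain data types, never executable objects.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```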
To the developers: can ModelCache directly expose a standard OpenAI-format query interface? For example, a program that currently calls an online LLM directly could just swap the endpoint URL and model name to plug into the cache seamlessly. For programs whose query path cannot be modified, simply replacing the OpenAI-format model endpoint would let them connect to ModelCache directly. Since the whole point of a cache is fast responses, on a cache miss it should call the online LLM for the answer and stream the response back, staying compatible with these behaviors. One more question: when the conversation history or prompt is long, does that hurt overall recall accuracy? Have you considered storing the user message, the prompt, and the history separately and computing separate vectors for them? Or should we keep queries to ModelCache as short as possible and include only the user message? Looking forward to your answer, thanks.
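A sketch of the drop-in usage this question asks about: pointing an existing OpenAI-style client at a cache proxy URL. The proxy endpoint below is hypothetical; ModelCache does not necessarily expose such an interface today.

```python
# Sketch: an unmodified OpenAI-format client pointed at a hypothetical ModelCache proxy.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",   # hypothetical cache proxy endpoint
    api_key="not-needed-for-local-proxy",
)

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is semantic caching?"}],
    stream=True,  # on a cache miss the proxy would stream the upstream LLM's answer
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
```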