t2v-transformers-models
This is the repo for the container that holds the models for the text2vec-transformers module
Can we add a `--port` flag to change the port the container runs on? If Weaviate is already using 8080, this breaks.
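Until such a flag exists, a common Docker workaround is to remap the container's internal port (the inference container listens on 8080 by default) to a different host port. A minimal sketch, assuming the `semitechnologies/transformers-inference` image:

```shell
# Hypothetical example: expose the container on host port 9090 while it
# keeps listening on 8080 internally, avoiding the clash with Weaviate.
docker run -d -p 9090:8080 \
  semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
```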
Hi, I am trying to run the transformers container in a Kubernetes pod on my instance, and it is using up all the available CPU cores and throttling other pods in the cluster...
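One way to contain this, sketched below as a hypothetical pod-spec excerpt (the exact values are assumptions, not recommendations from this repo): set Kubernetes CPU limits on the container, and cap the thread pool the inference runtime spawns via `OMP_NUM_THREADS`, which PyTorch respects for intra-op CPU parallelism.

```yaml
# Hypothetical container spec excerpt: cap CPU so the inference pod
# cannot starve neighbours, and align the thread count with the limit.
resources:
  requests:
    cpu: "2"
  limits:
    cpu: "4"
env:
  - name: OMP_NUM_THREADS
    value: "4"
```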
Is there a way to get the name or path of the model being used when getting the meta information? For example, I'm looking in the ./models/model/ directory of the...
With the new change introduced by the [SentenceTransformer PR](https://github.com/weaviate/t2v-transformers-models/pull/67), I run into an issue when building the Docker image for this repo. Specifically, I am using `MODEL_NAME=hkunlp/instructor-xl` and the error...
I had an issue with the t2v-transformers today. I create embeddings using a sentence-transformers model, once with the sentence-transformers Python library and once with the t2v-transformers container. The...
Is there any plan to support batch inference instead of a single input?
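Until batch inference is supported server-side, a client can at least group its texts into fixed-size batches and issue the per-text requests concurrently. A minimal sketch of the batching half (the `/vectors` endpoint name in the comment is how the container's single-input API is commonly documented, but treat it as an assumption here):

```python
from typing import Iterator, List


def chunked(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of texts for client-side batching.

    Each batch can then be dispatched to the inference container,
    e.g. one POST to /vectors per text, run concurrently per batch.
    """
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]
```

Pairing this with `concurrent.futures.ThreadPoolExecutor` over each batch amortizes request latency even though every call still carries a single text.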
First, I don't know of any transformer models where the model max length is 500. Second, I believe any model's max length should be equal to...
After reading the vectorizer.py, I found that the models do all the computation but we only take the output from a middle(hidden) layer in T5Model and DPRModel. Is there a...
Is it possible to use Weaviate with multi-GPUs? For example, we can see only one of the GPUs being used in our machine.
Does the logic here implement batch processing of many different independent texts (i.e. independent entries in the Weaviate database)? I see batching in the sense of splitting a text into...