[Feature] Create vector embeddings

Open Bytes-Explorer opened this issue 1 year ago • 1 comment
Search before asking

  • [X] I searched the issues and found no similar issues.

Component

Transforms/Other

Feature

The goal is to add a new module that creates embeddings from document text. The module will take parquet files as input, where each row may contain either a chunk or a full document. For every row, it will compute an embedding of the row's text and store it in a new column of the output parquet file.

This should be added as a new module alongside the other language transforms here: https://github.com/IBM/data-prep-kit/tree/dev/transforms/language
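
For illustration only, the row-by-row behaviour described above could be sketched roughly as follows. This is not the implementation from the eventual PR. The column names `contents` and `embeddings`, the helper names, and the hash-based `toy_embedder` are all assumptions made up for this sketch; a real transform would plug in an actual embedding model in place of `toy_embedder`. The parquet I/O assumes `pyarrow` is installed.

```python
import hashlib
import math
from typing import Callable, Dict, List, Sequence

# An embedder maps a batch of texts to one vector per text (hypothetical type alias).
Embedder = Callable[[Sequence[str]], List[List[float]]]


def toy_embedder(texts: Sequence[str], dim: int = 8) -> List[List[float]]:
    """Placeholder embedder: hashes tokens into a fixed-size, L2-normalised vector.

    It stands in for a real embedding model and is NOT semantically meaningful.
    """
    vectors = []
    for text in texts:
        vec = [0.0] * dim
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[digest[0] % dim] += 1.0  # bucket chosen by the first hash byte
        norm = math.sqrt(sum(v * v for v in vec))
        vectors.append([v / norm for v in vec] if norm else vec)
    return vectors


def add_embeddings(columns: Dict[str, list], embed: Embedder = toy_embedder,
                   text_column: str = "contents",
                   output_column: str = "embeddings") -> Dict[str, list]:
    """Return a copy of the column dict with one embedding per row added."""
    # Missing text is treated as an empty string (an assumption of this sketch).
    texts = ["" if t is None else str(t) for t in columns[text_column]]
    result = dict(columns)
    result[output_column] = embed(texts)
    return result


def embed_parquet_file(src: str, dst: str, embed: Embedder = toy_embedder,
                       text_column: str = "contents",
                       output_column: str = "embeddings") -> None:
    """Read a parquet file, add the embedding column, and write the result."""
    import pyarrow as pa  # imported lazily; requires `pip install pyarrow`
    import pyarrow.parquet as pq

    table = pq.read_table(src)
    raw = table.column(text_column).to_pylist()
    texts = ["" if t is None else str(t) for t in raw]
    vectors = pa.array(embed(texts), type=pa.list_(pa.float32()))
    pq.write_table(table.append_column(output_column, vectors), dst)
```

Keeping the embedder as a plain callable means the same row-handling code works whether a row holds a chunk or a whole document, and the model can be swapped without touching the parquet handling.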

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Bytes-Explorer, Jul 25 '24 06:07

Done in https://github.com/IBM/data-prep-kit/pull/461

dolfim-ibm, Jul 31 '24 20:07