Add support for passing the dimension size when calling OpenAI's create embeddings API

Open prabhupant opened this issue 1 year ago • 2 comments

Currently, EmbeddingRequest does not let us specify the dimension size:

package com.theokanning.openai.embedding;

import lombok.*;

import java.util.List;

/**
 * Creates an embedding vector representing the input text.
 *
 * https://beta.openai.com/docs/api-reference/embeddings/create
 */
@Builder
@NoArgsConstructor
@AllArgsConstructor
@Data
public class EmbeddingRequest {

    /**
     * The name of the model to use.
     * Required if using the new v1/embeddings endpoint.
     */
    String model;

    /**
     * Input text to get embeddings for, encoded as a string or array of tokens.
     * To get embeddings for multiple inputs in a single request, pass an array of strings or array of token arrays.
     * Each input must not exceed 2048 tokens in length.
     * <p>
     * Unless you are embedding code, we suggest replacing newlines (\n) in your input with a single space,
     * as we have observed inferior results when newlines are present.
     */
    @NonNull
    List<String> input;

    /**
     * A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse.
     */
    String user;
}

So the embedding size we get back is always the model's default (1536 for text-embedding-ada-002 and text-embedding-3-small). Generating embeddings with a reduced size is currently not possible.
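For illustration, here is a rough sketch of what the change could look like. It is not the library's actual code: the class name `EmbeddingRequestSketch` and its hand-written builder are hypothetical stand-ins for the Lombok-generated ones, so that the example compiles without extra dependencies. The only real name is the `dimensions` request parameter from the OpenAI embeddings API, which is optional and supported by the text-embedding-3 models. In the Lombok class above, the equivalent change would presumably be a single nullable `Integer dimensions;` field.

```java
import java.util.List;
import java.util.Objects;

/**
 * Hypothetical sketch of an embedding request with an optional dimension size.
 * This is NOT the library's EmbeddingRequest; the builder is written by hand
 * here only to avoid depending on Lombok.
 */
public class EmbeddingRequestSketch {

    private final String model;
    private final List<String> input;
    private final String user;

    /** Assumed new field. Null means "omit it and use the model's default size". */
    private final Integer dimensions;

    private EmbeddingRequestSketch(Builder b) {
        this.model = b.model;
        this.input = Objects.requireNonNull(b.input, "input must not be null");
        this.user = b.user;
        this.dimensions = b.dimensions;
    }

    public String getModel() { return model; }
    public List<String> getInput() { return input; }
    public String getUser() { return user; }
    public Integer getDimensions() { return dimensions; }

    public static Builder builder() { return new Builder(); }

    public static final class Builder {
        private String model;
        private List<String> input;
        private String user;
        private Integer dimensions;

        public Builder model(String model) { this.model = model; return this; }
        public Builder input(List<String> input) { this.input = input; return this; }
        public Builder user(String user) { this.user = user; return this; }

        /** Rejects non-positive sizes early; null leaves the field unset. */
        public Builder dimensions(Integer dimensions) {
            if (dimensions != null && dimensions <= 0) {
                throw new IllegalArgumentException("dimensions must be positive");
            }
            this.dimensions = dimensions;
            return this;
        }

        public EmbeddingRequestSketch build() { return new EmbeddingRequestSketch(this); }
    }

    public static void main(String[] args) {
        EmbeddingRequestSketch request = EmbeddingRequestSketch.builder()
                .model("text-embedding-3-small")
                .input(List.of("The quick brown fox"))
                .dimensions(256) // ask for a reduced embedding size
                .build();
        System.out.println("dimensions = " + request.getDimensions());
    }
}
```

Keeping the field as a nullable `Integer` rather than an `int` means existing callers that never set it would keep getting the model's default size, provided the serializer is configured to skip null fields.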

prabhupant avatar Feb 20 '24 04:02 prabhupant

A shameless plug, but I regularly update another Java OpenAI library: https://github.com/StefanBratanov/jvm-openai . It lets you specify the dimensions for embedding requests.

StefanBratanov avatar Feb 20 '24 08:02 StefanBratanov

If no one has taken up this issue yet, I am working on it and will submit a pull request shortly.

prabhupant avatar Feb 21 '24 06:02 prabhupant