langchain
Support Codex embeddings
In the current implementation, OpenAI embeddings are hard-coded to return only text embeddings via GPT-3. For example:
```python
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    ...
    responses = [
        self._embedding_func(text, engine=f"text-search-{self.model_name}-doc-001")
        for text in texts
    ]

def embed_query(self, text: str) -> List[float]:
    ...
    embedding = self._embedding_func(
        text, engine=f"text-search-{self.model_name}-query-001"
    )
```
However, recent literature on reasoning shows Codex to be more powerful than GPT-3 on reasoning tasks. OpenAIEmbeddings should be modified to support both text and code embeddings.
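One way this could look is to parameterize the engine prefix rather than hard-coding `text-search`. The sketch below is illustrative only, not the actual LangChain implementation: `EmbeddingsSketch`, its `_engine` helper, and the practice of returning engine names instead of calling the API are all stand-ins for demonstration. The engine-name pattern (`text-search-*-doc-001`, `code-search-*-code-001`, etc.) follows OpenAI's first-generation embedding engines.

```python
from typing import List


class EmbeddingsSketch:
    """Hypothetical sketch of an embeddings class supporting text and code engines."""

    def __init__(self, model_name: str = "babbage", embedding_type: str = "text"):
        if embedding_type not in ("text", "code"):
            raise ValueError("embedding_type must be 'text' or 'code'")
        self.model_name = model_name
        self.embedding_type = embedding_type

    def _engine(self, kind: str) -> str:
        # Builds names like "text-search-babbage-doc-001" or
        # "code-search-babbage-code-001".
        return f"{self.embedding_type}-search-{self.model_name}-{kind}-001"

    def embed_documents(self, texts: List[str]) -> List[str]:
        # Code-search engines embed code documents with the "-code-001" suffix;
        # text-search engines use "-doc-001". Returns engine names only, to
        # keep the sketch runnable without an API call.
        kind = "doc" if self.embedding_type == "text" else "code"
        return [self._engine(kind) for _ in texts]

    def embed_query(self, text: str) -> str:
        # Queries against code-search engines are natural language, hence the
        # "-text-001" suffix; text-search queries use "-query-001".
        kind = "query" if self.embedding_type == "text" else "text"
        return self._engine(kind)
```

With `embedding_type="code"`, a query resolves to `code-search-babbage-text-001` and documents to `code-search-babbage-code-001`, mirroring how the existing text-only code picks `-doc-001` versus `-query-001`.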
Probably worth deprecating now that Codex support is being killed.
Hi, @delip! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.
From what I understand, you opened this issue suggesting an update to OpenAIEmbeddings to support both text and code embeddings, as recent literature suggests that Codex is more powerful for reasoning tasks. However, there has been a comment from CalmDownKarm suggesting that the issue may be deprecated now that Codex support is being killed.

Based on this information, it seems that the issue has been resolved. The LangChain team has decided that the update to OpenAIEmbeddings is unnecessary due to the removal of Codex support.
If you believe that this issue is still relevant to the latest version of the LangChain repository, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project! Let us know if you have any further questions or concerns.