patchwork
Enhance embeddings
Added more language files.
@CTY-git can we add a flag that will re-embed the repository? At the moment, if the embedding already exists and we switch the embedding model, it fails with an error. The only way to allow re-embedding is to manually delete the chroma DB file from the ~/. folder. In such cases a flag would be useful: it would simply delete the existing embedding and re-embed the repo. WDYT?
The code changes update the prompt.json file with detailed instructions for resolving code issues, add support for new programming languages, and add an option to disable caching in code repository embeddings. The typed.py file now imports 'NotRequired' instead of 'TypedDict' and adds a 'disable_cache' field to the GenerateCodeRepositoryEmbeddingsInputs class; NotRequired is also imported for the GenerateEmbeddingsInputs class, which gains a 'disable_cache' attribute with a default value. In GenerateEmbeddings.py, a new function `delete_collection` was added to delete collections, and the parameters `chunk_size` and `overlap_size` are now retrieved from the inputs dictionary for the `split_text` function.
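As a rough illustration of the optional flag described above, the sketch below declares a `disable_cache` field with `NotRequired` and reads it with a fallback. The class name `EmbeddingInputsSketch`, the `base_path` field, the helper `should_rebuild`, and the fallback value `False` are assumptions made for the example; they are not taken from the diff.

```python
try:
    from typing import NotRequired, TypedDict  # available in Python 3.11+
except ImportError:
    from typing_extensions import NotRequired, TypedDict


class EmbeddingInputsSketch(TypedDict):
    # Hypothetical required field, present only to make the example complete.
    base_path: str
    # Optional flag: callers may omit it entirely.
    disable_cache: NotRequired[bool]


def should_rebuild(inputs: EmbeddingInputsSketch) -> bool:
    # Assumed fallback: keep existing embeddings unless the flag is set.
    return bool(inputs.get("disable_cache", False))
```

With this shape, existing callers that never pass `disable_cache` keep their current behavior, and only callers that set it to true would trigger a rebuild.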
- File changed: patchwork/patchflows/ResolveIssue/ResolveIssue.py
This diff updates the "response_partitions" key in the inputs dictionary to include specific partition elements for a patch, such as "Fixed Code:" and a newline character, among other delimiters. The change prepares the prompt for resolving an issue within the code.
- File changed: patchwork/patchflows/ResolveIssue/prompt.json The diff in the prompt.json file updates the system role content by providing more detailed instructions for generating a fix based on user-provided code snippets, clarifying the response format and adding a condition to return "<NO FIX POSSIBLE>" if the bug cannot be resolved by modifying the given code.
- File changed: patchwork/steps/GenerateCodeRepositoryEmbeddings/GenerateCodeRepositoryEmbeddings.py The diff adds file extensions for additional programming languages (e.g., OCaml, F#, Haskell) to those accepted for code repository embeddings generation. It also adds an option that lets users disable caching during the process.
- File changed: patchwork/steps/GenerateCodeRepositoryEmbeddings/typed.py The code in the file typed.py was modified to import 'NotRequired' from typing_extensions instead of 'TypedDict'. Additionally, a new field 'disable_cache' of type NotRequired[bool] was added to the class GenerateCodeRepositoryEmbeddingsInputs.
- File changed: patchwork/steps/GenerateEmbeddings/GenerateEmbeddings.py
A new function `delete_collection` has been added to the GenerateEmbeddings.py file to delete a collection based on a given collection name. Additionally, the `chunk_size` and `overlap_size` parameters are now retrieved from the `inputs` dictionary to be used in the `split_text` function.
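
The behavior summarized above can be sketched roughly as follows. This is not the code from the diff: a plain dictionary stands in for the vector store, `embed_document` is a hypothetical driver, the default values for `chunk_size` and `overlap_size` are placeholders, and the bodies of `split_text` and `delete_collection` are one plausible reading of the description.

```python
from typing import Dict, List


def split_text(text: str, chunk_size: int, overlap_size: int) -> List[str]:
    """Split text into chunks of at most chunk_size characters,
    each sharing overlap_size characters with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_size < chunk_size:
        raise ValueError("overlap_size must be in [0, chunk_size)")
    step = chunk_size - overlap_size
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def delete_collection(store: Dict[str, List[str]], collection_name: str) -> bool:
    """Remove a collection from the stand-in store; report whether it existed."""
    return store.pop(collection_name, None) is not None


def embed_document(inputs: dict, store: Dict[str, List[str]]) -> List[str]:
    # Placeholder defaults; the real step may use different values.
    chunk_size = inputs.get("chunk_size", 4000)
    overlap_size = inputs.get("overlap_size", 2000)
    name = inputs["collection_name"]
    # With disable_cache set, drop whatever was stored and rebuild it.
    if inputs.get("disable_cache", False):
        delete_collection(store, name)
    if name not in store:
        store[name] = split_text(inputs["text"], chunk_size, overlap_size)
    return store[name]
```

In this sketch a stale collection is reused unless `disable_cache` is set, which matches the re-embedding request raised in the review comment.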