Bugfix: pass filters as a dict with Arcee SDK

Open rachittshah opened this issue 1 year ago • 0 comments

When using Arcee with LangChain, filters need to be passed as dicts inside model_kwargs, as in the call below, which currently fails with the error that follows it:

arcee = Arcee(
    model="DALM-PubMed",
    model_kwargs={
        "size": 10,  # The number of documents to inform the generation
        "filters": [
            {
                "field_name": "document",
                "filter_type": "fuzzy_search",
                "value": "neuroscience"
            }
        ]
    }
)

TypeError: Object of type DALMFilter is not JSON serializable

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/langchain/llms/arcee.py in _call(self, prompt, stop, run_manager, **kwargs)
    145             return self._client.generate(prompt=prompt, **kwargs)
    146         except Exception as e:
--> 147             raise Exception(f"Failed to generate text: {e}") from e
Exception: Failed to generate text: Object of type DALMFilter is not JSON serializable

Possible issues:

This is likely because the DALMFilter class does not have a method to convert its instances to a JSON serializable format.
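A minimal sketch of that failure mode, assuming the request body is encoded with the standard `json` encoder (which is what `requests` does for a `json=` body). `StandInFilter` and `try_serialize` are hypothetical names used only for illustration; the stand-in is a plain class, not the real `DALMFilter`:

```python
import json


class StandInFilter:
    """Hypothetical stand-in for DALMFilter: a plain object with attributes."""

    def __init__(self, field_name: str, filter_type: str, value: str) -> None:
        self.field_name = field_name
        self.filter_type = filter_type
        self.value = value


def try_serialize(body: dict) -> str:
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(body)
    except TypeError as exc:
        return f"TypeError: {exc}"


filter_obj = StandInFilter("document", "fuzzy_search", "neuroscience")

# The object itself cannot be encoded; its attribute dict can.
as_object = {"size": 10, "filters": [filter_obj]}
as_dict = {"size": 10, "filters": [vars(filter_obj)]}

print(try_serialize(as_object))
print(try_serialize(as_dict))
```

The first call reports the same kind of `TypeError` as the traceback above, while the second succeeds because the body contains only plain dicts, lists, strings, and numbers.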

Possible fixes:

Add handling so that DALMFilter instances are converted to a JSON-serializable format (for example, a plain dict) before the request body is sent.

class DALMFilter(BaseModel):
    """Filters available for a dalm retrieve/generation query

    Arguments:
        field_name: The field to filter on. Can be 'document' or 'name' to filter on your document's raw text or title
            Any other field will be presumed to be a metadata field you included when uploading your context data
        filter_type: Currently 'fuzzy_search' and 'strict_search' are supported. More to come soon!
            'fuzzy_search' means a fuzzy search on the provided field will be performed. The exact string doesn't
            need to exist in the document for this to find a match. Very useful for scanning a document for some
            keyword terms
            'strict_search' means that the exact string must appear in the provided field. This is NOT an exact eq
            filter. ie a document with content "the happy dog crossed the street" will match on a strict_search of "dog"
            but won't match on "the dog". Python equivalent of `return search_string in full_string`
        value: The actual value to search for in the context data/metadata
    """

    field_name: str
    filter_type: FilterType
    value: str
    _is_metadata: bool = False

The issue seems to stem mainly from how we handle requests in make_request, which passes the body straight to requests as JSON:

def make_request(
    request: Literal["post", "get"],
    route: Union[str, Route],
    body: Optional[Dict[str, Any]] = None,
    params: Optional[Dict[str, Any]] = None,
    headers: Optional[Dict[str, Any]] = None,
) -> Dict[str, str]:
    """Makes the request"""
    headers = headers or {}
    internal_headers = {"X-Token": f"{config.ARCEE_API_KEY}", "Content-Type": "application/json"}
    headers.update(**internal_headers)
    url = f"{config.ARCEE_API_URL}/{config.ARCEE_API_VERSION}/{route}"

    req_type = getattr(requests, request)
    response = req_type(url, json=body, params=params, headers=headers)
    if response.status_code not in (200, 201):
        raise Exception(f"Failed to make request. Response: {response.text}")
    return response.json()
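One possible direction, sketched under assumptions rather than taken from the SDK: serialize the body up front with a `default` hook that converts model-like objects to dicts. `encode_body`, `_to_jsonable`, `StandInFilter`, and `StandInFilterType` are hypothetical names; the hook assumes the filter object exposes pydantic's `model_dump()` (v2) or `dict()` (v1), and the stand-in class only mimics that interface:

```python
import json
from enum import Enum
from typing import Any, Dict, Optional


def _to_jsonable(obj: Any) -> Any:
    """Fallback for json.dumps: convert enums and model-like objects."""
    if isinstance(obj, Enum):
        return obj.value
    for method_name in ("model_dump", "dict"):  # pydantic v2 name, then v1 name
        method = getattr(obj, method_name, None)
        if callable(method):
            return method()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def encode_body(body: Optional[Dict[str, Any]]) -> Optional[str]:
    """Hypothetical helper: serialize the request body before sending it."""
    if body is None:
        return None
    return json.dumps(body, default=_to_jsonable)


class StandInFilterType(Enum):
    """Hypothetical stand-in for FilterType."""

    FUZZY_SEARCH = "fuzzy_search"


class StandInFilter:
    """Hypothetical stand-in exposing a pydantic-v1-style dict() method."""

    def __init__(self, field_name: str, filter_type: StandInFilterType, value: str) -> None:
        self.field_name = field_name
        self.filter_type = filter_type
        self.value = value

    def dict(self) -> Dict[str, Any]:
        return {
            "field_name": self.field_name,
            "filter_type": self.filter_type,
            "value": self.value,
        }


payload = encode_body(
    {
        "size": 10,
        "filters": [StandInFilter("document", StandInFilterType.FUZZY_SEARCH, "neuroscience")],
    }
)
print(payload)
```

In make_request, the resulting string could then be sent with `data=encode_body(body)` in place of `json=body`; the Content-Type header is already set to application/json there. Converting the filters to dicts at the call site would be an alternative that leaves make_request unchanged.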

rachittshah, Nov 20 '23 04:11