
long_text attribute for retriever passages

Open andreapiso opened this issue 1 year ago • 11 comments

Hi, I am creating a custom retriever, inheriting from dspy.Retrieve and overriding the __init__ and forward methods.

Right now I am receiving an error on "long_text" not being present in the passages that my retriever is generating:

File ~/miniconda3/lib/python3.10/site-packages/dsp/primitives/search.py:10, in <listcomp>(.0)
      8     raise AssertionError("No RM is loaded.")
      9 passages = dsp.settings.rm(query, k=k, **kwargs)
---> 10 passages = [psg.long_text for psg in passages]
     12 if dsp.settings.reranker:
     13     passages_cs_scores = dsp.settings.reranker(query, passages)

AttributeError: 'str' object has no attribute 'long_text'

I am confused because I assumed that passages would be a list of strings, but that does not seem to be the case. However, when I look at the Pinecone retriever, which I am using as a reference to implement mine, it does not appear to use the "long_text" field either.

https://github.com/stanfordnlp/dspy/blob/main/dspy/retrieve/pinecone_rm.py

What am I missing?

andreapiso avatar Oct 14 '23 06:10 andreapiso

I am encountering the same.

wicusverhoef avatar Oct 19 '23 13:10 wicusverhoef

Same here

jamesliu avatar Oct 23 '23 06:10 jamesliu

@okhat Could you help answer this question? Thanks a lot!

cyyeh avatar Jan 26 '24 08:01 cyyeh

Just had the same issue. This fixed it:

wiki_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=ollama_model, rm=wiki_abstracts)

vaaale avatar Feb 17 '24 16:02 vaaale

Facing the same issue

ac-sagarmathpal avatar Feb 26 '24 03:02 ac-sagarmathpal

I was able to resolve this by adding

if hasattr(passages, 'passages'):
    passages = passages.passages

after this line https://github.com/stanfordnlp/dspy/blob/42a5943379d28d1673dc8fe332a3d596efdfc7a3/dsp/primitives/search.py#L12

It seems that we are getting a prediction object as the return of passages = dsp.settings.rm(query, k=k, **kwargs)
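A minimal sketch of that unwrapping idea, written as a standalone helper rather than a patch to search.py. The name extract_long_texts is hypothetical, and SimpleNamespace only stands in for a Prediction-like object with a .passages attribute; nothing here is DSPy's own API.

```python
from types import SimpleNamespace


def extract_long_texts(result):
    """Hypothetical helper: normalise a retriever result to a list of strings."""
    # Unwrap a Prediction-like object that exposes `.passages`.
    if hasattr(result, "passages"):
        result = result.passages

    texts = []
    for psg in result:
        if isinstance(psg, str):
            # Plain string passages are used as-is.
            texts.append(psg)
        elif isinstance(psg, dict):
            # Dict-style passages carry the text under "long_text".
            texts.append(psg["long_text"])
        else:
            # Anything else is assumed to expose a `long_text` attribute.
            texts.append(psg.long_text)
    return texts


# SimpleNamespace is only a stand-in for a Prediction-like object.
wrapped = SimpleNamespace(passages=["foo", "bar"])
from_prediction = extract_long_texts(wrapped)
from_dicts = extract_long_texts([{"long_text": "foo"}, {"long_text": "bar"}])
```

Both calls yield the same list of strings, so the caller no longer depends on which shape the retriever returned.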

koshyviv avatar Feb 28 '24 11:02 koshyviv

I don't know why, but retrieve.py expects a list of dictionaries from the custom retriever. This works for me:

    def forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None) -> dspy.Prediction:
        context = ['foo', 'bar']
        return [dotdict({"long_text": passage}) for passage in context]
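For readers without DSPy at hand, the wrapping step can be tried in isolation. This is a sketch only: the dotdict class below is a minimal stand-in written for this example (not the library's own implementation), and wrap_passages is a hypothetical helper name.

```python
class dotdict(dict):
    """Stand-in for a dict whose keys are also readable as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as exc:
            # Raise AttributeError so hasattr() and getattr() behave normally.
            raise AttributeError(key) from exc


def wrap_passages(texts):
    """Hypothetical helper: wrap each string so that `.long_text` resolves."""
    return [dotdict({"long_text": text}) for text in texts]


passages = wrap_passages(["foo", "bar"])

# The same attribute access as the list comprehension in the traceback above.
texts = [psg.long_text for psg in passages]
```

With the passages wrapped this way, the psg.long_text lookup that raised the AttributeError succeeds.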

fabiannagel avatar Mar 08 '24 13:03 fabiannagel

I think there are a few parts important for this to work:

  • dsp.search.retrieve: Calls the RetrieveModel and expects a list of objects with the key 'long_text'
  • dspy.retrieve.Retrieve.forward: The return type is Prediction, which contains a key 'passages'
  • Example RetrieveModels like Weaviate return the list of objects as expected by the retrieve function, but they should return Prediction.

A change is needed to use the Prediction object in the retrieve function, and all Retrievers should return this object. Or we must change the return type.

I did not check what other functions use this forward method. From what I have seen so far, we do not want to change the interface, so improving the Retrievers and the retrieve function feels more logical.

jettro avatar Mar 11 '24 13:03 jettro

Is there an abstract class or object for this passage in the DSPy framework? ;) It would be nice, because when we implement a custom retriever we could get the expected signature from the framework. What I mean is, if the framework (DSPy) works with a Passage API like this:

class Passage(ABC):
    long_text: str

then this should be defined here, and external tools and frameworks can interface with it over an adapter if necessary ;) In my case I would just wire it :)

Anyway, thanks for the info... I think I can fix this issue on my end.
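One way such a contract could look, purely as a sketch: DSPy is not known from this thread to ship a Passage type, so PassageLike, Passage, and adapt_record are all hypothetical names, and the "text" key of the external record is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class PassageLike(Protocol):
    """Hypothetical contract: anything exposing a `long_text` string."""

    long_text: str


@dataclass
class Passage:
    """Hypothetical concrete passage satisfying PassageLike."""

    long_text: str


def adapt_record(record: dict, text_key: str = "text") -> Passage:
    """Adapter from an external tool's record (assumed dict) to a Passage."""
    return Passage(long_text=record[text_key])


adapted = adapt_record({"text": "hello", "score": 0.9})
is_passage = isinstance(adapted, PassageLike)
is_plain_str_passage = isinstance("hello", PassageLike)
```

A structural Protocol is used here instead of an ABC so that existing objects with a long_text attribute would satisfy the contract without inheriting from anything; an ABC as proposed above would work just as well if explicit subclassing is preferred.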

fireking77 avatar Apr 11 '24 17:04 fireking77

I just ran into this as well. The examples the docs give for a custom RM lead to this exact issue.

ethanniser avatar Jul 08 '24 21:07 ethanniser