
Researcher using links from previous calls

Open joeyDIGs opened this issue 7 months ago • 4 comments

Hi, we're having an issue where subsequent gpt_researcher runs show "spillover" of links/sources from previous runs/invocations.

Context

We're currently wrapping the gpt_researcher like so:

    from gpt_researcher import GPTResearcher

    # REPORT_TYPE is our own module-level default report type, defined elsewhere.
    async def get_report(sources: list, research_query_for_context_report: str, report_type: str = None) -> str:
        if not report_type:
            report_type = REPORT_TYPE
        # A new GPTResearcher instance is constructed on every call.
        researcher = GPTResearcher(query=research_query_for_context_report, report_type=report_type, source_urls=sources)
        await researcher.conduct_research()
        report = await researcher.write_report()
        return report

To call it, we build a pandas DataFrame with a column holding the sources for each row, something like this:

PK  sources
1   [url1.com, url2.com, ...]
2   [abc1.com, abc2.com, ...]

The issue shows up when we invoke our wrapper once per row. In theory, each invocation is independent of the others:

    # the gist is something like this (get_report is async, so each call is run to completion)
    import asyncio
    import pandas as pd
    workload_df = pd.DataFrame(data)
    workload_df['gpt_researcher_report'] = workload_df.apply(
        lambda x: asyncio.run(get_report(x.sources, "our hardcoded query")), axis=1)

Issue

The report for PK2 includes links/sources from PK1, both in the citation links at the bottom and in the body of the generated report, which obviously produces erroneous results.
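
One way to confirm the spillover row by row is to compare the links that appear in a report against that row's own sources. The following is a minimal sketch, assuming each report is a plain string with `http(s)` links and that matching on host name is good enough. `find_foreign_links` and `_host` are hypothetical helpers written for this illustration, not part of gpt_researcher.

```python
import re
from urllib.parse import urlparse

# Matches http(s) URLs up to whitespace or a closing bracket/quote.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def _host(url: str) -> str:
    """Return the lowercased host, tolerating a missing scheme and a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def find_foreign_links(report: str, sources: list[str]) -> list[str]:
    """List URLs found in `report` whose host is not among the hosts of `sources`."""
    allowed = {_host(s) for s in sources}
    # Strip trailing sentence punctuation that the regex may have captured.
    found = [u.rstrip(".,;") for u in URL_PATTERN.findall(report)]
    return [u for u in found if _host(u) not in allowed]
```

Running this for every row, with `report` taken from the `gpt_researcher_report` column and `sources` from the same row, would show which rows cite links from outside their own source list.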

Ask

Can someone in this wonderful community please help us understand where, if anywhere, this issue may be coming from?

  • Are there local caches being used?
  • Which objects could cause links from one invocation to show up in another?
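
To make the second question concrete, here is a generic Python illustration of how state can carry over between objects that look independent. It is only a sketch of the kind of pattern being asked about, not a claim about gpt_researcher's internals. `LeakyCollector` and `IsolatedCollector` are hypothetical names.

```python
class LeakyCollector:
    # Hypothetical class. The list below is a class attribute,
    # so every instance appends to the same object.
    visited_urls = []

    def add(self, url):
        self.visited_urls.append(url)


class IsolatedCollector:
    # Hypothetical class. Each instance gets its own list in __init__.
    def __init__(self):
        self.visited_urls = []

    def add(self, url):
        self.visited_urls.append(url)


# Two "independent" runs with the leaky variant share one list.
first, second = LeakyCollector(), LeakyCollector()
first.add("url1.com")
second.add("abc1.com")
leaky_result = list(second.visited_urls)

# The isolated variant keeps each run's URLs separate.
a, b = IsolatedCollector(), IsolatedCollector()
a.add("url1.com")
b.add("abc1.com")
isolated_result = list(b.visited_urls)
```

Module-level caches and mutable default arguments behave the same way as the class attribute in the leaky variant, so those would be the first places to look.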

joeyDIGs, Jul 16 '24 14:07