[CON-228] "url" context provider

Open sestinj opened this issue 1 year ago • 3 comments

Validations

  • [X] I believe this is a way to improve. I'll try to join the Continue Discord for questions
  • [X] I'm not able to find an open issue that requests the same enhancement

Problem

The docs context provider crawls an entire site and indexes multiple pages, but oftentimes you only want a single page.

Solution

No response

sestinj avatar Apr 20 '24 01:04 sestinj

Is the HTTP context provider what you are looking for? https://github.com/continuedev/continue/blob/main/core/context/providers/HttpContextProvider.ts

oldluke92 avatar Apr 29 '24 15:04 oldluke92

@sestinj I think single-page-crawl should probably just be an option to set from within the AddDocsDialog? (rather than a whole new context provider)

I also see the case for a slider or input box to specify the max number of pages to crawl. It doesn't look like there is a max-pages limit currently: I tried with a Wikipedia page, and the crawl did indeed run indefinitely.

What do you think?

Update: I've just seen the url context provider description in the docs: "Type 'url' and input a URL, then Continue will convert it to a markdown document to pass to the model." That's different from indexing and saving it (what I inferred from this issue description)

I still think the addition of a limit/depth to the docs provider is a good idea!

Update 2: Ah I see you've already completed the url context provider here! Lemme know if a 'max number of pages' or 'max depth' parameter is something to add to the AddDocsDialog.

justinmilner1 avatar May 05 '24 23:05 justinmilner1

@justinmilner1 I think this would be nice. Max depth seems like the right thing; otherwise it's hard to prioritize one page over another when they are at the same depth.

Let's add this option to SiteIndexingConfig. Then, instead of doing the silly thing that I did and passing a freely defined object over the webviewProtocol for index/addDocs (https://github.com/continuedev/continue/blob/preview/core/protocol.ts#L65-L66), let's pass a SiteIndexingConfig. Probably just use the URL input by the user as both the starting and root URL so the UI remains simple.
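
For illustration, here's a rough sketch of what a max-depth option could look like. Everything in it is an assumption: `SiteIndexingConfigSketch`, its field names, `configFromUserUrl`, and the injected `getLinks` function are hypothetical and are not the actual types or crawler in the repo. It only shows the idea of using the user's URL as both start and root, and stopping the crawl at a given depth.

```typescript
// Hypothetical shape; the field names are assumptions, not the real SiteIndexingConfig.
interface SiteIndexingConfigSketch {
  startUrl: string; // page where the crawl begins
  rootUrl: string; // only links under this prefix are followed
  maxDepth: number; // 0 means "index only startUrl"
}

// Use the single URL typed by the user as both the start and the root.
function configFromUserUrl(url: string, maxDepth: number = 0): SiteIndexingConfigSketch {
  return { startUrl: url, rootUrl: url, maxDepth };
}

// Breadth-first crawl that does not follow links from pages at maxDepth.
// getLinks is injected (hypothetical) so the sketch runs without network access;
// a real crawler would fetch pages asynchronously.
function crawl(
  config: SiteIndexingConfigSketch,
  getLinks: (url: string) => string[],
): string[] {
  const visited = new Set<string>([config.startUrl]);
  const queue: Array<{ url: string; depth: number }> = [
    { url: config.startUrl, depth: 0 },
  ];
  const pages: string[] = [];

  while (queue.length > 0) {
    const { url, depth } = queue.shift()!;
    pages.push(url);
    if (depth >= config.maxDepth) {
      continue; // depth limit reached: keep the page, skip its links
    }
    for (const link of getLinks(url)) {
      if (link.startsWith(config.rootUrl) && !visited.has(link)) {
        visited.add(link);
        queue.push({ url: link, depth: depth + 1 });
      }
    }
  }
  return pages;
}
```

With `maxDepth` set to 0 this degenerates to the single-page case from the original problem, so one option would cover both requests.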

sestinj avatar May 10 '24 21:05 sestinj