gpt-crawler
Trying to crawl a site, nothing is working
Hello there. I am trying to crawl this site: https://help.puzzlebot.top
Here is my config file:
import { Config } from "./src/config";

export const defaultConfig: Config = {
  url: "https://help.puzzlebot.top",
  match: "https://help.puzzlebot.top/article**",
  maxPagesToCrawl: 300,
  outputFileName: "output.json",
  maxTokens: 2000000,
};
However, it only crawls the site's name. What should I do?
Thank you
This is because Playwright, by default, looks for anchor tags to identify other links to follow. The website you mentioned does not use anchor (<a>) tags to link to other pages; it uses event handlers to navigate to them instead.
In short, it is a shortcoming of the underlying crawler, not of gpt-crawler itself.
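To illustrate the point, here is a minimal sketch of why link discovery based on anchor tags comes back empty on a page that navigates through event handlers. The function extractAnchorLinks and both HTML snippets are hypothetical and made up for this example. This is a simplified regex-based stand-in, not the actual code used by Playwright or gpt-crawler.

```typescript
// Hypothetical, simplified stand-in for anchor-based link discovery.
// It collects the href value of every <a> tag found in an HTML string.
function extractAnchorLinks(html: string): string[] {
  const links: string[] = [];
  // Matches <a ... href="..."> or <a ... href='...'> and captures the URL.
  const anchorHref = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/gi;
  let m: RegExpExecArray | null;
  while ((m = anchorHref.exec(html)) !== null) {
    links.push(m[1]);
  }
  return links;
}

// A page that links with ordinary anchor tags (made-up markup).
const withAnchors =
  `<ul><li><a href="/article/1">One</a></li><li><a href="/article/2">Two</a></li></ul>`;

// A page that navigates through click handlers instead (made-up markup).
const withHandlers =
  `<ul><li><div onclick="goTo('/article/1')">One</div></li></ul>`;

console.log(extractAnchorLinks(withAnchors));  // two links found
console.log(extractAnchorLinks(withHandlers)); // empty array: nothing to follow
```

With the second kind of markup there are no href values to enqueue, so the crawl stops after the start page even though the match pattern in the config is fine.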