Basic usage only shows a single link, but site is full of them
When I run the code below, only a single link is printed, even though the website is full of links.
Visiting http://teenage.engineering
What is happening? Thanks for any hints.
package main

import (
	"fmt"

	"github.com/gocolly/colly"
)

func main() {
	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	c.Visit("http://teenage.engineering")
}
The HTML content is inside a noscript tag, and the HTML parsing library doesn't seem to handle it. I need to investigate further to fix this issue. Thanks for reporting.
Some reference points
- https://github.com/gocolly/colly/issues/221
- https://github.com/PuerkitoBio/goquery/issues/139
- https://github.com/golang/go/issues/16318
Is this bug still reproducible?