
Basic usage only shows a single link, but site is full of them

Open mmmint opened this issue 4 years ago • 3 comments

When I run the code below, I only get a single link, even though the website is full of links.

Visiting http://teenage.engineering

What is happening? Thanks for any hints.

package main

import (
	"fmt"
	"github.com/gocolly/colly"
)

func main() {

	c := colly.NewCollector()

	// Find and visit all links
	c.OnHTML("a", func(e *colly.HTMLElement) {
		e.Request.Visit(e.Attr("href"))
	})

	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL)
	})

	c.Visit("http://teenage.engineering")
}

mmmint avatar Apr 13 '20 01:04 mmmint

The HTML content is inside a noscript tag, and for some reason the HTML parsing library doesn't handle it. This needs further investigation before I can fix it. Thanks for reporting.
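A possible workaround, which this thread does not confirm, is to strip the noscript wrapper tags from the response body before it is parsed. According to the Go issue referenced in this thread, the parser treats noscript content as raw text, so removing the wrappers should let the anchors inside be parsed as ordinary elements. The sketch below uses only the standard library. `unwrapNoscript` is a hypothetical helper name, and wiring it into colly (for example by reassigning `r.Body` in an `OnResponse` callback) is an assumption that is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// noscriptTag matches opening and closing <noscript> tags, case-insensitively.
// It removes only the tags themselves, not the markup between them.
var noscriptTag = regexp.MustCompile(`(?i)</?noscript\b[^>]*>`)

// unwrapNoscript is a hypothetical helper: it returns a copy of body with the
// <noscript> wrapper tags removed, so the markup they enclosed sits directly
// in the document.
func unwrapNoscript(body []byte) []byte {
	return noscriptTag.ReplaceAll(body, nil)
}

func main() {
	// Made-up sample page; the second link is only present inside <noscript>.
	page := []byte(`<body><a href="/">home</a><noscript><a href="/products">products</a></noscript></body>`)
	fmt.Println(string(unwrapNoscript(page)))
}
```

This is a crude text-level rewrite, not a parser fix. It also exposes markup that the page author meant only for script-less clients, which may or may not be what a crawler wants.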

asciimoo avatar Apr 17 '20 14:04 asciimoo

Some reference points:

  • https://github.com/gocolly/colly/issues/221
  • https://github.com/PuerkitoBio/goquery/issues/139
  • https://github.com/golang/go/issues/16318

mikestead avatar Jul 29 '21 00:07 mikestead

Is this bug still reproducible?

anthonygedeon avatar Oct 28 '21 06:10 anthonygedeon