
Elegant Scraper and Crawler Framework for Golang

Results 155 colly issues

```go
c := colly.NewCollector(colly.Async(true))
storage := &redisstorage.Storage{
	Client: cache.RDB,
	Prefix: "test",
}
err := c.SetStorage(storage)
if err != nil {
	panic(err)
}
if err := storage.Clear(); err != nil...
```

```
==================
WARNING: DATA RACE
Write at 0x00c0000984d8 by goroutine 31:
  github.com/gocolly/colly/v2.(*httpBackend).Do()
      github.com/gocolly/colly/[email protected]/http_backend.go:190 +0x96f
  github.com/gocolly/colly/v2.(*httpBackend).Cache()
      github.com/gocolly/colly/[email protected]/http_backend.go:134 +0xad
  github.com/gocolly/colly/v2.(*Collector).fetch()
      github.com/gocolly/colly/[email protected]/colly.go:675 +0x559

Previous read at 0x00c0000984d8 by goroutine 39:
  net/http.(*persistConn).readLoop()
      net/http/transport.go:2214 +0xd6b...
```

bug

Colly currently parses URLs with the `net/url` parser from the Go standard library. This parser is fairly simple and does not handle some of the quirks that browsers do. Since Colly is a web crawling...

bug
enhancement

Using `defer` to unlock mutexes is safer. If the code holding the mutex ever panics (e.g. a bug is introduced), `defer` will guarantee that it gets unlocked and Collector will...
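The point about `defer` can be sketched with a small standalone example. The `counter` type and the `safeIncr` helper are hypothetical and are not taken from Colly's source; they only show that a deferred `Unlock` still runs when the critical section panics, so a later caller is not blocked forever.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type guarding a value with a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

// incr locks the mutex and defers the unlock, so the mutex is released
// even when the body panics (simulated here with the fail flag).
func (c *counter) incr(fail bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if fail {
		panic("simulated bug while holding the lock")
	}
	c.n++
}

// safeIncr calls incr and reports whether a panic was recovered.
func safeIncr(c *counter, fail bool) (recovered bool) {
	defer func() {
		if r := recover(); r != nil {
			recovered = true
		}
	}()
	c.incr(fail)
	return false
}

func main() {
	c := &counter{}
	fmt.Println(safeIncr(c, true))  // true: the panic was recovered
	fmt.Println(safeIncr(c, false)) // false: the lock was free again
	fmt.Println(c.n)                // 1
}
```

Without the `defer`, the first call would leave the mutex locked and the second call would block indefinitely.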

When using Colly with Async turned off, the crawler works as expected - though slow as expected, about one item per second. However, enabling Async mode quickly turns into a...

Hi go-colly team, is there any way to scrape a website that requires logging in first with a GitHub account?

question

Question: I have to scrape 10+ different blogs for articles. I need to scrape fields like title, author, likes, content, etc., but each site would have different CSS selectors for the fields...

question
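One possible way to organize per-site selectors, offered as a sketch and not as an answer from the Colly maintainers, is a lookup table keyed by hostname. Everything here is hypothetical: the `SiteSelectors` type, the `registry` map, the `selectorsFor` function, the hostnames, and the selector strings.

```go
package main

import (
	"fmt"
	"net/url"
)

// SiteSelectors holds the CSS selectors for one blog (hypothetical type).
type SiteSelectors struct {
	Title, Author, Likes, Content string
}

// registry maps a hostname to its selectors. The hosts and selector
// strings are made up for illustration.
var registry = map[string]SiteSelectors{
	"blog-a.example": {Title: "h1.post-title", Author: ".byline a", Likes: ".likes-count", Content: "article .body"},
	"blog-b.example": {Title: "header h2", Author: "span.author", Likes: "button.like span", Content: "div.entry"},
}

// selectorsFor returns the selectors registered for the host of rawURL.
func selectorsFor(rawURL string) (SiteSelectors, bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return SiteSelectors{}, false
	}
	s, ok := registry[u.Hostname()]
	return s, ok
}

func main() {
	if s, ok := selectorsFor("https://blog-a.example/posts/1"); ok {
		fmt.Println("title selector:", s.Title)
	}
	if _, ok := selectorsFor("https://unknown.example/"); !ok {
		fmt.Println("no selectors registered for this host")
	}
}
```

The scraping code then stays the same for every site and only the table grows; each selector string could be handed to whatever HTML callback the crawler uses.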

I don't see an API for this besides `Collector.OnResponse`. Curious if it's possible to scrape `plain/text` files? My use case is scraping a bunch of hosted text files that _may_ have...

enhancement
question
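As a rough sketch of what handling a plain-text response could involve, the helpers below check a `Content-Type` header value and split a body into lines using only the standard library. The names `isPlainText` and `nonEmptyLines` are hypothetical; the assumption is that such helpers would be called from a response callback with the header value and the raw body bytes.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"mime"
	"strings"
)

// isPlainText reports whether a Content-Type header value denotes
// text/plain, ignoring parameters such as charset.
func isPlainText(contentType string) bool {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mediaType == "text/plain"
}

// nonEmptyLines splits body into trimmed lines and drops blank ones.
func nonEmptyLines(body []byte) []string {
	var lines []string
	sc := bufio.NewScanner(bytes.NewReader(body))
	for sc.Scan() {
		if line := strings.TrimSpace(sc.Text()); line != "" {
			lines = append(lines, line)
		}
	}
	return lines
}

func main() {
	body := []byte("first line\n\nsecond line\n")
	if isPlainText("text/plain; charset=utf-8") {
		for _, line := range nonEmptyLines(body) {
			fmt.Println(line)
		}
	}
}
```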

In the Instagram example, on line https://github.com/gocolly/colly/blob/master/_examples/instagram/instagram.go#L47, the struct tag contains a colon instead of a double quote.

bug
example