colly issues

Limit (or get) number of active requests

1

Hi there. In my case I have 500k URLs that I'm going to crawl with gocolly. Is there a way to limit the number of active goroutines somehow? I call `.Visit`...

vryazanov

question

can't get the data of response.Request.Body too

1

Hello, I also read [445](https://github.com/gocolly/colly/issues/445) and [438](https://github.com/gocolly/colly/issues/438), but I can't read the Request.Body string. I also tried to put it in the Context, but that does not work. I looked up a lot of information; the 'ioutil.ReadAll' is...

zerokeeper

question

Basic usage only shows a single link, but site is full of them

3

While trying the code below, I only get a single link, but the website is full of links. `Visiting http://teenage.engineering` What is happening? Thanks for any hints. ``` package main import...

mmmint

bug

Extract JS Code (Not execute)

5

I'm attempting to extract/locate JavaScript code within an HTML page; whilst Colly is not a headless browser and hence JS execution is not a feature, I don't actually need to...

pdavis156879

context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2

Hello, I use go-colly to crawl through a website with pagination. I set async to true and parallelism to 5. I also set the timeout to 1 minute. Though...

ru90

Amazon Captcha Catches My Scraper

2

I built a scraper for Amazon product titles, but Amazon's captcha catches it. I tried 10 times with `go run main.go` (8 times it caught me, 2 times I scraped...

melissa9090

Modularize code / test it

Hello, first, great package! I'm trying to make a scraper to get some kind of images from various websites. It works great; I basically have a CLI and a go...

kinoute

upgrade Cascadia version

The new Cascadia version includes:
- Case-insensitive CSS attribute selectors with `i` (https://github.com/andybalholm/cascadia/pull/51)
- Support for many pseudo-classes (https://github.com/andybalholm/cascadia/issues/50)

Goquery has recently upgraded Cascadia too.

kinoute

How Can I log-in Amazon with Golang Colly

I am trying to log in to my Amazon buyer account to get tracking info. I got a WordPress/WooCommerce login working and retrieved the info, but I could not do the same for Amazon. ``` package main import...

melissa9090

Add Gopher support?

2

Using https://github.com/prologic/go-gopher

prologic
