price_monitor

Still reports "list index out of range" when the item is out of stock

Open HonorWater opened this issue 5 years ago • 5 comments

The log is as follows:

2019-11-27 17:05:07-urllib3.connectionpool-Starting new HTTPS connection (1): www.amazon.cn:443
2019-11-27 17:05:07-urllib3.connectionpool-https://www.amazon.cn:443 "GET /dp/B07X4V2M3B HTTP/1.1" 200 2310
2019-11-27 17:05:07-root-list index out of range
Traceback (most recent call last):
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 127, in monitor_amazon_china()
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 75, in monitor_amazon_china
    title = (selector.xpath('//span[@id="productTitle"]/text()')[0]).strip()
IndexError: list index out of range
2019-11-27 17:05:07-root-Something wrong,retry count:0
2019-11-27 17:05:07-urllib3.connectionpool-Resetting dropped connection: qyapi.weixin.qq.com
2019-11-27 17:05:07-urllib3.connectionpool-https://qyapi.weixin.qq.com:443 "POST /cgi-bin/message/send?access_token=XXXXX HTTP/1.1" 200 44
2019-11-27 17:05:07-root-send_text_msg response:{"errcode":0,"errmsg":"ok","invaliduser":""}
2019-11-27 17:06:07-urllib3.connectionpool-Starting new HTTPS connection (1): www.amazon.cn:443
2019-11-27 17:06:08-urllib3.connectionpool-https://www.amazon.cn:443 "GET /dp/B07X4V2M3B HTTP/1.1" 200 2305
2019-11-27 17:06:08-root-list index out of range
Traceback (most recent call last):
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 127, in monitor_amazon_china()
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 75, in monitor_amazon_china
    title = (selector.xpath('//span[@id="productTitle"]/text()')[0]).strip()
IndexError: list index out of range
2019-11-27 17:06:08-root-Something wrong,retry count:1
2019-11-27 17:06:08-urllib3.connectionpool-https://qyapi.weixin.qq.com:443 "POST /cgi-bin/message/send?access_token=XXXXX HTTP/1.1" 200 44
2019-11-27 17:06:08-root-send_text_msg response:{"errcode":0,"errmsg":"ok","invaliduser":""}
2019-11-27 17:07:08-urllib3.connectionpool-Starting new HTTPS connection (1): www.amazon.cn:443
2019-11-27 17:07:08-urllib3.connectionpool-https://www.amazon.cn:443 "GET /dp/B07X4V2M3B HTTP/1.1" 200 2308
2019-11-27 17:07:08-root-list index out of range
Traceback (most recent call last):
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 127, in monitor_amazon_china()
  File "D:\Tools\price_monitor-master\amazon_china_price_monitor.py", line 75, in monitor_amazon_china
    title = (selector.xpath('//span[@id="productTitle"]/text()')[0]).strip()
IndexError: list index out of range
2019-11-27 17:07:08-root-Something wrong,retry count:2
2019-11-27 17:07:08-urllib3.connectionpool-https://qyapi.weixin.qq.com:443 "POST /cgi-bin/message/send?access_token=XXXXX HTTP/1.1" 200 44
2019-11-27 17:07:08-root-send_text_msg response:{"errcode":0,"errmsg":"ok","invaliduser":""}
2019-11-27 17:07:08-urllib3.connectionpool-https://qyapi.weixin.qq.com:443 "POST /cgi-bin/message/send?access_token=XXXXX HTTP/1.1" 200 44
2019-11-27 17:07:08-root-send_text_msg response:{"errcode":0,"errmsg":"ok","invaliduser":""}

Nov 27 '19 09:11 HonorWater

This time it looks like the index went out of range while fetching the product title. @HonorWater
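For illustration, a minimal sketch of guarding that lookup so an empty XPath result does not raise. The helper name `first_text` is hypothetical and is not part of price_monitor; it only assumes that `selector.xpath(...)` returns a list of strings, as the traceback suggests.

```python
from typing import Optional, Sequence


def first_text(results: Sequence[str], default: Optional[str] = None) -> Optional[str]:
    """Return the first XPath text result, stripped, or `default` if nothing matched."""
    # An XPath query that matches nothing yields an empty list,
    # so indexing [0] directly is what raises IndexError.
    if not results:
        return default
    return results[0].strip()


# Hypothetical usage inside monitor_amazon_china() (not the project's actual code):
#   title = first_text(selector.xpath('//span[@id="productTitle"]/text()'))
#   if title is None:
#       logging.warning("productTitle not found; the response may not be a product page")
#       return
```

Returning a default instead of raising lets the caller decide whether a missing title should count as a retryable failure or as a different kind of response.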

Dec 02 '19 01:12 jackleeforce

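In the end I confirmed that the requests were blocked by Amazon's anti-scraping protection. The index error occurred because a CAPTCHA verification page was returned instead of the product page. After I entered the CAPTCHA manually in a browser, the calls from Python now return empty results.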
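A rough sketch of how such a response could be told apart from a real product page before parsing. The function name and the marker strings are assumptions for illustration; the actual blocked response should be inspected to see what it contains.

```python
# Marker strings are assumptions for illustration only; check the body of a
# blocked response to find out what the verification page really contains.
CAPTCHA_MARKERS = ("validateCaptcha", "captcha")


def looks_like_captcha_page(html: str) -> bool:
    """Heuristic: the page has no productTitle element but has a CAPTCHA marker."""
    lowered = html.lower()
    has_title = 'id="producttitle"' in lowered
    has_marker = any(marker.lower() in lowered for marker in CAPTCHA_MARKERS)
    return (not has_title) and has_marker


# Hypothetical usage, where `response` is the object returned by the HTTP client:
#   if looks_like_captcha_page(response.text):
#       logging.warning("Blocked by a verification page; skipping this round")
```

A check like this would let the monitor report "blocked" rather than a generic "Something wrong", which makes the cause easier to spot in the log.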

Dec 06 '19 01:12 HonorWater

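> In the end I confirmed that the requests were blocked by Amazon's anti-scraping protection. The index error occurred because a CAPTCHA verification page was returned instead of the product page. After I entered the CAPTCHA manually in a browser, the calls from Python now return empty results.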

What polling frequency did you set?

Dec 06 '19 06:12 jackleeforce

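I used your default settings without any changes. I am running on an Alibaba Cloud server, and Amazon has probably flagged Alibaba's IP ranges, so the blocking is strict.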

Dec 06 '19 06:12 HonorWater

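> I used your default settings without any changes. I am running on an Alibaba Cloud server, and Amazon has probably flagged Alibaba's IP ranges, so the blocking is strict.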

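That is probably the cause. I run it on GCP and have never been blocked, although the price it returns sometimes differs from what the page shows when accessed from China.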

Dec 09 '19 02:12 jackleeforce