Python
Lagou has added anti-scraping measures
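I think handling it like this should work:

    import requests

    # headers, data and url (the POST endpoint) are assumed to be defined elsewhere in the script
    s = requests.Session()
    urls = 'https://www.lagou.com/jobs/list_Python?labelWords=&fromSearch=true&suginput='
    # Visit the search page first to obtain its cookies
    s.get(urls, headers=headers, timeout=3)
    # Cookies obtained from that request
    cookie = s.cookies
    # Fetch the JSON response using the same session and cookies
    result = s.post(url, data=data, headers=headers, cookies=cookie, timeout=5).json()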
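That still doesn't seem to work. This is what Lagou returns for the scraped page: {"status":false,"msg":"您操作太频繁,请稍后再访问","clientIp":"39.187.59.216","state":2402} (the msg reads "You are making requests too frequently, please try again later"). I'm stumped.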
Try rotating IPs and adding a delay between requests.

(Sent by email in reply to 吴康惠's comment of Wednesday, March 4, 2020, 10:05 PM on injetlee/Python issue #18, "Lagou has added anti-scraping measures".)
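The "IPs plus delay" suggestion could be sketched roughly as follows. This is only an illustration, not a tested fix for Lagou: `is_rate_limited`, `fetch_with_backoff` and the `send` callable are hypothetical names, and treating `"status": false` together with `"state": 2402` as the throttling signal is an assumption based solely on the response quoted in this thread.

```python
import itertools
import random
import time


def is_rate_limited(payload):
    """Return True if the payload looks like the throttling reply quoted in this thread.

    Assumption: throttled replies carry "status": false and "state": 2402.
    """
    return payload.get("status") is False and payload.get("state") == 2402


def fetch_with_backoff(send, proxies, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send(proxy) until the reply is no longer rate limited.

    send    -- hypothetical callable: takes a proxy URL (or None) and returns parsed JSON (a dict)
    proxies -- proxy URLs to rotate through; pass [None] for a direct connection
    """
    proxy_cycle = itertools.cycle(proxies)
    for attempt in range(max_attempts):
        payload = send(next(proxy_cycle))
        if not is_rate_limited(payload):
            return payload
        if attempt < max_attempts - 1:
            # Exponential backoff plus up to one second of jitter,
            # so retries are not evenly spaced.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)
```

With the session from the earlier snippet, `send` could be something like `lambda proxy: s.post(url, data=data, headers=headers, proxies={"https": proxy} if proxy else None, timeout=5).json()`. The proxy addresses themselves would have to come from your own pool.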