The proxy does not seem to mask my IP; the web page can still see my real IP

Open ZhengHui-Z opened this issue 2 years ago • 4 comments

Could someone please take a look? I can't get the masking to work: the web page still picks up my real IP!

Test code:

import requests


session = requests.Session()

res = session.get("http://demo.spiderpy.cn/get/")
ans = res.json()
print(ans.get("region", None), ans.get("proxy", None))
if ans.get("https"):
    proxies = {'https': "https://{}".format(ans.get("proxy"))}
else:
    proxies = {'http': "http://{}".format(ans.get("proxy"))}

session.proxies = proxies
header = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36 Edg/97.0.1072.76'
}

ip_res = session.get("http://mip.chinaz.com/", timeout=10, headers=header, verify=False)

print(ip_res.json().get("origin"))

ZhengHui-Z • Sep 08 '22 10:09

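You need to wrap one more layer of anonymous proxy in front, and then load-balance across the proxies in the pool that are still alive. Most proxies merely forward traffic, so the real IP is still visible in the proxy/XFF (X-Forwarded-For) header fields.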
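To make the point about the proxy/XFF fields concrete, here is a minimal sketch of how one might check whether a proxy leaks the real IP. It assumes you already have the request headers as the target server saw them (for example from a header-echo endpoint) and your own public IP fetched without a proxy. The function name `classify_proxy`, the header list, and the three labels are illustrative assumptions, not part of proxy_pool.

```python
import re

# Headers that forwarding proxies commonly add (illustrative, not exhaustive).
PROXY_HEADERS = ("x-forwarded-for", "x-real-ip", "via", "forwarded", "proxy-connection")


def classify_proxy(echoed_headers, real_ip):
    """Classify a proxy from the request headers the target server received.

    echoed_headers: dict of header name -> value, as seen by the server.
    real_ip: the client's own public IP, obtained without any proxy.
    """
    lowered = {name.lower(): str(value) for name, value in echoed_headers.items()}

    # Transparent: the real IP appears as a whole token in some header value.
    for value in lowered.values():
        tokens = re.split(r'[,\s;="]+', value)
        if real_ip in tokens:
            return "transparent"

    # Anonymous: the real IP is hidden, but the headers reveal that a proxy is in use.
    if any(name in lowered for name in PROXY_HEADERS):
        return "anonymous"

    # Elite (high anonymity): nothing in the headers hints at a proxy.
    return "elite"


# Example with documentation-range placeholder addresses.
seen_by_server = {"X-Forwarded-For": "203.0.113.7, 198.51.100.2", "Via": "1.1 proxy"}
print(classify_proxy(seen_by_server, "203.0.113.7"))
```

Only proxies that come back as "anonymous" or "elite" under a check like this would be worth keeping in the pool if the goal is to hide the real IP.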

xx-zhang • Sep 22 '22 05:09

Is there a test demo for this that you could share? Thanks very much!

ZhengHui-Z • Sep 22 '22 06:09

How would one implement the two-layer proxy?

Kouuh • Oct 30 '22 15:10

import warnings

import requests
import urllib3.contrib.pyopenssl

warnings.filterwarnings("ignore")
urllib3.contrib.pyopenssl.inject_into_urllib3()

res = requests.get("http://demo.spiderpy.cn/get/")
ans = res.json()
print(ans.get("region", None), ans.get("proxy", None))
if ans.get("https"):
    proxies = {'https': "https://{}".format(ans.get("proxy"))}
else:
    proxies = {'http': "http://{}".format(ans.get("proxy"))}

header = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Mobile Safari/537.36 Edg/97.0.1072.76'
}

# Assigning to requests.proxies has no effect, so pass the proxies with the call itself.
ip_res = requests.get("http://mip.chinaz.com/", timeout=5, headers=header, proxies=proxies, verify=False)

print(ip_res.json().get("origin"))

Do it like this: don't use a single session.
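One possible cause worth ruling out: requests chooses a proxy by the scheme of the target URL, so a `proxies` dict that only has an `'https'` key is not used for an `http://` request, and that request goes out directly. The sketch below is a simplified stdlib model of that lookup, not the library's own code; `pick_proxy` is a hypothetical helper and the proxy addresses are placeholders.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Simplified model of scheme-keyed proxy lookup (hypothetical helper).

    Returns the proxy URL that would be used for `url`,
    or None when the request would go out directly.
    """
    parts = urlparse(url)
    # Most specific key first: per-host entry, then scheme, then catch-all.
    candidates = (
        f"{parts.scheme}://{parts.hostname}",
        parts.scheme,
        f"all://{parts.hostname}",
        "all",
    )
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None


target = "http://mip.chinaz.com/"
only_https = {"https": "https://203.0.113.10:8080"}  # placeholder address
both = {
    "http": "http://203.0.113.10:8080",
    "https": "http://203.0.113.10:8080",
}

print(pick_proxy(target, only_https))  # None: the http:// request bypasses the proxy
print(pick_proxy(target, both))        # http://203.0.113.10:8080
```

If this is what is happening, registering the proxy under both the `http` and `https` keys would route the request through it regardless of the target's scheme.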

dubochao • Feb 16 '23 09:02