captcha_spider
A simple captcha crawler framework
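A typical captcha collection flow has three main steps:
- Pre-request: obtain the parameters the captcha needs
- Captcha request: fetch the captcha
- Verification request: check against the official site whether the captcha is correct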
Implement the concrete flow by subclassing the Project class, utils.Project:
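- def before_process() -> dict: returns a dict of the parameters needed by the other steps, accessible through self.before_params
- def captcha_process() -> bytes: returns the captcha image as bytes
- def feedback_process() -> bool: returns the verification feedback, i.e. whether the captcha was correct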
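The relationship between the three hooks can be pictured with a minimal, self-contained sketch. `ProjectSketch` below is a hypothetical stand-in for `utils.Project`, written only to show how the hooks fit together; the real base class, its constructor arguments, and the way it invokes the hooks may differ. `run_once`, `DemoSpider`, the fake token, and the fake image bytes are all placeholders, and no network request is made.

```python
from abc import ABC, abstractmethod


class ProjectSketch(ABC):
    """Hypothetical stand-in for utils.Project (not the real class)."""

    def __init__(self):
        # Filled by before_process() and read by the later hooks.
        self.before_params = {}

    @abstractmethod
    def before_process(self) -> dict: ...

    @abstractmethod
    def captcha_process(self) -> bytes: ...

    @abstractmethod
    def feedback_process(self) -> bool: ...

    def run_once(self) -> tuple:
        # Assumed order: pre-request, captcha request, verification request.
        self.before_params = self.before_process()
        image = self.captcha_process()
        return image, self.feedback_process()


class DemoSpider(ProjectSketch):
    def before_process(self) -> dict:
        # A real spider would request the page and extract tokens or cookies.
        return {"token": "fake-token"}

    def captcha_process(self) -> bytes:
        # A real spider would download the image using self.before_params.
        token = self.before_params["token"]
        return b"IMG:" + token.encode()

    def feedback_process(self) -> bool:
        # A real spider would submit the recognized text to the site.
        return self.before_params["token"] == "fake-token"


image, ok = DemoSpider().run_once()
print(image, ok)
```

In an actual spider each hook would issue the corresponding HTTP request; the sketch only fixes the data flow between them.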
If you fully trust the recognition results and do not plan to run the verification step, you can use the following instead:
from utils import Project, ServiceType, Charset

project = Project(
    captcha_length=4,
    captcha_charset=Charset.ALPHABET,
    service_type=ServiceType.Kerlomz,
    captcha_url="https://en.exmail.qq.com/cgi-bin/getverifyimage"
)
project.start(1000)
In the const.json file, fill in your own Lianzhong account, your Baidu API credentials, and the path where samples are saved:
{
    "baidu": {
        "app_id": "app_id",
        "api_key": "api_key",
        "secret_key": "secret_key"
    },
    "lianzhong": {
        "username": "username",
        "password": "password"
    },
    "target_dir": "D:/Samples"
}
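Before running a spider it can help to confirm that the configuration is complete. The following is a small sketch of such a check, not part of the framework: `check_const` and `REQUIRED_KEYS` are hypothetical names, and the sketch assumes every key shown above is required, which the framework itself may not enforce.

```python
import json

# Keys taken from the const.json layout shown above.
REQUIRED_KEYS = {
    "baidu": ("app_id", "api_key", "secret_key"),
    "lianzhong": ("username", "password"),
}


def check_const(config: dict) -> list:
    """Return the missing settings; an empty list means nothing is missing."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        block = config.get(section)
        if not isinstance(block, dict):
            missing.append(section)
            continue
        # Treat empty strings as missing values.
        missing.extend(f"{section}.{key}" for key in keys if not block.get(key))
    if not config.get("target_dir"):
        missing.append("target_dir")
    return missing


# Inline sample standing in for the contents of const.json.
sample = json.loads("""
{
    "baidu": {"app_id": "app_id", "api_key": "api_key", "secret_key": "secret_key"},
    "lianzhong": {"username": "username", "password": ""},
    "target_dir": "D:/Samples"
}
""")
print(check_const(sample))
```

To check the real file, the inline sample could be replaced with `json.load` on an opened const.json.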
Workflow:
- Fill in const.json
- Create your own spider under the spiders package, using demo.py as a reference
- Run it from app.py
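The framework runs the entire crawl and verification flow. When it is connected to the Lianzhong platform, a recognition error automatically triggers the error-report interface so that the points are refunded. The framework is intended for development and learning; do not use it for illegal purposes. The demo.py example does not target any particular website.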
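The crawl, verify, and report-on-error cycle described above can be pictured roughly as follows. This is only an illustrative sketch: `collect` and its four callables are hypothetical names, the framework's real loop and the Lianzhong error-report call are not shown in this README, and the stand-in functions below use fixed data instead of any network service.

```python
from typing import Callable


def collect(rounds: int,
            fetch: Callable[[], bytes],
            recognize: Callable[[bytes], str],
            verify: Callable[[str], bool],
            report_error: Callable[[bytes], None]) -> dict:
    """Run the fetch/recognize/verify cycle and report wrong answers."""
    stats = {"correct": 0, "reported": 0}
    for _ in range(rounds):
        image = fetch()
        text = recognize(image)
        if verify(text):
            stats["correct"] += 1
        else:
            # Assumed behaviour: a failed verification is reported back
            # to the recognition service so the points can be refunded.
            report_error(image)
            stats["reported"] += 1
    return stats


# Stand-ins: the "image" bytes simply contain the recognized text.
images = iter([b"abcd", b"wxyz", b"abcd"])
reported = []
stats = collect(
    rounds=3,
    fetch=lambda: next(images),
    recognize=lambda img: img.decode(),
    verify=lambda text: text == "abcd",
    report_error=reported.append,
)
print(stats, reported)
```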