renrenBackup
json.decoder.JSONDecodeError
Describe the bug
I ran the following command to back up my personal content:
python manage.py fetch -p *** -e *** -s -g -a -b
The earlier downloads mostly went fine, but once it reached the following point it failed with an error:
fetch album 311698393 2008.18 (), 评0/分0/赞0
Traceback (most recent call last):
File "manage.py", line 158, in <module>
cli()
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "manage.py", line 53, in fetch
fetched = fetch_user(
File "/Users/Kinggerm/Downloads/renrenBackup/fetch.py", line 99, in fetch_user
fetch_album(uid)
File "/Users/Kinggerm/Downloads/renrenBackup/fetch.py", line 71, in fetch_album
album_count = crawl_album.get_albums(uid)
File "/Users/Kinggerm/Downloads/renrenBackup/crawl/album.py", line 163, in get_albums
count, after = get_album_list_page(uid, after)
File "/Users/Kinggerm/Downloads/renrenBackup/crawl/album.py", line 153, in get_album_list_page
get_album_summary(aid, uid)
File "/Users/Kinggerm/Downloads/renrenBackup/crawl/album.py", line 73, in get_album_summary
album_data = crawler.get_json(
File "/Users/Kinggerm/Downloads/renrenBackup/crawl/crawler.py", line 178, in get_json
r = json.loads(resp.text.replace(",}", "}"))
File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
To Reproduce
Re-running the original command reproduces the error at the same point, but I would rather not share my account and password.
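@Kinggerm Renren currently offers no convenient way to view other people's albums, so it is hard to pinpoint the problem precisely.
My blind guesses: first, the album name contains special characters that break parsing; second, the JSON Renren used to return was not in standard format, so the code applies a replacement, and the original album name (or album description) happens to hit that pitfall: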
File "/Users/Kinggerm/Downloads/renrenBackup/crawl/crawler.py", line 178, in get_json
r = json.loads(resp.text.replace(",}", "}"))
I suggest editing the name and description of the album 2008.18 to remove special characters and punctuation; the fetch should then be able to continue.
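To illustrate the second guess, here is a minimal sketch of how the blanket replacement quoted above can change a string value that happens to contain `,}`, next to a string-aware alternative. The helper name `strip_trailing_commas` and the sample payload are hypothetical and not part of renrenBackup; whether this is what actually triggered the reported error has not been established.

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Drop a comma that directly precedes '}' only when outside a JSON string.

    Hypothetical helper for illustration; not part of renrenBackup.
    """
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped, skip checks
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote of the string
            continue
        if ch == '"':
            in_string = True
        elif ch == "}" and out and out[-1] == ",":
            out.pop()                # remove the trailing comma before '}'
        out.append(ch)
    return "".join(out)


# Made-up payload: a trailing comma plus a name that contains ",}".
raw = '{"name": "a,}b", "count": 1,}'

blanket = json.loads(raw.replace(",}", "}"))
careful = json.loads(strip_trailing_commas(raw))

print(blanket["name"])  # the blanket replace has altered the name
print(careful["name"])  # the string-aware version keeps it intact
```

Like the original replacement, this sketch only handles a comma immediately followed by `}`, with no whitespace in between.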
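It seems Renren no longer allows any changes? When I log in on the web I can only browse old content, and I can no longer find the mobile app.
Also, many thanks to the developers of this repo for still working on behalf of late backup-makers like me. Besides feeling nostalgic about Renren, I am really touched!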
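@Kinggerm As mentioned in #65, if the album is public, other people can still fetch it. You could share your uid so that others can help check; your account and password are not needed. The error output you posted contains only the album id and no uid, so it is hard to re-run the test.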
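@Kinggerm I tried the following, but I still need more information from you to locate the problem and then fix it or work around it.
- View the album directly through the web
Visiting http://www.renren.com/album/{album_id} with the 311698393 from your error output reports that I have no permission.
- Modify the code to fetch the album directly
Fetching with the 311698393 from your error output returns this json: {"errorCode":2010500,"errorMsg":":( cause: java.lang.NullPointerException","server_time":1660186496528}
so the album cannot be parsed normally.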
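The response quoted above is a JSON object that carries an error code instead of album data. A minimal sketch of detecting such a payload before reading album fields from it could look like the following. The helper `raise_on_api_error` is hypothetical, and treating any object with an `errorCode` key as a failure is an assumption based only on this one sample.

```python
import json


def raise_on_api_error(payload: dict) -> dict:
    """Raise if the decoded response looks like an error object.

    Hypothetical helper; assumes error responses carry an "errorCode" key,
    as in the single sample quoted in this thread.
    """
    if "errorCode" in payload:
        raise RuntimeError(
            "server returned error {code}: {msg}".format(
                code=payload["errorCode"],
                msg=payload.get("errorMsg", ""),
            )
        )
    return payload


# The response text quoted in this thread.
sample = (
    '{"errorCode":2010500,'
    '"errorMsg":":( cause: java.lang.NullPointerException",'
    '"server_time":1660186496528}'
)

try:
    raise_on_api_error(json.loads(sample))
except RuntimeError as exc:
    print(exc)
```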
You can modify the code yourself: in crawl/crawler.py, add this statement after line 174:
logger.info("get json: {r}".format(r=resp.text))
Then run it again and post the context around the error.
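A slightly more defensive variant of the same idea logs the start of the raw body only when decoding fails. This is a sketch under assumptions: `loads_with_context` and its `limit` parameter are hypothetical names, and the real `get_json` in crawl/crawler.py may be structured differently.

```python
import json
import logging

logger = logging.getLogger(__name__)


def loads_with_context(text: str, limit: int = 200):
    """Decode JSON as in the traceback, logging the raw body on failure.

    Hypothetical helper; `limit` caps how much of the body is logged.
    """
    try:
        return json.loads(text.replace(",}", "}"))
    except json.JSONDecodeError as exc:
        logger.error("invalid JSON (%s); body starts with: %r", exc, text[:limit])
        raise


# An empty body is one input that produces the message from the report,
# because no JSON value starts at position 0.
try:
    loads_with_context("")
except json.JSONDecodeError as exc:
    print(exc)
```

The logged prefix shows whether the server sent an empty body, an HTML page, or something else that is not JSON.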
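Since nobody else can reproduce this issue to test a fix, and @Kinggerm has not provided any follow-up information, this issue is closed for now. It can be reopened if there are updates.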