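No automatic Grab on macOS
Hello,
After some compatibility changes to the code (pull request already submitted), macOS can now fetch data successfully when I click the Grab Now button under Actions.
However, automatic Grab does not seem to run: GrabResult is empty, Status is ON, and the Log is empty too. Where should I start investigating?
I am running a single standalone node.
Any pointers would be appreciated. Thanks!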
The Log has entries now:
2018-11-21 20:49:47,809 [1] INFO - 127.0.0.1:36000 feed scheduler starting
2018-11-21 20:49:47,819 [1] INFO - 127.0.0.1:36000 feed scheduler started
2018-11-21 20:49:47,821 [1] INFO - Start WebApiServer At http://127.0.0.1:36000 with STANDALONE node
2018-11-21 20:49:48,983 [4] INFO - 127.0.0.1:36000 add job with feed id 5
2018-11-21 20:49:48,996 [4] INFO - 127.0.0.1:36000 add job with feed id 3
2018-11-21 20:49:49,006 [4] INFO - 127.0.0.1:36000 add job with feed id 11
2018-11-21 20:49:49,013 [4] INFO - 127.0.0.1:36000 add job with feed id 1
2018-11-21 20:49:49,023 [4] INFO - 127.0.0.1:36000 add job with feed id 2
2018-11-21 20:49:49,032 [4] INFO - 127.0.0.1:36000 add job with feed id 4
2018-11-21 20:49:49,040 [4] INFO - 127.0.0.1:36000 add job with feed id 12
2018-11-21 20:49:49,040 [4] INFO - 127.0.0.1:36000 sync feed and add feed jobs:7
2018-11-21 20:49:49,042 [4] INFO - 127.0.0.1:36000 add extract job
2018-11-21 20:50:00,069 [14] INFO - feed job feed127.0.0.1:36000.3 add to feed crawl queue
2018-11-21 20:50:00,069 [13] INFO - feed job feed127.0.0.1:36000.12 add to feed crawl queue
2018-11-21 20:50:00,069 [15] INFO - feed job feed127.0.0.1:36000.2 add to feed crawl queue
2018-11-21 20:50:00,069 [5] INFO - feed job feed127.0.0.1:36000.11 add to feed crawl queue
2018-11-21 20:50:00,069 [9] INFO - feed job feed127.0.0.1:36000.1 add to feed crawl queue
2018-11-21 20:50:00,069 [4] INFO - feed job feed127.0.0.1:36000.5 add to feed crawl queue
2018-11-21 20:50:00,096 [4] INFO - feed job http://www.jiuxian.com/goods-55611.html?source=92 starting
2018-11-21 20:50:00,096 [15] INFO - feed job https://www.kuaidaili.com/free/inha/1/ starting
2018-11-21 20:50:00,096 [9] INFO - feed job https://www.oschina.net/blog starting
2018-11-21 20:50:00,096 [5] INFO - feed job http://www.ruijihg.com/爬虫 starting
2018-11-21 20:50:00,098 [4] INFO - do task -> request address http://www.jiuxian.com/goods-55611.html?source=92
2018-11-21 20:50:00,098 [15] INFO - do task -> request address https://www.kuaidaili.com/free/inha/1/
2018-11-21 20:50:00,098 [9] INFO - do task -> request address https://www.oschina.net/blog
2018-11-21 20:50:00,098 [5] INFO - do task -> request address http://www.ruijihg.com/爬虫
2018-11-21 20:50:00,098 [13] INFO - begin move delay feed
2018-11-21 20:50:00,102 [13] INFO - get snapshot feed count:0
2018-11-21 20:50:00,104 [13] INFO - feed job http://press.gapp.gov.cn:8088/press_search/pages/query/queryAction!findmediaPaging.action starting
2018-11-21 20:50:00,104 [13] INFO - do task -> request address http://press.gapp.gov.cn:8088/press_search/pages/query/queryAction!findmediaPaging.action
2018-11-21 20:50:00,169 [4] INFO - request http://www.jiuxian.com/goods-55611.html?source=92 response code is BadRequest
2018-11-21 20:50:01,052 [16] INFO - feed job http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=1 starting
2018-11-21 20:50:01,053 [16] INFO - do task -> request address http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=1
2018-11-21 20:50:04,157 [5] INFO - request http://www.ruijihg.com/爬虫 response code is OK
2018-11-21 20:50:04,170 [5] INFO - http://www.ruijihg.com/爬虫 response save to /Users/user1/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/1_636784302041708630.json
2018-11-21 20:50:04,262 [19] INFO - feed job http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=2 starting
2018-11-21 20:50:04,263 [19] INFO - do task -> request address http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=2
2018-11-21 20:50:04,461 [9] INFO - request https://www.oschina.net/blog response code is OK
2018-11-21 20:50:04,463 [9] INFO - https://www.oschina.net/blog response save to /Users/user/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/5_636784302044636140.json
2018-11-21 20:50:04,832 [15] INFO - request https://www.kuaidaili.com/free/inha/1/ response code is OK
2018-11-21 20:50:04,833 [15] INFO - https://www.kuaidaili.com/free/inha/1/ response save to /Users/user1/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/2_636784302048338500.json
2018-11-21 20:50:04,864 [16] INFO - request http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=1 response code is OK
2018-11-21 20:50:04,865 [16] INFO - http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=2 response save to /Users/user1/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/3_636784302048652700.json
2018-11-21 20:50:04,946 [19] INFO - request http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=2 response code is OK
2018-11-21 20:50:04,947 [19] INFO - http://app.cannews.com.cn/roll.php?do=query&callback=jsonp1475197217819&=1542804600157&date=2018-11-21&size=20&page=2 response save to /Users/user1/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/3_636784302049472390.json
2018-11-21 20:50:06,134 [13] INFO - request http://press.gapp.gov.cn:8088/press_search/pages/query/queryAction!findmediaPaging.action response code is OK
2018-11-21 20:50:06,143 [13] INFO - http://press.gapp.gov.cn:8088/press_search/pages/query/queryAction!findmediaPaging.action response save to /Users/user1/git/RuiJi.Net/RuiJi.Net.Cmd/bin/Debug/netcoreapp2.1/snapshot/11_636784302061430920.json
2018-11-21 20:51:07,090 [13] INFO - extract job http://www.cannews.com.cn/2018/1121/185471.shtml save result False
2018-11-21 20:51:07,090 [16] INFO - extract job http://www.cannews.com.cn/2018/1121/185469.shtml save result False
....
2018-11-21 20:52:00,005 [13] INFO - feed extract job execute
2018-11-21 20:52:00,006 [13] INFO - extract job started
2018-11-21 20:52:00,006 [13] INFO - begin move delay feed
2018-11-21 20:52:00,007 [9] INFO - get snapshot feed count:0
2018-11-21 20:53:00,004 [27] INFO - feed extract job execute
2018-11-21 20:53:00,005 [27] INFO - extract job started
2018-11-21 20:53:00,005 [27] INFO - begin move delay feed
2018-11-21 20:53:00,005 [13] INFO - get snapshot feed count:0
This is the Error Log. It contains no stack traces, so where should I start looking?
2018-11-21 19:45:00,037 [20] ERROR - https://www.oschina.net/blog response error is Specified value has invalid Control characters. Parameter name: value
2018-11-21 19:45:00,037 [17] ERROR - http://www.ruijihg.com/爬虫 response error is Specified value has invalid Control characters. Parameter name: value
2018-11-21 20:22:16,762 [39] ERROR - http://www.cannews.com.cn/2018/1121/185448.shtml response error is A task may only be disposed if it is in a completion state (RanToCompletion, Faulted or Canceled).
2018-11-21 20:32:15,484 [43] ERROR - http://www.cannews.com.cn/2018/1121/185460.shtml response error is A task may only be disposed if it is in a completion state (RanToCompletion, Faulted or Canceled).
2018-11-21 20:37:33,384 [15] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 response error is One or more errors occurred. (Failed to launch chrome! path to executable does not exist)
2018-11-21 20:45:00,070 [22] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 response error is One or more errors occurred. (Failed to launch chrome! path to executable does not exist)
2018-11-21 20:50:00,166 [4] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 response error is One or more errors occurred. (Failed to launch chrome! path to executable does not exist)
2018-11-21 20:55:00,020 [28] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 response error is One or more errors occurred. (Failed to launch chrome! path to executable does not exist)
2018-11-21 21:00:00,215 [24] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 response error is One or more errors occurred. (Failed to launch chrome! path to executable does not exist)
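Since the Error Log carries no stack traces, one way to decide where to start is to group the entries by message and see how many distinct failures there really are. Below is a minimal Python sketch of that triage. It assumes the `timestamp [thread] LEVEL - url response error is message` layout seen in the log above; the names `split_entries`, `count_error_messages` and `SAMPLE` are hypothetical helpers for illustration, not part of RuiJi.Net.

```python
import re
from collections import Counter

# Assumed entry prefix, e.g. "2018-11-21 20:50:00,166 [4] ERROR - "
ENTRY_START = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \[\d+\] (INFO|ERROR) - "
)
MARKER = " response error is "


def split_entries(text):
    """Split raw log text into (level, body) pairs.

    Works even when several entries were pasted onto a single line,
    because it cuts at every timestamp prefix rather than at newlines.
    """
    starts = list(ENTRY_START.finditer(text))
    entries = []
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        entries.append((match.group(1), text[match.end():end].strip()))
    return entries


def count_error_messages(text):
    """Count ERROR entries per distinct message, ignoring the URL part."""
    counts = Counter()
    for level, body in split_entries(text):
        if level == "ERROR" and MARKER in body:
            _url, message = body.split(MARKER, 1)
            counts[message] += 1
    return counts


# Three entries copied from the Error Log above, joined on one line.
SAMPLE = (
    "2018-11-21 19:45:00,037 [20] ERROR - https://www.oschina.net/blog response error is "
    "Specified value has invalid Control characters. Parameter name: value "
    "2018-11-21 20:37:33,384 [15] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 "
    "response error is One or more errors occurred. "
    "(Failed to launch chrome! path to executable does not exist) "
    "2018-11-21 20:45:00,070 [22] ERROR - http://www.jiuxian.com/goods-55611.html?source=92 "
    "response error is One or more errors occurred. "
    "(Failed to launch chrome! path to executable does not exist)"
)

if __name__ == "__main__":
    for message, count in count_error_messages(SAMPLE).most_common():
        print(count, message)
```

Run against the full Error Log, this kind of grouping would separate the three message families visible there (invalid control characters, task disposal, and the Chrome launch failure), so each can be looked at on its own.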
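Hello. We have no macOS device, so we cannot test on macOS. The error "Failed to launch chrome! path to executable does not exist" means the rule uses RunJs but the headless browser has not been set up.
If you need to run JavaScript on the page, you need to install the headless Chromium browser. Download it from https://pan.baidu.com/s/1rsyCNnXxbobCBLZuPTiJHQ (access code cr3e): pick the Chromium zip package that matches the operating system RuiJi.Net is deployed on, and extract the executable files into the chromium folder under the RuiJi.Net run root directory. RunJs will then work.
For how Chromium needs to be configured on macOS specifically, please consult the relevant documentation. The Linux solution is given below. On Linux, install the Chromium libraries:
yum install chromium-libs.x86_64
and give the chromium folder full permissions:
chmod -R 777 chromium
After these two steps, headless Chromium runs normally on Linux; this is offered for reference. See also the Chinese documentation: https://gitee.com/zhupingqi/RuiJi.Net/wikis/%E5%85%B6%E4%BB%96?sort_id=580719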
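Before digging further into the "path to executable does not exist" error, it may help to confirm that the chromium folder is actually where the reply says it should be and that something inside it is executable. The Python sketch below does only that. The thread does not state which executable name RuiJi.Net looks for, so the code just lists files with the execute bit set; `find_executables` and `describe` are hypothetical names, and the folder name `chromium` is taken from the reply above.

```python
import os
from pathlib import Path


def find_executables(run_root, folder_name="chromium"):
    """Return executable files under <run_root>/<folder_name>.

    Returns None when the folder itself is missing, and an empty list
    when the folder exists but holds no file with the execute bit set.
    """
    folder = Path(run_root) / folder_name
    if not folder.is_dir():
        return None
    return sorted(
        path
        for path in folder.rglob("*")
        if path.is_file() and os.access(path, os.X_OK)
    )


def describe(run_root):
    """Summarise the state of the chromium folder in one sentence."""
    found = find_executables(run_root)
    if found is None:
        return "chromium folder is missing"
    if not found:
        return "chromium folder exists but contains no executable files"
    return f"{len(found)} executable file(s) found"


if __name__ == "__main__":
    # Assumption: the script is started from the RuiJi.Net run root directory.
    print(describe("."))
```

If the folder exists but nothing in it is executable, that points at the permission step (`chmod -R 777 chromium`) rather than at the download.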
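Thanks for the reply! I tested with the data bundled with the project. It also includes feeds that do not need RunJs, and those produce no GrabResult either.
So for an error like this one:
2018-11-21 20:22:16,762 [39] ERROR - http://www.cannews.com.cn/2018/1121/185448.shtml response error is A task may only be disposed if it is in a completion state (RanToCompletion, Faulted or
where should I start looking?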
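Hello. This log entry indicates a response error; when I open the link, it turns out to be no longer valid. Please check that the Feed and Rule you want to extract are configured correctly. For reference, see the OSChina blog example with FeedId 5 on the test server: http://118.31.61.230:36000/#feed/feeds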