gitbook2pdf
Grab the contents of the gitbook document and convert it to pdf
Hi, unfortunately gitbook2pdf has stopped working: for example, it can't save `https://kubernetes.feisky.xyz`, which is the example shown in the documentation.

**output:**

```
python3 gitbook.py http://kubernetes.feisky.xyz/
crawl : all done!
Generating pdf,please wait patiently
Fontconfig...
```
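I'm on Windows 10 with 64-bit Python 3.9. First, weasyprint can now be installed directly with pip, but to run properly it also needs the GTK+ library; see the [installation instructions](https://doc.courtbouillon.org/weasyprint/stable/install.html#windows) and [#721](https://github.com/Kozea/WeasyPrint/issues/721). Then, when installing requirements.txt with pip, three libraries produced errors: cffi, urllib3, and requests. The urllib3 and requests errors were probably dependency conflicts, so I simply upgraded both to their latest versions, which resolved them. cffi requires Microsoft Visual C++ 14.0. I didn't understand this error, because my laptop already had the Visual Studio 2019 build tools installed. Reconfiguring a Microsoft Visual C++ 14.0 environment is a hassle I didn't want to deal with, so I installed an unofficial cffi binary instead, downloaded from here: [cffi](https://www.lfd.uci.edu/~gohlke/pythonlibs/#cffi). I downloaded cffi-1.14.6-cp39-cp39-win_amd64;...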
I noticed that the dependencies recently bumped urllib3 from 1.25.3 to 1.26.5, but during the build requests 2.22.0 turns out to be incompatible with 1.26.x. The error message:

```
#5 280.8 ERROR: Cannot install -r /app/requirements.txt (line 18) and urllib3==1.26.5 because these package versions have conflicting...
```
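The conflict can be illustrated with a small version-range check. This is only a sketch: the bounds below (`>=1.21.1` and `<1.26`) are an assumption about what requests 2.22.0 declares for urllib3, not something quoted in the report, and `parse_version` and `satisfies` are hypothetical helpers that only handle plain numeric versions.

```python
def parse_version(text):
    """Turn '1.26.5' into (1, 26, 5) for simple numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower, upper):
    """Return True when lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Assumed constraint (not taken from the issue): requests 2.22.0 accepts
# urllib3 >= 1.21.1 and < 1.26.
ASSUMED_LOWER = "1.21.1"
ASSUMED_UPPER = "1.26"

for candidate in ("1.25.3", "1.26.5"):
    # Under the assumed range, the old pin passes and the new pin does not.
    print(candidate, satisfies(candidate, ASSUMED_LOWER, ASSUMED_UPPER))
```

If the assumed range is right, the options are to pin urllib3 back below 1.26 or to move requests to a release that accepts 1.26.x.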
While crawling the book https://hit-scir.gitbooks.io/neural-networks-and-deep-learning-zh_cn/content/, the other pages are processed normally, but one page raises an error and aborts the run.

``` Shell
done : https://hit-scir.gitbooks.io/neural-networks-and-deep-learning-zh_cn/content/chap3/c3s0.html
Traceback (most recent call last):
  File "gitbook.py", line 5, in <module>
    Gitbook2PDF(url).run()
  File "/Users/cxjh168/Downloads/gitbook2pdf-master/gitbook2pdf/gitbook2pdf.py", line 198, in run
    loop.run_until_complete(self.crawl_main_content(content_urls))
  File "/Users/cxjh168/anaconda3/lib/python3.7/asyncio/base_events.py", line 584,...
```
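I ran it with the Docker image, and the font is Microsoft YaHei. The original site http://shouce.jb51.net/kali-linux-tutorial/ is on the left and the PDF output is on the right. Problem: the line spacing in the PDF is too tight, so the text is crammed together. Which part of the CSS can I change to adjust the line spacing? ![image](https://user-images.githubusercontent.com/16809751/119755699-01a78780-bed5-11eb-87d7-f27be21faef3.png)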
Traceback (most recent call last):
  File "gitbook.py", line 5, in <module>
    Gitbook2PDF(url).run()
  File "/mnt/c/Apps/Ubuntu/code/gitbook2pdf/gitbook2pdf/gitbook2pdf.py", line 198, in run
    loop.run_until_complete(self.crawl_main_content(content_urls))
  File "/mnt/c/Apps/Ubuntu/apps/anaconda3/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/mnt/c/Apps/Ubuntu/code/gitbook2pdf/gitbook2pdf/gitbook2pdf.py", line 220,...
``` bash
# ...
done : https://book.flutterchina.club/chapter1/
done : https://book.flutterchina.club/chapter12/ios_implement.html
done : https://book.flutterchina.club/chapter12/android_implement.html
done : https://book.flutterchina.club/chapter10/
done : https://book.flutterchina.club/chapter4/stack.html
done : https://book.flutterchina.club/chapter13/multi_languages_support.html
Traceback (most recent call last):
  File "gitbook.py", line...
```
```
  File "~/gitbook2pdf/gitbook2pdf/gitbook2pdf.py", line 105, in parser
    return html.unescape(ET.tostring(context).decode())
  File "src/lxml/etree.pyx", line 3437, in lxml.etree.tostring
  File "src/lxml/serializer.pxi", line 139, in lxml.etree._tostring
  File "src/lxml/serializer.pxi", line 199, in lxml.etree._raiseSerialisationError
lxml.etree.SerialisationError: IO_ENCODER
```
When I convert [The Go Programming Language](https://book.itsfun.top/gopl-zh/) to PDF, the output file is truncated after section 5.2. The reason is that it uses `html.unescape()` to convert escape characters into...
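A minimal sketch of how calling `html.unescape()` on already-serialized HTML can change the document structure. It assumes the truncation comes from escaped text turning back into live markup, which may not be the full explanation in the report. The page fragment and the `TagCounter` and `start_tags` helpers are hypothetical and use only the standard library, not gitbook2pdf's own code.

```python
import html
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Records start tags so the parsed structure can be compared."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def start_tags(markup):
    """Return the list of start tags a parser sees in the given markup."""
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    return counter.start_tags


# Hypothetical fragment: a code sample whose angle brackets are escaped.
serialized = "<pre>fmt.Println(&lt;b&gt;x)</pre>"
unescaped = html.unescape(serialized)

print(start_tags(serialized))  # ['pre']
print(start_tags(unescaped))   # ['pre', 'b'] -- the escaped text became a real tag
```

Once escaped text is read back as tags, a renderer may treat the rest of the page as part of an unclosed element, which is one plausible way for output to stop early. Under that assumption, leaving the serialized string escaped and letting the PDF renderer decode entities would avoid the problem.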