gain
The ``sciencenet_spider.py`` example does not (seem to) work for python 3.6
I copied the examples/sciencenet_spider.py example and tried to run it with Python 3.6, but it fails:
python sciencenet_spider.py
[2018:04:14 22:21:26] Spider started!
[2018:04:14 22:21:26] Using selector: KqueueSelector
[2018:04:14 22:21:26] Base url: http://blog.sciencenet.cn/
[2018:04:14 22:21:26] Item "Post": 0
[2018:04:14 22:21:26] Requests count: 0
[2018:04:14 22:21:26] Error count: 0
[2018:04:14 22:21:26] Time usage: 0:00:00.001127
[2018:04:14 22:21:26] Spider finished!
Traceback (most recent call last):
File "sciencenet_spider.py", line 19, in <module>
MySpider.run()
File "/Users/endafarrell/anaconda/anaconda3/lib/python3.6/site-packages/gain/spider.py", line 52, in run
loop.run_until_complete(cls.init_parse(semaphore))
File "/Users/endafarrell/anaconda/anaconda3/lib/python3.6/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "/Users/endafarrell/anaconda/anaconda3/lib/python3.6/site-packages/gain/spider.py", line 71, in init_parse
with aiohttp.ClientSession() as session:
File "/Users/endafarrell/anaconda/anaconda3/lib/python3.6/site-packages/aiohttp/client.py", line 746, in __enter__
raise TypeError("Use async with instead")
TypeError: Use async with instead
sys:1: RuntimeWarning: coroutine 'Parser.task' was never awaited
[2018:04:14 22:21:26] Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x105b07cf8>
My python is
python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 12:04:33)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
and I have:
pip list | grep gain
gain 0.1.4
I installed gain using:
pip install gain
Any ideas?
The same happens with Python 3.5:
python sciencenet_spider.py
[2018:04:14 22:32:58] Spider started!
[2018:04:14 22:32:58] Using selector: KqueueSelector
[2018:04:14 22:32:58] Base url: http://blog.sciencenet.cn/
[2018:04:14 22:32:58] Item "Post": 0
[2018:04:14 22:32:58] Requests count: 0
[2018:04:14 22:32:58] Error count: 0
[2018:04:14 22:32:58] Time usage: 0:00:00.001171
[2018:04:14 22:32:58] Spider finished!
Traceback (most recent call last):
File "sciencenet_spider.py", line 19, in <module>
MySpider.run()
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/site-packages/gain/spider.py", line 52, in run
loop.run_until_complete(cls.init_parse(semaphore))
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/asyncio/base_events.py", line 467, in run_until_complete
return future.result()
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/asyncio/futures.py", line 294, in result
raise self._exception
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/asyncio/tasks.py", line 240, in _step
result = coro.send(None)
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/site-packages/gain/spider.py", line 71, in init_parse
with aiohttp.ClientSession() as session:
File "/Users/endafarrell/anaconda/anaconda3/envs/py35/lib/python3.5/site-packages/aiohttp/client.py", line 746, in __enter__
raise TypeError("Use async with instead")
TypeError: Use async with instead
sys:1: RuntimeWarning: coroutine 'Parser.task' was never awaited
[2018:04:14 22:32:58] Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x107259e48>
where Python is:
python
Python 3.5.5 |Anaconda, Inc.| (default, Mar 12 2018, 16:25:05)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Check this line: https://github.com/gaojiuli/gain/commit/2c8160c92943837613a773f681fb190a8c434bb2#diff-6b9bdc895398e257e454fa60948dba08R69
Just clone the latest code. It seems the author hasn't released the latest version to PyPI yet. @gaojiuli
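For anyone wondering why the traceback says "Use async with instead": the session object only supports the asynchronous context-manager protocol, so a plain `with` inside a coroutine fails immediately. Here is a minimal sketch of that mechanism. `StandInSession` is a hypothetical stand-in written for illustration, not aiohttp's real class; it only copies the behaviour visible in the traceback. With real aiohttp the corrected line would read `async with aiohttp.ClientSession() as session:`, which is presumably what the linked commit changes.

```python
import asyncio


class StandInSession:
    """Hypothetical stand-in that mimics the behaviour seen in the traceback."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        # A plain `with` ends up here and is rejected.
        raise TypeError("Use async with instead")

    def __exit__(self, exc_type, exc, tb):
        pass

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # `async with` closes the session when the block exits.
        self.closed = True


async def broken():
    # Same shape as the failing line in spider.py (plain `with`).
    with StandInSession() as session:
        return session


async def fixed():
    # The asynchronous form that the error message asks for.
    async with StandInSession() as session:
        return session


def run(coro):
    """Run a coroutine on a fresh event loop (works on Python 3.5 and later)."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()
```

Calling `run(broken())` raises the same `TypeError` as in the report, while `run(fixed())` returns a session that has been closed cleanly on exit.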
Hi @solarhell - many thanks. After I cloned the latest code (and then added my local gain subdirectory to sys.path), the example works.
@gaojiuli - I'd love to know when PyPI is updated!
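In case it helps others, the sys.path workaround can be sketched roughly as below. The helper names `prefer_local_checkout` and `module_origin` are made up for this example, and the checkout path in the usage note is a placeholder to replace with wherever you cloned the repository.

```python
import importlib.util
import sys


def prefer_local_checkout(checkout_dir):
    """Put a local clone ahead of site-packages on the import path."""
    if checkout_dir not in sys.path:
        sys.path.insert(0, checkout_dir)


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None
```

Usage, before importing gain in the spider script: call `prefer_local_checkout("/path/to/your/gain-clone")` and then `print(module_origin("gain"))`. If the printed path still points into site-packages, the released 0.1.4 copy is the one being picked up.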
Same issue with Python 3.6:
[2019:01:30 10:04:19] Spider started!
[2019:01:30 10:04:19] Base url: http://blog.sciencenet.cn/
[2019:01:30 10:04:19] Item "Post": 0
[2019:01:30 10:04:19] Requests count: 0
[2019:01:30 10:04:19] Error count: 0
[2019:01:30 10:04:19] Time usage: 0:00:00.000988
[2019:01:30 10:04:19] Spider finished!
Traceback (most recent call last):
File "sciencenet_spider.py", line 19, in <module>
OS: Mac, Darwin Kernel Version 18.2.0
Python: Python 3.6.3 :: Anaconda custom (64-bit)
Installed gain via pip install gain
Any ideas?
Just install it via pip install -U -e git+https://github.com/gaojiuli/gain.git#egg=gain