pyspider
not implemented
What is it? How is it fixed?
self.__call__() not implemented!
[E 150611 13:35:41 base_handler:194] self.__call__() not implemented!
Traceback (most recent call last):
  File "/home/_/pyspider-package/pyspider/libs/base_handler.py", line 187, in run_task
    result = self.run_task(task, response)
  File "/home//pyspider-package/pyspider/libs/base_handler.py", line 159, in _run_task
    raise NotImplementedError("self.%s() not implemented!" % callback)
NotImplementedError: self.__call__() not implemented!
When no callback is specified in self.crawl, __call__ is used, and it is not implemented by default.
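To make the fallback concrete, here is a minimal sketch of the lookup. MiniHandler is a hypothetical stand-in, not pyspider's BaseHandler, and it assumes the callback name is carried in the task's process section; only the error message is taken from the traceback above.

```python
class MiniHandler:
    """Hypothetical stand-in for a handler; not pyspider's BaseHandler."""

    def run_task(self, task, response):
        # Assumption: the callback name travels in task['process']['callback'].
        process = task.get('process') or {}
        callback = process.get('callback', '__call__')
        function = getattr(self, callback, None)
        if not callable(function):
            raise NotImplementedError("self.%s() not implemented!" % callback)
        return function(response)

    def parse_page(self, response):
        return {'url': response['url']}


handler = MiniHandler()
response = {'url': 'http://example.com/'}

# With a callback name present, the named method runs.
ok = handler.run_task({'process': {'callback': 'parse_page'}}, response)

# With the process section missing, the lookup falls back to __call__ and fails.
try:
    handler.run_task({}, response)
    error = None
except NotImplementedError as exc:
    error = str(exc)
```

Under these assumptions, a task that arrives without its callback name produces the same message as a script that never passed a callback at all.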
I do pass the callback argument "parse_page": self.crawl(url=this_url, callback=self.parse_page). I don't understand it, but when the task retries, the status is good, and if I go to debug, I see the task succeed.
Where did you find this log? On the debug page or the task status page? I guess the script used in the processor is not the same the second time. Maybe some processor instance is out of something...
Yes, I found this on the page '/tasks?project=' in the web UI. The next instance succeeds.
Could you please paste the schedule and process section of the task?
There is a callback in the process section... I have no idea...
Is this fixed? I also have this problem. It appears randomly.
The bug is not fixed.
Could you reproduce this issue in demo.pyspider.org?
@binux I have a script that triggers the issue, but I'm not willing to make it public. Is there any way other than reproducing it on demo.pyspider.org?
@robinxb You can try to reproduce it without save.
@binux It must be saved and run for a while before this error occurs.
@robinxb Is the issue related to a certain website, or could you rewrite it for some other website?
@binux This website, with pyspider version 0.3.5.
@robinxb could you send it to me?
well, good catch...
The issue comes down to how the process fields get lost....
Hello, sometimes this is very annoying. I tried everything, flushing taskdb and restarting the scheduler, but kept hitting the same issue again and again. After some time it disappeared by itself. It is as if it lacks a resource it needs, or something got deleted earlier, or something like that... Can you try to fix this?
@volvofixthis It looks like it happens randomly, and I don't know what causes the issue. Steps that reproduce it would be best. Otherwise, adding some capture points may help.
Just add something like

    if not task.get('process'):
        print('no-process-error', task)

in

https://github.com/binux/pyspider/blob/master/pyspider/scheduler/scheduler.py#L979
https://github.com/binux/pyspider/blob/master/pyspider/fetcher/tornado_fetcher.py#L725
https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103

then collect the logs and grep for no-process-error to see which component is causing the problem.
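One way to keep the three capture points consistent is a small shared helper. The sketch below is only an illustration: check_process and its component argument are hypothetical names, not part of pyspider, and the component label is an addition to the suggested print so that a grep also shows where the task lost its process section.

```python
def check_process(task, component):
    """Hypothetical capture point: report tasks that lack a 'process' section.

    'component' is a free-form label such as 'scheduler', 'fetcher' or
    'processor'; it is not a pyspider concept, just a tag for the log line.
    """
    if not task.get('process'):
        # Keep the marker string unchanged so `grep no-process-error` finds it.
        print('no-process-error', component, task)
        return False
    return True


# A task with its process section intact passes silently.
good = check_process({'taskid': 'a', 'process': {'callback': 'parse_page'}},
                     'scheduler')

# A task that lost its process section is reported.
bad = check_process({'taskid': 'b'}, 'processor')
```

Calling it once in each component and comparing which label shows up first in the logs should narrow down where the field disappears.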
Has the bug been fixed now?
I have now hit the bug as well. It can occur when many requests join the queue.
Same as above. Is this bug fixed now?
The error occurred at the following address: https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103
What should we do next? @binux
I hit the same problem. Is it fixed?
I encountered the same problem.
Reason: self.crawl must be passed a valid callback parameter; otherwise it raises this error:
raise NotImplementedError("self.%s() not implemented!" % callback)
NotImplementedError: self.__call__() not implemented!
Solution: change from

    def on_start(self):
        self.crawl("http://www.baidu.com")

to:

    def on_start(self):
        self.crawl("http://www.baidu.com", callback=self.getBaiduCallback)

    def getBaiduCallback(self, response):
        respUrl = response.url
        print("respUrl=%s" % respUrl)
        print("response=%s" % response)