not implemented

Open vortex14 opened this issue 9 years ago • 26 comments

What is it? How is it fixed?

self.__call__() not implemented!

    [E 150611 13:35:41 base_handler:194] self.__call__() not implemented!
    Traceback (most recent call last):
      File "/home/_/pyspider-package/pyspider/libs/base_handler.py", line 187, in run_task
        result = self._run_task(task, response)
      File "/home//pyspider-package/pyspider/libs/base_handler.py", line 159, in _run_task
        raise NotImplementedError("self.%s() not implemented!" % callback)
    NotImplementedError: self.__call__() not implemented!

vortex14 avatar Jun 11 '15 13:06 vortex14

When no callback is specified in self.crawl, __call__ is used, and it is not implemented by default.
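
A minimal sketch of that fallback, written as standalone Python rather than pyspider's actual code. It assumes the callback name is read from the task's process section and defaults to `__call__`; `SketchHandler`, its `run_task` method, and the task dicts are hypothetical stand-ins modeled on the error message in the traceback.

```python
class SketchHandler:
    """Hypothetical stand-in for a handler; not pyspider's BaseHandler."""

    def parse_page(self, response):
        return "parsed %s" % response

    def run_task(self, task, response):
        # Assumption: the callback name lives in task['process'] and
        # falls back to '__call__' when it is absent.
        callback = task.get("process", {}).get("callback", "__call__")
        if not hasattr(self, callback):
            raise NotImplementedError("self.%s() not implemented!" % callback)
        return getattr(self, callback)(response)


handler = SketchHandler()

# A task that names its callback is dispatched normally.
ok = handler.run_task({"process": {"callback": "parse_page"}}, "page-1")

# A task with no process section falls back to __call__, which this
# class does not define, so NotImplementedError is raised.
try:
    handler.run_task({}, "page-2")
    error = None
except NotImplementedError as exc:
    error = str(exc)
```

Under this assumption, the same error would also appear if a task reached the processor without its process section, even though self.crawl was given a callback.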

binux avatar Jun 11 '15 13:06 binux

I do pass a callback, "parse_page": self.crawl(url=this_url, callback=self.parse_page). I don't understand it: when the task retries, its status is good, and if I run it in debug, the task succeeds.

vortex14 avatar Jun 11 '15 19:06 vortex14

Where did you find this log? On the debug page or the task status page? I guess the script the processor used the first time is not the same as the one it used the second time. Maybe some processor instance is out of something...

binux avatar Jun 12 '15 02:06 binux

[screenshot: error] Yes, I found this on the '/tasks?project=' page in the web UI. The next run succeeded.

vortex14 avatar Jun 15 '15 11:06 vortex14

Could you please paste the schedule and process sections of the task?

binux avatar Jun 15 '15 12:06 binux

[screenshot: eroor2]

vortex14 avatar Jun 15 '15 12:06 vortex14

There is a callback in the process section... I have no idea...

binux avatar Jun 15 '15 13:06 binux

Is this fixed? I also have this problem. It appears randomly.

robinxb avatar Sep 15 '15 11:09 robinxb

The bug is not fixed.

vortex14 avatar Sep 22 '15 15:09 vortex14

Could you reproduce this issue on demo.pyspider.org?

binux avatar Sep 23 '15 20:09 binux

@binux I have the script that triggers the issue, but I'm not willing to make it public. Is there any way other than reproducing it on demo.pyspider.org?

robinxb avatar Sep 24 '15 10:09 robinxb

@robinxb You can try to reproduce it without saving.

binux avatar Sep 24 '15 10:09 binux

@binux It must be saved and run for a while before this error occurs.

robinxb avatar Sep 24 '15 11:09 robinxb

@robinxb Is the issue related to a certain website, or could you rewrite the script for some other website?

binux avatar Sep 24 '15 11:09 binux

@binux It happens with this website, on pyspider version 0.3.5.

robinxb avatar Sep 24 '15 11:09 robinxb

@robinxb Could you send it to me?

binux avatar Sep 24 '15 19:09 binux

[screenshot: error]

vortex14 avatar Nov 30 '15 16:11 vortex14

Well, good catch... The question becomes how the process fields get lost...

binux avatar Nov 30 '15 16:11 binux

Hello, sometimes this is very annoying. I tried everything, flushing taskdb and restarting the scheduler, but kept hitting the same issue again and again. After some time it disappeared by itself. It is as if it lacks some resource it needs, as if something got deleted earlier or something like that... Can you try to fix this?

volvofixthis avatar Jun 28 '17 05:06 volvofixthis

@volvofixthis It looks like it happens randomly; I don't know what causes the issue. Steps to reproduce it would be best. Otherwise, adding some capture points may help.

Just add things like

if not task.get('process'):
    print('no-process-error', task)

in

https://github.com/binux/pyspider/blob/master/pyspider/scheduler/scheduler.py#L979
https://github.com/binux/pyspider/blob/master/pyspider/fetcher/tornado_fetcher.py#L725
https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103

Collect the logs and grep for no-process-error to see which component is causing the problem.
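
One way to make those capture points easier to tell apart in the logs is to tag each one with a component name. The helper below is a hypothetical sketch: `report_missing_process`, the logger name, and the component labels are not part of pyspider, and it only wraps the check suggested above.

```python
import logging

logger = logging.getLogger("no-process-debug")  # hypothetical logger name


def report_missing_process(task, component):
    """Log a greppable marker if the task has no 'process' section.

    Returns True when the section is missing, False otherwise.
    """
    if not task.get("process"):
        logger.error("no-process-error component=%s task=%r", component, task)
        return True
    return False


# Example calls with made-up tasks; in practice the call would sit at each
# capture point, with a different component label per call.
missing = report_missing_process(
    {"taskid": "t1", "url": "http://example.com/"}, "fetcher"
)
present = report_missing_process(
    {"taskid": "t2", "process": {"callback": "parse_page"}}, "processor"
)
```

With a label per capture point, the first component that logs the marker for a given task would be the likely place where the process section went missing.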

binux avatar Jul 01 '17 23:07 binux

Has the bug been fixed now?

tygzx avatar Aug 01 '17 03:08 tygzx

I have now hit this bug too. It can occur when many requests join the queue.

tygzx avatar Aug 01 '17 03:08 tygzx

Same as above. Is this bug fixed now?

mengnan254 avatar Jul 25 '18 02:07 mengnan254

The error occurred at the following address: https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103

What to do next? @binux

mengnan254 avatar Jul 25 '18 09:07 mengnan254

I hit the same problem. Is it fixed? [screenshot]

chiwah-keen avatar Oct 08 '18 06:10 chiwah-keen

I encountered the same problem.

Reason: self.crawl must be passed a valid callback parameter; otherwise this error is raised:

        raise NotImplementedError("self.%s() not implemented!" % callback)
    NotImplementedError: self.__call__() not implemented!

Solution: change from

    def on_start(self):
        self.crawl("http://www.baidu.com")

to:

    def on_start(self):
        self.crawl("http://www.baidu.com", callback=self.getBaiduCallback)

    def getBaiduCallback(self, response):
        respUrl = response.url
        print("respUrl=%s" % respUrl)
        print("response=%s" % response)

crifan avatar Mar 29 '19 03:03 crifan