pyspider not implemented

Open vortex14 opened this issue 9 years ago • 26 comments

What is this error? How can I fix it?

self.__call__() not implemented!

    [E 150611 13:35:41 base_handler:194] self.__call__() not implemented!
    Traceback (most recent call last):
      File "/home/_/pyspider-package/pyspider/libs/base_handler.py", line 187, in run_task
        result = self._run_task(task, response)
      File "/home//pyspider-package/pyspider/libs/base_handler.py", line 159, in _run_task
        raise NotImplementedError("self.%s() not implemented!" % callback)
    NotImplementedError: self.__call__() not implemented!

Jun 11 '15 13:06 vortex14

When no callback is specified in self.crawl, __call__ is used, and it is not implemented by default.
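The fallback can be pictured with a toy dispatcher. This is a simplified model, not pyspider's actual code: the `MiniHandler` class and its `run` method are hypothetical, and only the `process`/`callback` field names and the error message are taken from this thread.

```python
class MiniHandler:
    """Toy stand-in for a handler class (hypothetical, not pyspider's BaseHandler)."""

    def run(self, task):
        # Use the callback name stored in the task's 'process' section;
        # fall back to '__call__' when the section or the name is missing.
        callback = (task.get('process') or {}).get('callback', '__call__')
        func = getattr(self, callback, None)
        if not callable(func):
            # A plain instance has no __call__, so the fallback ends up here.
            raise NotImplementedError("self.%s() not implemented!" % callback)
        return func(task['url'])

    def parse_page(self, url):
        return "parsed %s" % url
```

In this model a task that carries `{'process': {'callback': 'parse_page'}}` is dispatched normally, while a task whose `process` section is empty or missing raises the same message as in the log above.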

Jun 11 '15 13:06 binux

I do pass a callback, "parse_page": self.crawl(url=this_url, callback=self.parse_page). I don't understand it, but when the task is retried its status is good, and if I run it on the debug page the task succeeds.

Jun 11 '15 19:06 vortex14

Where did you find this log? On the debug page or the task status page? My guess is that the script the processor used the first time is not the same as the one it used the second time. Maybe some processor instance has fallen out of sync...

Jun 12 '15 02:06 binux

[image: error]

Yes, I found this on the '/tasks?project=' page in the web UI. The next run of the task succeeds.

Jun 15 '15 11:06 vortex14

Could you please paste the schedule and process sections of the task?

Jun 15 '15 12:06 binux

[image: eroor2]

Jun 15 '15 12:06 vortex14

There is a callback in the process section... I have no idea what is going on...

Jun 15 '15 13:06 binux

Is this fixed? I also have this problem. It appears randomly.

Sep 15 '15 11:09 robinxb

The bug is not fixed.

Sep 22 '15 15:09 vortex14

Could you reproduce this issue on demo.pyspider.org?

Sep 23 '15 20:09 binux

@binux I have a script that triggers the issue, but I'm not willing to make it public. Is there any way other than reproducing it on demo.pyspider.org?

Sep 24 '15 10:09 robinxb

@robinxb You can try to reproduce it without saving the script.

Sep 24 '15 10:09 binux

@binux The script has to be saved and run for a while before this error occurs.

Sep 24 '15 11:09 robinxb

@robinxb Is the issue tied to a certain website, or could you rewrite the script for some other website?

Sep 24 '15 11:09 binux

@binux It happens with this website, on pyspider version 0.3.5.

Sep 24 '15 11:09 robinxb

@robinxb Could you send it to me?

Sep 24 '15 19:09 binux

[image: error]

Nov 30 '15 16:11 vortex14

Well, good catch... So the question becomes how the process fields get lost...

Nov 30 '15 16:11 binux

Hello, this is sometimes very annoying. I tried everything, flushing taskdb and restarting the scheduler, but kept hitting the same issue again and again. After some time it disappeared by itself. It is as if the task is missing some resource it needs, as if something got deleted earlier or something like that... Can you try to fix this?

Jun 28 '17 05:06 volvofixthis

@volvofixthis It looks like it happens randomly, and I don't know what causes the issue. Steps that reproduce it would be best. Otherwise, adding some capture points may help.

Just add things like

    if not task.get('process'):
        print('no-process-error', task)

in each of these places:

https://github.com/binux/pyspider/blob/master/pyspider/scheduler/scheduler.py#L979
https://github.com/binux/pyspider/blob/master/pyspider/fetcher/tornado_fetcher.py#L725
https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103

Collect the logs and grep for no-process-error to see which component is causing the problem.
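If a single helper is preferred over copies of the check above, it could look roughly like the sketch below. The function name `check_process`, the logger name, and the `component` label are made up for illustration; only the `task.get('process')` test and the `no-process-error` marker come from the snippet above.

```python
import logging

# Hypothetical logger name; any logger the component already uses would do.
logger = logging.getLogger("no-process-check")


def check_process(task, component):
    """Log and return True when the task has lost its 'process' section."""
    if not task.get('process'):
        # Keep the same marker string so grep still finds it, and record
        # which component (scheduler, fetcher, processor) saw the bad task.
        logger.error('no-process-error in %s: %r', component, task)
        return True
    return False
```

Calling it as `check_process(task, 'scheduler')`, `check_process(task, 'fetcher')` and `check_process(task, 'processor')` at the three locations would put the component name next to each marker in the log.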

Jul 01 '17 23:07 binux

Has the bug been fixed now?

Aug 01 '17 03:08 tygzx

I have now hit this bug too. It can occur when many requests are added to the queue.

Aug 01 '17 03:08 tygzx

Same as above. Is this bug fixed now?

Jul 25 '18 02:07 mengnan254

The error occurred at the following location: https://github.com/binux/pyspider/blob/master/pyspider/processor/processor.py#L103

What should I do next? @binux

Jul 25 '18 09:07 mengnan254

I hit the same problem. Is it fixed?

Oct 08 '18 06:10 chiwah-keen

I encountered the same problem.

Reason: self.crawl must be passed a valid callback parameter; otherwise it raises this error:

        raise NotImplementedError("self.%s() not implemented!" % callback)
    NotImplementedError: self.__call__() not implemented!

Solution: change from

    def on_start(self):
        self.crawl("http://www.baidu.com")

to:

    def on_start(self):
        self.crawl("http://www.baidu.com", callback=self.getBaiduCallback)

    def getBaiduCallback(self, response):
        respUrl = response.url
        print("respUrl=%s" % respUrl)
        print("response=%s" % response)
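A quick way to spot this case in a script is to scan it for self.crawl calls that pass no callback. The sketch below uses only the standard library `ast` module; the function name `crawl_calls_without_callback` is made up for illustration. It only catches the missing-callback case described here; earlier comments report the error even with a callback set.

```python
import ast


def crawl_calls_without_callback(source):
    """Return line numbers of self.crawl(...) calls that pass no callback keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "crawl"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "self"
                # A call that only passes **kwargs is also reported,
                # because the keyword cannot be seen statically.
                and not any(kw.arg == "callback" for kw in node.keywords)):
            missing.append(node.lineno)
    return sorted(missing)
```

Passing the text of a handler script to this function returns the lines worth checking by hand.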

Mar 29 '19 03:03 crifan
