python-goose Goose is non-functional in Python 3

Title is largely self-explanatory.

The primary limitation seems to be the reliance on BeautifulSoup 3, which has been EOL for quite a while now and really should be migrated away from.

Sep 16 '14 05:09 fake-name

~~Actually, where is beautifulsoup used at all? I can't find any reference in the codebase to it at all~~ It's being used in lxml somewhere, somehow, despite no explicit mention of it anywhere.

Also, unittest sucks: it doesn't report anything informative when you hit an ImportError. You can apparently use nosetests to run the same tests with sane output.
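For what it's worth, a quick way to see which imports are actually failing, independent of the test runner, is to try them one by one. This is only a minimal sketch: `check_imports` is a hypothetical helper, not something in goose, and the module names passed to it are placeholders.

```python
import importlib


def check_imports(names):
    """Try importing each module name; return {name: error message} for failures."""
    missing = {}
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # Keep the original message so the real cause stays visible.
            missing[name] = str(exc)
    return missing


if __name__ == "__main__":
    # Placeholder names; substitute the project's real dependencies.
    for name, error in check_imports(["json", "not_a_real_module_xyz"]).items():
        print("%s: %s" % (name, error))
```

Run against the project's dependency list, this should show which modules fail to import under Python 3, along with the underlying error message.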

jieba can be replaced with jieba3k.

Sep 16 '14 06:09 fake-name

Going through everything, it appears that the heavy dependency on lxml's soupparser is the problem. Patching in bs4 in place of bs3 at runtime is not workable, since lxml calls `__init__` with arguments that bs4 does not accept.
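To illustrate why this kind of runtime patching breaks, here is a self-contained sketch. It does not use lxml or BeautifulSoup at all: `legacy_soup`, `NewSoup` and `legacy_only_option` are made-up stand-ins, assumed only for illustration. The idea is that aliasing a module name in `sys.modules` makes the import succeed, but the caller still passes keyword arguments that the replacement constructor rejects.

```python
import sys
import types


class NewSoup:
    """Hypothetical replacement parser with a different constructor signature."""

    def __init__(self, markup, features=None):
        self.markup = markup
        self.features = features


def install_alias(legacy_name, replacement_cls):
    """Register a fake module under legacy_name that exposes replacement_cls."""
    module = types.ModuleType(legacy_name)
    module.BeautifulSoup = replacement_cls
    sys.modules[legacy_name] = module
    return module


def legacy_caller(markup):
    """Mimics code written against the old constructor signature."""
    import legacy_soup  # resolved through the sys.modules alias

    # 'legacy_only_option' is a made-up keyword the replacement does not accept.
    return legacy_soup.BeautifulSoup(markup, legacy_only_option=True)


install_alias("legacy_soup", NewSoup)
try:
    legacy_caller("<p>hi</p>")
    patched_ok = True
    error_message = ""
except TypeError as exc:
    # The import worked, but the call signature did not match.
    patched_ok = False
    error_message = str(exc)
```

The import itself goes through, so the failure only surfaces at call time as a `TypeError`, which is roughly the situation described above.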

Sep 16 '14 06:09 fake-name

I have unit tests working.

Ran 126 tests in 10.607s

FAILED (errors=54, failures=49)

Welp! Time to look at other text extractors.

Is there any timeline on python 3 compatibility?

Sep 16 '14 06:09 fake-name

+1 for python 3 support... Is there any schedule? Or do you not care at all?

Nov 25 '14 18:11 hnykda

@kotrfa - It's not a direct equivalent, but I wound up using python-readability for text extraction. It works well enough.

Nov 25 '14 19:11 fake-name

Prepared a PR to add py3 support: https://github.com/grangier/python-goose/pull/220

Apr 09 '15 09:04 vetal4444

+1 for this. Why use Python 2!?

Jul 13 '15 06:07 xanderdunn

Still waiting for Python 3 support :)

Jun 07 '16 12:06 hipoglucido

I believe this project is dead. Use https://github.com/codelucas/newspaper instead, which is inspired by goose and supports Python 3 flawlessly.

Jun 07 '16 13:06 hnykda

Yep, I already knew it but I just wanted to do some comparison of the available tools. Indeed, I will use it. Thanks!

Jun 07 '16 13:06 hipoglucido

Any plans to introduce Python 3 support to this project?

Sep 05 '16 11:09 LukeB42

Hi everyone, this may come off as self promotion, but I went ahead and forked goose to work with python3. http://github.com/goose3/goose3 Enjoy

Apr 20 '17 19:04 lababidi