
Wishlist CLI not returning data anymore because of TypeError

Open tym-project opened this issue 4 years ago • 9 comments

Used to work properly in a script, but has not been returning any data for a few months.

Running 0.2.2, the command line fails with:

$ wishlist dump 3L5SXXXXXBZ6K
Traceback (most recent call last):
  File "/usr/local/bin/wishlist", line 8, in <module>
    sys.exit(console())
TypeError: exit() missing 1 required positional argument: 'mod_name'

Upgraded to 0.3.1, same issue.

tym-project avatar Mar 02 '20 21:03 tym-project

I think this is actually a problem with the older wishlist CLI using a newer captain (specifically captain 3.0.0+). You should be able to fix it by downgrading captain:

$ pip install captain==2.0.4

Alternatively, it would probably work with the latest version of captain if you modified sys.exit(console()) to look like this:

sys.exit(console(__name__))

The actual parser is still working, because I use it every day. This is a legit problem with the CLI, but I don't have time to fix it right now, so I would suggest one of the above remedies. If you modify the script and it works, I'd love a pull request :)
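For anyone puzzled by the TypeError itself, here is a minimal sketch of how that kind of message arises when an entry point gains a required argument. The names entry_point and console below are hypothetical stand-ins for illustration only, not captain's real code:

```python
# Hypothetical stand-ins for illustration; this is not captain's real code.
def entry_point(mod_name):
    """Mimics an entry point that now requires the calling module's name."""
    return 0


# An older script still refers to the entry point under this name.
console = entry_point

message = ""
try:
    console()  # old call style: no argument is passed
except TypeError as err:
    message = str(err)
    print(message)

# New call style: pass the module name explicitly.
status = console(__name__)
print(status)
```

The first call fails before the function body runs, which is why the traceback stops at the sys.exit(console()) line in the installed script.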

Jaymon avatar Mar 06 '20 01:03 Jaymon

Thanks for the info, I'll try to fix the CLI... but my issue is actually with a script (no specific error, but no data is returned by "Wishlist()").

I should have specified earlier that I'm running it against amazon.fr. I'll try with an amazon.com wishlist to see if it's an issue with the fr version of the site.

tym-project avatar Mar 06 '20 06:03 tym-project

No luck adding "__name__"...

Traceback (most recent call last):
  File "/usr/local/bin/wishlist", line 8, in <module>
    sys.exit(console(__name__))
  File "/usr/local/lib/python3.5/dist-packages/captain/__init__.py", line 38, in exit
    s = Script(inspect.getfile(mod), module=mod)
  File "/usr/local/lib/python3.5/dist-packages/captain/__init__.py", line 145, in __init__
    self.parse()
  File "/usr/local/lib/python3.5/dist-packages/captain/__init__.py", line 260, in parse
    raise ParseError("no main function found")
captain.exception.ParseError: no main function found

No luck downgrading captain either.

tym-project avatar Mar 06 '20 20:03 tym-project

So I just pushed Wishlist 0.4.0 that works with Captain 3.0.0. I tested on my personal wishlist and it worked:

$ python -m wishlist dump <HASH>
1. Clean Code: A Handbook of Agile Software Craftsmanship (Robert C. Martin Series) is $29.44
2. How to Read a Book (A Touchstone Book) is $13.99
3. Hynes Eagle 38L Flight Approved Weekender Carry on Backpack is $49.99
...

But my wishlist is from amazon.com, and since it's not easy for me to test an amazon.fr wishlist, you're probably on your own for figuring out that issue. I'll help however I can, though.

I've tried to make wishlist.core.WishlistElement very forgiving of parse errors, but I'm always chasing Amazon's changes.

Jaymon avatar Mar 06 '20 22:03 Jaymon

Thanks, I've updated to wishlist 0.4.0 (brow 0.0.3 and captain 3.0.0). I have the same issue when I run $ wishlist dump 3U1HZP3ZCY3XA (this is an amazon.com public ID from some US non-profit):

Traceback (most recent call last):
  File "/usr/local/bin/wishlist", line 8, in <module>
    sys.exit(console())
TypeError: exit() missing 1 required positional argument: 'mod_name'

Running it with your syntax returns nothing (might be an env var issue, but same thing on an amazon.fr ID) :

$ python3 -m wishlist dump 3U1HZP3ZCY3XA
Done with wishlist, 1 total items

If I try to run it from a script, no errors but no data either :

#!/usr/bin/python3.5

import os
from wishlist.core import Wishlist

os.environ["WISHLIST_HOST"]="https://amazon.com/"
lists=[]
lists.append({'name':'Test','id':'3U1HZP3ZCY3XA'})

for list in lists:
    data = Wishlist(list['id'])
    for item in data:
        print(item)

tym-project avatar Mar 07 '20 09:03 tym-project

Got it! The problem is with the BeautifulSoup parser: if I force lxml via the HTML_PARSER env var, it works. Apparently bs4 can't handle too many nested tags with some parsers (https://stackoverflow.com/a/14587348).

Side issue, in wishlist/core.py, I have to force the old (?) way of getting the env var, or else it does not work (if needed I can open a separate issue for this) :

[...]
class BaseAmazon(object):
    @property
    def host(self):
        #return environ.HOST
        return os.environ.get("WISHLIST_HOST", "https://www.amazon.com")
[...]

I think this is due to my script setting the env var after the import, as HOST is a variable in environ.py it might be set too early for that usecase?

tym-project avatar Mar 07 '20 15:03 tym-project

I think you've nailed the environment problem. I'm not sure I would change it, though: if you are changing it in running code, I would just import wishlist.environ directly and modify environ.HOST there instead of setting the environment variable.
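To make the timing issue concrete, here is a minimal sketch of a module-level setting that is read from the environment once, at import time. It uses a hypothetical stand-in module (demo_environ) and a hypothetical variable name (DEMO_HOST), and it assumes wishlist/environ.py reads its value in a similar way, which this thread suggests but does not show:

```python
import os
import types

# Hypothetical stand-in for a settings module such as wishlist/environ.py.
# The value is read from the environment once, when the module body runs.
SETTINGS_SOURCE = (
    "import os\n"
    'HOST = os.environ.get("DEMO_HOST", "https://www.amazon.com")\n'
)


def load_settings():
    """Run the module body once, the way an import statement would."""
    module = types.ModuleType("demo_environ")
    exec(SETTINGS_SOURCE, module.__dict__)
    return module


os.environ.pop("DEMO_HOST", None)
settings = load_settings()  # the "import" happens here, so HOST is fixed now

# Setting the environment variable afterwards does not change the attribute.
os.environ["DEMO_HOST"] = "https://www.amazon.fr"
host_after_env_change = settings.HOST

# Assigning to the module attribute directly does take effect.
settings.HOST = "https://www.amazon.fr"
host_after_override = settings.HOST

print(host_after_env_change)
print(host_after_override)
```

Setting the environment variable before the first import would also work, but overriding the attribute avoids depending on import order.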

Has everything been working ok with python 3? I originally wrote it and still run it in python 2 primarily, I try and write cross version code but I didn't explicitly add python3 support in setup.py and I'm not sure if that was an oversight, or intentional, on my part because it was so long ago.

It also looks like there might be an issue with Brow, because if you had installed lxml it should've auto-discovered that. In fact, that's how my current wishlist setup does it, because it also seems to be using lxml to parse my wishlist.

Ugh, definitely room to make all this better, and make it easier to surface these issues to the user.

Jaymon avatar Mar 07 '20 22:03 Jaymon

Nice idea for environ.HOST, it's working... I need to up my python game :) This could be a nice trick to put in the doc maybe?

#!/usr/bin/python3.5

from wishlist.core import Wishlist
from wishlist import environ

environ.HOST = "https://www.amazon.fr"
lists = [{'name': 'Music', 'id': '<HASH>'}]

# "entry" avoids shadowing the built-in list type
for entry in lists:
    data = Wishlist(entry['id'])
    for item in data:
        print(item.title)

Regarding lxml, you are also correct, if installed brow detects and uses it... my testing methodology was flawed in this regard. Could you maybe add lxml as a dependency to wishlist ? html.parser does not seem compatible with Amazon.X anymore (or maybe it's a side effect of python3 ?).
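For reference, the kind of detection discussed here could be sketched roughly as follows. This is only an assumption about how such a fallback might be written, not brow's actual code; pick_parser is a hypothetical helper, and the only name taken from this thread is the HTML_PARSER env var:

```python
import importlib.util
import os


def pick_parser(env=None):
    """Return the parser name to hand to BeautifulSoup.

    Hypothetical helper: an explicit HTML_PARSER setting wins, then lxml
    if it is importable, then the standard library's html.parser.
    """
    env = os.environ if env is None else env
    forced = env.get("HTML_PARSER", "").strip()
    if forced:
        return forced
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("lxml") is not None:
        return "lxml"
    return "html.parser"


print(pick_parser({"HTML_PARSER": "lxml"}))  # explicit override
print(pick_parser({}))  # depends on what is installed on this machine
```

Printing the chosen parser at startup would also make it easier to spot when a script has silently fallen back to html.parser.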

I have no issues with python3, but I'm thinking the CLI issue could be coming from that ? I'm not planning on doing further testing as I'm not using the CLI, but if you would like me to, don't hesitate.

To be honest your work is really awesome, your code very clean and nicely documented... sure you could improve some things, but after all you're offering it to the community for free...so...thanks and don't sweat it !

tym-project avatar Mar 08 '20 13:03 tym-project

I just added lxml as a dependency. I hadn't in the past because lxml was always a real pain to install, and there was a default parser that I was able to use for the first little bit, so why add a hard-to-install dependency?

But those days might be over and I won't ever bother to fix the default parser myself since lxml is usually installed on my system.

I also updated the readme a bit with notes on how to manipulate the environment at runtime.

Thanks for all your help and I appreciate the kind words :)

Jaymon avatar Mar 10 '20 00:03 Jaymon