lemon24

Results 63 comments of lemon24

Progress report: In [revision 5](https://gist.github.com/lemon24/10ae478fafb8fc1cb091f04e0ceec03f/477f0f8150c23877cf4ab60124255f0658b2483a) of the prototype, I extracted the "join a prefix to a file object" logic into a separate class (with a few tests); this is useful...
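
For readers who have not opened the gist, here is a minimal sketch of what "joining a prefix to a file object" could look like. It is not the code from revision 5; the class name `PrefixedFile` and its read-only, bytes-only behavior are assumptions made for illustration.

```python
import io


class PrefixedFile:
    """Hypothetical read-only wrapper: yields `prefix` first, then the wrapped file."""

    def __init__(self, prefix, file):
        self._prefix = prefix  # bytes already consumed from `file` elsewhere
        self._file = file      # binary file object positioned after the prefix

    def read(self, size=-1):
        # Unbounded read: return whatever is left of the prefix plus the rest of the file.
        if size is None or size < 0:
            data = self._prefix + self._file.read()
            self._prefix = b""
            return data
        # Bounded read: serve from the prefix first, then top up from the file.
        head = self._prefix[:size]
        self._prefix = self._prefix[size:]
        if len(head) < size:
            head += self._file.read(size - len(head))
        return head


# Usage: pretend b"<?xml" was already read (e.g. to sniff the encoding).
stream = PrefixedFile(b"<?xml", io.BytesIO(b" version='1.0'?><rss/>"))
first_chunk = stream.read(8)
rest = stream.read()
```

A wrapper like this lets a consumer re-read bytes that were already pulled off a stream without requiring the underlying file to support `seek()`.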

@tadeoos, parse() in the gist from my previous comment works correctly for any seekable() files (that is, to parse a URL you have to download the feed yourself and pass...

@kurtmckee, I want to continue working on the other two issues when I have some time. Do you agree with the solutions proposed in [my first comment](https://github.com/kurtmckee/feedparser/issues/296#issue-1083936276)? The first one,...

Overall, I am +1 on not using an XML parser (if it's not an overwhelming amount of work to do). From what I understand, most issues with XML come from...

While writing the comment above, I realized there may be a number of opportunities presented by a reworking of the parsing backend. I understand you're operating under a number of...

Hi, coming here from your comment on #302. I ran a few tests where I called feedparser.parse() in a loop and measured memory usage (details below). I tried two feeds,...
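
The measurement details are cut off above, so the following is only one possible way to run such a loop, using the standard library's `tracemalloc`. The helper name `sample_memory` and the stand-in workload are assumptions; the original tests may have measured something else (for example process RSS).

```python
import tracemalloc


def sample_memory(func, iterations):
    """Call func() repeatedly; return traced memory in bytes after each call."""
    tracemalloc.start()
    try:
        samples = []
        for _ in range(iterations):
            func()
            # get_traced_memory() returns (current, peak) in bytes.
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
        return samples
    finally:
        tracemalloc.stop()


# Stand-in workload; in the scenario described, this would be something like
# lambda: feedparser.parse(feed_bytes), with feed_bytes loaded beforehand.
samples = sample_memory(lambda: [0] * 1000, 5)
```

If the samples keep growing across iterations, something is being retained between calls; a flat series suggests the memory is released. Note that `tracemalloc` only sees allocations made through Python's allocator.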

I have two, adding them in a single comment since they are related. They don't look very pretty, but I think I managed to make the code quite reusable /...

This is one way of doing it:

```python
feedparser.api.PREFERRED_XML_PARSERS.insert(0, 'defusedxml.expatreader')
```

Note that it's global; maybe we can make a full copy of the feedparser.api module at runtime, to avoid...
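
Short of copying the module, one way to limit how long the global change lasts is to undo it after parsing. This is a sketch, not something feedparser provides: the `prepended` helper is hypothetical, and it works on any list.

```python
from contextlib import contextmanager


@contextmanager
def prepended(items, value):
    """Hypothetical helper: insert `value` at the front of `items`, undo on exit."""
    items.insert(0, value)
    try:
        yield items
    finally:
        # list.remove() drops the first occurrence, which is the one inserted above
        # (assuming the list was not reordered inside the block).
        items.remove(value)


# Demonstrated on a plain list; with feedparser installed, the intended use would be:
#   with prepended(feedparser.api.PREFERRED_XML_PARSERS, 'defusedxml.expatreader'):
#       feedparser.parse(...)
parsers = ['drv_libxml2']
with prepended(parsers, 'defusedxml.expatreader'):
    during = list(parsers)
after = list(parsers)
```

The list is still shared while the block runs, so other threads parsing at the same time would see the change; this only narrows the window.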

There's one of my feeds that fails when I try the above.

```
unexpected error while reading feed: 'http://www.xn--8ws00zhy3a.com/feed': defusedxml.common.EntitiesForbidden: EntitiesForbidden(name='xhtml', system_id=None, public_id=None)
```

The feed looks like this (note...

FWIW, using lxml may be good enough: https://pypi.org/project/defusedxml/#python-xml-libraries (most of the vulnerabilities are marked with False, and we may not care about those marked with True; needs looking into). Here's...