scurl
Performance-focused replacement for Python urllib
```
In [2]: from scurl import urlparse
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
 in
----> 1 from scurl import urlparse

~/.virtualenvs/myenv/lib/python3.6/site-packages/scurl/__init__.py in
     13 _original_urlparse = urlparse
     14
---> 15...
```
See https://github.com/scrapy/scurl/issues/58#issuecomment-513520254 and https://github.com/scrapy/scurl/issues/58#issuecomment-513583355. Also repeating here:
```
Traceback (most recent call last):
  File "./bin/triage_links", line 34, in get_url_parts
    link = urljoin(record.url, record.href)
  File "scurl/cgurl.pyx", line 308, in scurl.cgurl.urljoin
  File...
```
I was following the install instructions from the README (macOS 10.14.5). There was one warning about ``` s3fs 0.2.1 has requirement six>=1.12.0, but you'll have six 1.11.0 which is incompatible....
This PR restores all the tests from urllib, since the current tests in Scurl were modified so that they would pass. However, we need Scurl to behave like urllib 😄
Here are some components that can be implemented to further improve the performance of the `canonicalize_url` function:
- [ ] parse_qsl_to_bytes, which includes unquote_to_bytes from stdlib
- [ ] urlencode
-...
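To give a sense of the kind of work these helpers do, here is a minimal sketch of query-string normalization using only the standard library. It is not the Scurl or w3lib implementation; the function name `sort_query` is hypothetical and chosen purely for illustration.

```python
from urllib.parse import parse_qsl, urlencode


def sort_query(query: str) -> str:
    """Parse a query string, sort its key/value pairs, and re-encode it.

    Hypothetical helper for illustration only; a faster replacement would
    need to reproduce this parse -> sort -> encode round trip.
    """
    # keep_blank_values=True preserves parameters such as "c=".
    pairs = parse_qsl(query, keep_blank_values=True)
    pairs.sort()
    return urlencode(pairs)


normalized = sort_query("b=2&a=1&c=")
```

Each call goes through the pure-Python parsing and encoding paths, which is presumably why these functions are candidates for a faster implementation.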
Right now Scurl fails to run on Windows. We will need to come up with a way to support Windows :) Details on this are coming soon
tox fails on Mac OS in the py34 and py35 environments. For some reason tox creates an environment that targets Mac OS 10.6, and this is the traceback...
Right now, [this](https://github.com/nctl144/scurl/blob/master/vendor/gurl/url/url_canon_host.cc#L175) line is commented out. We should figure out a way to enable ICU for this project :) The old Chromium source of ICU has this commented out...
Right now GURL cannot handle IDNA URLs. All of them are marked as invalid, although Google Chrome parses these URLs correctly!
```
>>> URL('банки.рф'.encode('idna')).is_valid()
False...
```
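For context, the input in the report above is the Punycode (ASCII) form of the host. The following sketch uses only Python's built-in `idna` codec, with no Scurl or GURL calls, to show what that encoding step produces:

```python
# Encode an internationalized domain name to its ASCII-compatible form
# with Python's built-in "idna" codec, then decode it back.
domain = "банки.рф"
ascii_host = domain.encode("idna")      # bytes; every label is Punycode
labels = ascii_host.split(b".")
round_trip = ascii_host.decode("idna")  # back to the Unicode form

print(ascii_host)
print(round_trip)
```

Each label of the encoded host carries the `xn--` prefix, so the bytes handed to `URL(...)` are plain ASCII.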
We will need to work on integrating this library into Scrapy and w3lib, and make it optional for users to install. Right now, we can prompt a message if...
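One common way to make such a dependency optional is an import fallback. The sketch below assumes that the top-level `scurl` package exposes `urlparse` and `urljoin` under the same names as `urllib.parse`; it is not how Scrapy or w3lib actually integrate the library.

```python
# Prefer the accelerated implementation when it is installed, otherwise
# fall back to the standard library. The names imported from scurl are an
# assumption made for this sketch.
try:
    from scurl import urlparse, urljoin
    USING_SCURL = True
except ImportError:
    from urllib.parse import urlparse, urljoin
    USING_SCURL = False

joined = urljoin("http://example.com/a/b", "c")
host = urlparse("http://example.com/x").netloc
```

Callers import `urlparse` and `urljoin` from the wrapping module and never need to know which backend is active; `USING_SCURL` is a hypothetical flag that could drive the message mentioned above.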