Bump scrapy from 1.0.3 to 1.8.1
Bumps scrapy from 1.0.3 to 1.8.1.
Release notes
Sourced from scrapy's releases.
1.8.1
Security bug fix:
If you use `HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, any request exposes your credentials to the request target.

To prevent exposing authentication credentials to unintended domains, you must now also set a new spider attribute, `http_auth_domain`, and point it to the specific domain to which the authentication credentials must be sent.

If the `http_auth_domain` spider attribute is not set, the domain of the first request will be considered the HTTP authentication target, and authentication credentials will only be sent in requests targeting that domain.

If you need to send the same HTTP authentication credentials to multiple domains, you can use `w3lib.http.basic_auth_header` instead to set the value of the `Authorization` header of your requests.

If you really want your spider to send the same HTTP authentication credentials to any domain, set the `http_auth_domain` spider attribute to `None`.

Finally, if you are a user of scrapy-splash, know that this version of Scrapy breaks compatibility with scrapy-splash 0.7.2 and earlier. You will need to upgrade scrapy-splash to a later version for it to continue to work.
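A minimal sketch of the new attribute in use, assuming Scrapy 1.8.1 with the default `HttpAuthMiddleware`; the spider name, credentials, and domain are placeholders:

```python
import scrapy


class AuthExampleSpider(scrapy.Spider):
    # Hypothetical spider illustrating the 1.8.1 credential scoping.
    name = "auth_example"

    # Picked up by HttpAuthMiddleware, which is enabled by default.
    http_user = "user"
    http_pass = "secret"
    # New in 1.8.1: the Authorization header is only sent to this domain.
    http_auth_domain = "api.example.com"

    start_urls = ["https://api.example.com/items"]

    def parse(self, response):
        self.logger.info("Fetched %s", response.url)
```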
1.7.4
Revert the fix for #3804 (#3819), which has a few undesired side effects (#3897, #3976).
1.7.3
Enforce lxml 4.3.5 or lower for Python 3.4 (#3912, #3918)
1.7.2
Fix Python 2 support (#3889, #3893, #3896)
1.7.0
Highlights:
- Improvements for crawls targeting multiple domains
- A cleaner way to pass arguments to callbacks (see the sketch after this list)
- A new class for JSON requests
- Improvements for rule-based spiders
- New features for feed exports
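The callback-arguments and JSON-request highlights refer to the `cb_kwargs` request attribute and the `scrapy.http.JsonRequest` class. A minimal sketch of both, with placeholder URLs and field names:

```python
import scrapy
from scrapy.http import JsonRequest


class V17FeaturesSpider(scrapy.Spider):
    # Hypothetical spider exercising two Scrapy 1.7.0 additions.
    name = "v17_features_example"

    def start_requests(self):
        # cb_kwargs: the cleaner way to pass arguments to callbacks.
        yield scrapy.Request(
            "https://example.com/page",
            callback=self.parse_page,
            cb_kwargs={"section": "news"},
        )
        # JsonRequest: serializes `data` into the request body as JSON,
        # sets the Content-Type header, and defaults the method to POST.
        yield JsonRequest("https://example.com/api", data={"query": "scrapy"})

    def parse_page(self, response, section):
        # cb_kwargs entries arrive as plain keyword arguments.
        self.logger.info("Parsed %s section at %s", section, response.url)
```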
1.6.0
Highlights:
- Better Windows support
- Python 3.7 compatibility
- Big documentation improvements, including a switch from the `.extract_first()` + `.extract()` API to the `.get()` + `.getall()` API (see the sketch after this list)
- Feed exports, FilePipeline and MediaPipeline improvements
- Better extensibility: item_error and request_reached_downloader signals; from_crawler support for feed exporters, feed storages and dupefilters.
- scrapy.contracts fixes and new features
- Telnet console security improvements, first released as a backport in Scrapy 1.5.2 (2019-01-22)
- Clean-up of the deprecated code
- Various bug fixes, small new features and usability improvements across the codebase.
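The selector API switch mentioned above is easy to see side by side; a minimal sketch using an inline HTML snippet (both APIs keep working, the documentation simply moved to the new one):

```python
from scrapy.selector import Selector

sel = Selector(text="<ul><li>a</li><li>b</li></ul>")

# Old API, used throughout the pre-1.6.0 documentation:
assert sel.css("li::text").extract_first() == "a"
assert sel.css("li::text").extract() == ["a", "b"]

# New API the 1.6.0 documentation switched to:
assert sel.css("li::text").get() == "a"
assert sel.css("li::text").getall() == ["a", "b"]
```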
... (truncated)
Changelog
Sourced from scrapy's changelog.
Scrapy 1.8.1 (2021-10-05)
Security bug fix:
If you use :class:`~scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, any request exposes your credentials to the request target.

To prevent exposing authentication credentials to unintended domains, you must now also set a new spider attribute, `http_auth_domain`, and point it to the specific domain to which the authentication credentials must be sent.

If the `http_auth_domain` spider attribute is not set, the domain of the first request will be considered the HTTP authentication target, and authentication credentials will only be sent in requests targeting that domain.

If you need to send the same HTTP authentication credentials to multiple domains, you can use :func:`w3lib.http.basic_auth_header` instead to set the value of the `Authorization` header of your requests.

If you really want your spider to send the same HTTP authentication credentials to any domain, set the `http_auth_domain` spider attribute to `None`.

Finally, if you are a user of `scrapy-splash`_, know that this version of Scrapy breaks compatibility with scrapy-splash 0.7.2 and earlier. You will need to upgrade scrapy-splash to a later version for it to continue to work.

.. _scrapy-splash: https://github.com/scrapy-plugins/scrapy-splash
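For the multi-domain case the changelog points at `w3lib.http.basic_auth_header`; a minimal sketch, with placeholder domains and credentials, of building the `Authorization` header explicitly instead of relying on `http_user`/`http_pass`:

```python
import scrapy
from w3lib.http import basic_auth_header


class MultiDomainSpider(scrapy.Spider):
    # Hypothetical spider that deliberately sends the same credentials
    # to more than one domain.
    name = "multi_domain_example"

    def start_requests(self):
        auth = basic_auth_header("user", "secret")
        for url in ["https://a.example.com/", "https://b.example.org/"]:
            # Setting the header per request bypasses HttpAuthMiddleware's
            # single-domain restriction.
            yield scrapy.Request(url, headers={"Authorization": auth})

    def parse(self, response):
        self.logger.info("Fetched %s", response.url)
```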
.. _release-1.8.0:
Scrapy 1.8.0 (2019-10-28)
Highlights:
- Dropped Python 3.4 support and updated minimum requirements; made Python 3.8 support official
- New :meth:`Request.from_curl <scrapy.http.Request.from_curl>` class method (sketched below)
- New :setting:`ROBOTSTXT_PARSER` and :setting:`ROBOTSTXT_USER_AGENT` settings
- New :setting:`DOWNLOADER_CLIENT_TLS_CIPHERS` and :setting:`DOWNLOADER_CLIENT_TLS_VERBOSE_LOGGING` settings
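`Request.from_curl` turns a copied curl command line into a `Request` object, translating the URL, method, headers, cookies, and body. A minimal sketch with a placeholder URL:

```python
import scrapy

# Typically pasted from a browser's "Copy as cURL" action.
request = scrapy.Request.from_curl(
    "curl 'https://example.com/api' -H 'Accept: application/json'"
)
print(request.url, request.method, request.headers.get("Accept"))
```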
... (truncated)
Commits
- `283e90e` Bump version: 1.8.0 → 1.8.1
- `99ac4db` Cover 1.8.1 in the release notes
- `1635134` Small documentation fixes.
- `b01d69a` Add http_auth_domain to HttpAuthMiddleware.
- `4183925` Travis CI → GitHub Actions
- `be2e910` Bump version: 1.7.0 → 1.8.0
- `94f060f` Cover Scrapy 1.8.0 in the release notes (#3952)
- `18b808b` Merge pull request #4092 from further-reading/master
- `93e3dc1` [test_downloadermiddleware_httpcache.py] Cleaning text
- `b73d217` [test_downloadermiddleware_httpcache.py] Fixing pytest mark behaviour
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
- `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
- `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
- `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
- `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language
You can disable automated security fix PRs for this repo from the Security Alerts page.