changedetection.io
HTTP "POST" request with UTF-8 non latin [feature]
I'm trying to send a POST request whose body contains UTF-8 characters, but it fails because latin-1 is used for encoding. I couldn't find where to change this.
'latin-1' codec can't encode characters in position 57-63: Body ('בדיקה') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
can you paste the full HTTP request settings here?
Sure
Url : https://www.bezeq.co.il/umbraco/api/FormWebApi/CheckAddress
Method: POST
Data: {"CityId":"1111","StreetId":"1111","House":"11111","Street":"בדיקה","City":"בדיקה","Entrance":""}
thanks, I can confirm this one.
I'm having the same issue here.
$ docker exec -it changedetection_io_app_1 bash
$ python3 -c "import requests; r = requests.post('http://httpbin.org/post', data='你好')"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/requests/api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
File "/usr/local/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/usr/local/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/usr/local/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/usr/local/urllib3/connectionpool.py", line 416, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/local/urllib3/connection.py", line 244, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/local/lib/python3.10/http/client.py", line 1283, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/lib/python3.10/http/client.py", line 1328, in _send_request
body = _encode(body, 'body')
File "/usr/local/lib/python3.10/http/client.py", line 166, in _encode
raise UnicodeEncodeError(
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-1: Body ('你好') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
https://stackoverflow.com/questions/55887958/what-is-the-default-encoding-when-python-requests-post-data-is-string-type/56120372#56120372
--
If body is a string, it is encoded as ISO-8859-1, the default for HTTP. If it is a bytes-like object, the bytes are sent as is. If it is a file object, the contents of the file is sent; this file object should support at least the read() method.
ISO-8859-1 is also known as latin-1.
https://docs.python.org/3/library/http.client.html#http.client.HTTPConnection.request
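The behaviour quoted above can be reproduced without any network access. This is a minimal sketch using only built-in string methods; the sample body is taken from the request settings earlier in this thread, and the variable names are illustrative only.

```python
# Sample body with Hebrew characters, as in the report above.
body = '{"Street": "בדיקה"}'

# A str body gets encoded as ISO-8859-1 (latin-1) by http.client.
# Latin-1 only covers code points 0-255, so Hebrew or Chinese text fails.
try:
    body.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Encoding explicitly produces a bytes object, which is sent as-is.
utf8_body = body.encode("utf-8")

print(latin1_ok)
print(type(utf8_body).__name__)
```

So passing bytes instead of str sidesteps the implicit latin-1 step entirely.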
Possible solution
https://github.com/dgtlmoon/changedetection.io/blob/0.45.24/changedetectionio/content_fetchers/requests.py#L49
r = requests.request(method=request_method,
- data=request_body,
+ data=request_body.encode('utf-8') if type(request_body) is str else request_body,
url=url,
headers=request_headers,
timeout=timeout,
proxies=proxies,
verify=False)
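The same idea as the patch above, pulled out into a standalone function so it can be checked in isolation. This is only a sketch: `to_wire_body` is a hypothetical name, not something that exists in changedetection.io, and it uses `isinstance` rather than `type(...) is str` so that str subclasses are handled too.

```python
from typing import Optional, Union


def to_wire_body(request_body: Union[str, bytes, None]) -> Optional[bytes]:
    """Hypothetical helper: make a request body safe to hand to an HTTP client.

    str is encoded as UTF-8; bytes and None pass through unchanged.
    """
    if isinstance(request_body, str):
        return request_body.encode("utf-8")
    return request_body


# Chinese characters no longer depend on the client's default latin-1 step.
wire = to_wire_body('{"City": "你好"}')
print(wire)
```

The result could then be passed as `data=to_wire_body(request_body)`. Depending on the target server, it may also be worth sending a `Content-Type` header that declares `charset=utf-8` so the receiving side decodes the bytes the same way.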
@leiless isn't this a duplicate of https://github.com/dgtlmoon/changedetection.io/issues/2309 ?
If you are using JSON for your posts:// - make sure you are using `| tojson` when building your JSON message. This should encode anything non-ASCII and bypass this error. For example, it will turn the smiley ツ into \u30c4
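To illustrate why ASCII-escaped JSON avoids the error, here is a rough sketch using the standard library's `json.dumps`, whose default `ensure_ascii=True` escapes non-ASCII characters in a similar way. It is an approximation of the effect described above, not the template filter itself, and the payload is made up for the example.

```python
import json

# Illustrative payload containing non-ASCII characters.
payload = {"Street": "בדיקה", "smiley": "ツ"}

# ensure_ascii=True is the default: every non-ASCII character becomes \uXXXX.
escaped = json.dumps(payload)

# The escaped text is plain ASCII, so the latin-1 encoding step succeeds.
wire = escaped.encode("latin-1")

print(escaped)
```

The receiving side gets the original characters back when it parses the JSON, since `\u30c4` decodes to ツ again.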
@dgtlmoon No, it's not. I'm using the "Basic fast Plaintext/HTTP Client" to POST a body containing Chinese characters encoded in UTF-8.
https://github.com/dgtlmoon/changedetection.io/issues/2309 is all about delivering notifications with UTF-8 chars.