dokuwikixmlrpc
Bug: 401 Unauthorized when using python 3.9
Consider this simple code:
from dokuwikixmlrpc import DokuWikiClient
client = DokuWikiClient('https://wiki.example.net', 'remote-enabled-user', 'password')
print(client.page('home'))
With Python 3.7, I get the content of the home page, as expected. I recently updated my OS and got Python 3.9; on two different machines, I now get this stack trace:
Traceback (most recent call last):
  File "/home/user/project/venv/lib/python3.9/site-packages/dokuwikixmlrpc.py", line 115, in catch_xmlerror
    return f(*args, **kwargs)
  File "/home/user/project/venv/lib/python3.9/site-packages/dokuwikixmlrpc.py", line 199, in page
    return self._xmlrpc.wiki.getPage(page_id)
  File "/usr/lib/python3.9/xmlrpc/client.py", line 1116, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python3.9/xmlrpc/client.py", line 1456, in __request
    response = self.__transport.request(
  File "/usr/lib/python3.9/xmlrpc/client.py", line 1160, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python3.9/xmlrpc/client.py", line 1190, in single_request
    raise ProtocolError(
xmlrpc.client.ProtocolError: <ProtocolError for wiki.example.net/lib/exe/xmlrpc.php: 401 Unauthorized>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/project/showbug.py", line 12, in <module>
    print(client.page('home'))
  File "/home/user/project/venv/lib/python3.9/site-packages/dokuwikixmlrpc.py", line 119, in catch_xmlerror
    raise DokuWikiXMLRPCProtocolError(fault)
dokuwikixmlrpc.DokuWikiXMLRPCProtocolError: <DokuWikiXMLRPCProtocolError 401: 'Unauthorized' at wiki.example.net/lib/exe/xmlrpc.php>
Is there any chance that Python 3.9 changed something that makes the XML-RPC protocol fail? Apart from new bugs in CPython itself, the only candidates I can think of right now are the new dict operators, the new string methods, or some other small detail.
I tried running the code with -X oldparser to use the LL(1) parser (as 3.9 introduces the PEG one), but that didn't change anything.
The problem may have appeared in 3.8, which has a much more substantial changelog, notably some changes in the xml and xmlrpc modules that involve auth-related details.
I can reproduce the same problem with the following code:
from urllib.parse import urlencode
import xmlrpc.client as xmlrpclib
URL = 'https://wiki.example.net'  # ServerProxy requires an http:// or https:// scheme
USER = 'user'
PASSWD = 'password'
USER_AGENT = 'DokuWikiXMLRPC 1.0 for testing'
script = '/lib/exe/xmlrpc.php'
url = URL + script + '?' + urlencode({'u': USER, 'p': PASSWD})
print(f'{url=}')
xmlrpclib.Transport.user_agent = USER_AGENT
xmlrpclib.SafeTransport.user_agent = USER_AGENT
proxy = xmlrpclib.ServerProxy(url)
v = proxy.dokuwiki.getVersion()
print(v)
It seems to be a problem in the xmlrpc library.
Thanks for reporting this. FWIW, I can run the example from the OP successfully in Python 3.8.7 but get the same 401 you report when testing with Python 3.9.2.
I also found this thread you started. Have you had a chance to investigate further?
Hi! No, I haven't had time for this. I remember quickly starting an XML-RPC server with Python, reaching it with another piece of code found in the Python docs, and noticing that ServerProxy does not seem to have any support for authentication.
I'm interested in finding a solution, so I dug up the piece of code I used to test a basic auth method, and am currently trying to hack on it.
I ended up with the following Python code:
import sys
from xmlrpc import server, client
from urllib.parse import urlencode
PORT = 23456
USER, PASSWD = 'user', 'password'
TARGET = '127.0.0.1:' + str(PORT)
URL = TARGET
def func_add(a, b):
    return a + b

if len(sys.argv) > 1 and sys.argv[1] == 'server':
    serv = server.SimpleXMLRPCServer(('127.0.0.1', PORT))
    print('Listening on', TARGET, '…')
    print(dir(serv))
    serv.register_function(func_add, 'add')
    serv.serve_forever()
else:
    url = 'http://' + URL + '?' + urlencode({'u': USER, 'p': PASSWD})
    print(url)
    proxy = client.ServerProxy(url)
    # v = proxy.dokuwiki.getVersion()
    # print(v)
    print(proxy, dir(proxy))
    print(proxy.add(2, 3))
Running python p.py server in one terminal, then python p.py in another, I got, client-side:
http://127.0.0.1:23456?u=user&p=password
<ServerProxy for 127.0.0.1:23456/RPC2> […]
5
And, server-side:
Listening on 127.0.0.1:23456 …
[…]
127.0.0.1 - - [27/Mar/2021 18:41:02] "POST /RPC2 HTTP/1.1" 200 -
Both in Python 3.7 and Python 3.9. I will now try to use a Flask app instead of an XML-RPC server, to get access to everything received from the client.
Annnnd… That's a success !
The following code implements either a web server with Flask or an XML-RPC client:
import sys
from xmlrpc import server, client
from urllib.parse import urlencode
from flask import Flask, request
PORT = 23456
USER, PASSWD = 'user', 'password'
URL = '127.0.0.1:' + str(PORT)
if len(sys.argv) > 1 and sys.argv[1] == 'server':
    app = Flask(__name__)

    @app.route('/RPC2', methods=['POST'])
    def login():
        print('REQUEST:', request.args)
        return 'ok'

    app.run(debug=True, host='localhost', port=PORT)
else:
    url = 'http://' + URL + '?' + urlencode({'u': USER, 'p': PASSWD})
    print(url)
    proxy = client.ServerProxy(url)
    print(proxy, dir(proxy))
    print(proxy.add(2, 3))
You can run the client with python3 p.py, and the server with python3 p.py server.
On Debian, with Python 3.7, I got:
aluriak@debianserver:~$ python3.7 p.py server
* Serving Flask app "p" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
* Running on http://localhost:23456/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 249-992-288
REQUEST: ImmutableMultiDict([('u', 'user'), ('p', 'password')])
127.0.0.1 - - [27/Mar/2021 18:31:13] "POST /?u=user&p=password HTTP/1.1" 404 -
On Arch, with Python 3.9, I got:
aluriak@arch❯ python3.9 p.py server
* Serving Flask app "p" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
* Running on http://localhost:23456/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 821-276-100
REQUEST: ImmutableMultiDict([])
127.0.0.1 - - [27/Mar/2021 18:35:45] "POST /RPC2 HTTP/1.1" 200 -
The logs are not the same.
Both systems return the same output for the command pip3 freeze | grep Flask, which is Flask==1.1.2.
Note that the two outputs differ in two ways:
127.0.0.1 - - [27/Mar/2021 18:31:13] "POST /?u=user&p=password HTTP/1.1" 404 -
127.0.0.1 - - [27/Mar/2021 18:35:45] "POST /RPC2 HTTP/1.1" 200 -
The first contains the arguments, and the second contains the /RPC2 path but no arguments. Sounds like there is something wrong with both modules.
I will post this on the Python mailing list. Thanks for giving me a push forward :)
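The same observation can be made without Flask. The following is a minimal sketch, using only the standard library, that starts a throwaway HTTP server on a free local port, sends one XML-RPC call to it, and reports the request path the client actually used. The names PathRecorder and observe_request_path are hypothetical helpers introduced for this illustration; the reported path depends on the Python version running the client, so no expected output is given.

```python
import threading
import xmlrpc.client
from http.server import BaseHTTPRequestHandler, HTTPServer


class PathRecorder(BaseHTTPRequestHandler):
    """Hypothetical helper: records the request path, answers with sum(params)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params, _method = xmlrpc.client.loads(self.rfile.read(length))
        # self.path is the raw request target, query string included if sent.
        self.server.seen_paths.append(self.path)
        body = xmlrpc.client.dumps((sum(params),), methodresponse=True).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def observe_request_path(url_suffix):
    """Call add(2, 3) through ServerProxy; return (path seen by server, result)."""
    server = HTTPServer(("127.0.0.1", 0), PathRecorder)  # port 0: any free port
    server.seen_paths = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d%s" % (server.server_address[1], url_suffix)
        with xmlrpc.client.ServerProxy(url) as proxy:
            result = proxy.add(2, 3)
    finally:
        server.shutdown()
        server.server_close()
    return server.seen_paths[0], result


if __name__ == "__main__":
    for suffix in ("?u=user&p=password", "/?u=user&p=password"):
        print(repr(suffix), "->", observe_request_path(suffix)[0])
```

Running it under each interpreter of interest (for example python3.7 and python3.9) should make the difference in the request path visible side by side.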
Interesting, so it looks like Python 3.9 implicitly appends /RPC2 to the URL but strips the query string. That looks like a pretty egregious bug to me.
According to the spec, /RPC2 is just an example path...
OK, the plot thickens: it seems that /RPC2 is only the default path when no path is given, i.e. if your URL ends with a / the path works as intended. But the query string is discarded!
Previously, the handler was everything after the host part of the URI, but that's no longer the case. That's a serious behavior change!
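For anyone stuck on an affected interpreter, one possible workaround (not something proposed in this thread, and untested against a real DokuWiki) is to re-attach the query string in a custom transport, since Transport.request() receives the handler that ServerProxy computed. The sketch below assumes an https URL, hence SafeTransport; append_query and QueryTransport are hypothetical names introduced for this illustration.

```python
import xmlrpc.client
from urllib.parse import urlencode


def append_query(handler, query):
    """Add the query string unless the handler already carries one."""
    if query and "?" not in handler:
        return handler + "?" + query
    return handler


class QueryTransport(xmlrpc.client.SafeTransport):
    """Hypothetical transport that restores a query string dropped by ServerProxy."""

    def __init__(self, query, **kwargs):
        super().__init__(**kwargs)
        self._query = query

    def request(self, host, handler, request_body, verbose=False):
        handler = append_query(handler, self._query)
        return super().request(host, handler, request_body, verbose)


if __name__ == "__main__":
    # Placeholder credentials and host, as in the original report.
    query = urlencode({"u": "user", "p": "password"})
    proxy = xmlrpc.client.ServerProxy(
        "https://wiki.example.net/lib/exe/xmlrpc.php",
        transport=QueryTransport(query),
    )
    print(proxy)  # no request is sent until a method is called
```

Because append_query leaves a handler that already contains a query untouched, the same code should behave identically on interpreters where the query string is preserved.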
So, this is the relevant Python bug. I'll comment on it, pointing this out.
Well done! Thank you for reporting it.
ChrisA from the Python mailing list was able to reproduce and understand the problem, also pointing out some details about the protocol, and finding the same Python bug you found.
Here is ChrisA's answer:
One point of note is that the request as given actually doesn't have a
slash. I think that's technically wrong, but a lot of systems will
just implicitly add the slash. That, coupled with commit 9c4c45, is
why you're seeing "/RPC2" in there. That distinction vanishes if you
change your client thusly:
url = 'http://' + URL + '/?' + urlencode({'u': USER, 'p': PASSWD})
Actually, it looks like all the changes came in with that commit. The
old way used some internal functions from urllib.parse, and the new
way uses the public function urllib.parse.urlparse(), and there are
some subtle differences. For one thing, the old way would implicitly
readd the missing slash, thus hiding the above issue; the new way
leaves the path empty (thus triggering the "/RPC2" replacement). But
perhaps more significantly, the old way left query parameters in the
"path" portion, where the new way has a separate "query" portion that
is being lost. Here's the relevant BPO:
https://bugs.python.org/issue38038
It seems to have been intended as a pure refactor, so I'd call this a
regression. Fortunately, it's not difficult to fix; but I'm not sure
if there are any other subtle changes.
The regression's already been reported so I'm adding to this issue:
https://bugs.python.org/issue43433
Hopefully that solves the problem!
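The difference described above can be illustrated with urllib.parse.urlparse() directly. This is only a sketch of the two behaviours, not the actual CPython code nor the upstream fix: handler_dropping_query mimics what was observed on Python 3.9 (path only, with "/RPC2" as fallback), while handler_keeping_query shows one way to keep the query attached. Both function names are hypothetical.

```python
from urllib.parse import urlparse

DEFAULT_PATH = "/RPC2"


def handler_dropping_query(uri):
    """Mimic the reported 3.9 behaviour: keep the path, lose the query."""
    parts = urlparse(uri)
    return parts.path or DEFAULT_PATH


def handler_keeping_query(uri):
    """One possible alternative: same default path, but keep the query."""
    parts = urlparse(uri)
    path = parts.path or DEFAULT_PATH
    return path + "?" + parts.query if parts.query else path


if __name__ == "__main__":
    for uri in (
        "http://127.0.0.1:23456?u=user&p=password",   # no slash before '?'
        "http://127.0.0.1:23456/?u=user&p=password",  # explicit '/' path
    ):
        print(uri)
        print("  urlparse path :", repr(urlparse(uri).path))
        print("  urlparse query:", repr(urlparse(uri).query))
        print("  dropping query:", handler_dropping_query(uri))
        print("  keeping query :", handler_keeping_query(uri))
```

With the slash-less URL, urlparse() yields an empty path, which is why the "/RPC2" default kicks in; with the explicit slash the path is "/" and only the query is lost.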
I made a PR accordingly, although it won't fix the problem.
FYI, there's a PR (python/cpython#25045) pending to address this upstream.
And another PR :) python/cpython#25057
I followed that too. That's good news :)
Congrats !
@Aluriak given that this has been fixed in the Python standard library, can we close this?
Hi !
Yes, we can! Thank you for your help :)