txtorcon
config.socks_endpoint fails when having multiple endpoints
Hi,
I have been running some simple examples for performing http requests with txtorcon and treq. Even a simple script has errors:
from twisted.internet.task import react
from twisted.internet.defer import ensureDeferred
from twisted.internet.endpoints import TCP4ClientEndpoint
import treq
import txtorcon


async def main(reactor):
    tor = await txtorcon.connect(
        reactor,
        control_endpoint=TCP4ClientEndpoint(reactor, '127.0.0.1', 9051),
        password_function=lambda: 'mypass',
    )
    state = await tor.create_state()
    circ = await state.build_circuit()
    await circ.when_built()
    print(' path: {}'.format(' -> '.join([r.ip for r in circ.path])))
    config = await tor.get_config()
    resp = await treq.get(
        'http://httpbin.org/ip',
        agent=circ.web_agent(reactor, config.socks_endpoint(reactor)),
        timeout=3,
    )
    data = await resp.text()
    print(data)
    out = await state.close_circuit(circ)


@react
def _main(reactor):
    return ensureDeferred(main(reactor))
But I get a builtins.AttributeError: 'list' object has no attribute 'startswith' error.
I have noticed that the issue comes from config.socks_endpoint(reactor). When that method is called, I have self.SocksPort = _ListWrapper[['unix:/run/tor/socks WorldWritable', '9050']], which is a list of lists. Then, when that method calls _endpoint_from_socksport_line, things get even worse, as we are passing a list instead of a single socket endpoint string.
I have the tor service listening on port 9050 (and 9051 for control), and also on the unix socket /run/tor/socks (and /run/tor/control for control).
A super ugly and nonsensical patch to fix it is to go to the _endpoint_from_socksport_line method and force the socks_config argument (which is a list) to take its second value (which is '9050'). If I try the first value (which is 'unix:/run/tor/socks WorldWritable'), I get an error too.
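A minimal reproduction of that error outside txtorcon, assuming only what the traceback implies: that the helper calls .startswith() on its argument. The function name looks_like_unix_socket is a hypothetical stand-in, not the library's code.

```python
def looks_like_unix_socket(socks_config):
    # Hypothetical stand-in for the check inside the real helper; the
    # traceback only tells us that .startswith() is called on the argument.
    return socks_config.startswith('unix:')


nested = [['unix:/run/tor/socks WorldWritable', '9050']]  # shape from the report
flat = ['unix:/run/tor/socks WorldWritable', '9050']      # a plain list of strings

try:
    looks_like_unix_socket(nested[0])  # the first entry is itself a list
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
print(looks_like_unix_socket(flat[0]))
```

With the nested shape, the first entry is a list, so the string method is missing; with a flat list of strings, the same check works.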
Can you also please include the version of Tor being used?
I have installed latest Tor from their official repository. My OS is Ubuntu 20.04 LTS.
https://2019.www.torproject.org/docs/debian.html.en
Latest Tor version there is 0.4.3.5-1~focal+1 (amd64)
socks_endpoint() does take a port= parameter too (without it, it is supposed to use the first socks-port).
From the code, it looks like unix-sockets should work but I think that "space, and then more options" part in your config will make it sad. (I wonder how paths with spaces in them are handled? Probably quotes)
Also, thanks for the detailed report!
I think refactoring socks_endpoint and _endpoint_from_socksport_line and adding a parser or regular expression match will do the job for my Tor version. The problem is that if I create a fix, I do not think it will work with previous versions...
I can fix it with my suggested approach and try to cover previous versions by running the tests in a container. Do you think more parts will be involved in this issue?
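One possible shape for such a parser, as a sketch rather than the actual txtorcon fix. The name parse_socksport_line and its return format are hypothetical. It assumes a SocksPort value is "target, then optional whitespace-separated flags", and it does not handle quoted paths containing spaces or special values such as auto.

```python
def parse_socksport_line(line):
    """Split one SocksPort value into (kind, address, flags).

    Hypothetical helper: assumes "<target> [flag ...]" separated by
    whitespace, with no quoted paths and a numeric TCP port.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty SocksPort value")
    target, flags = parts[0], parts[1:]

    if target.startswith('unix:'):
        # Everything after the "unix:" prefix is the socket path.
        return ('unix', target[len('unix:'):], flags)

    if ':' in target:
        # "host:port" form; split on the last colon only.
        host, port = target.rsplit(':', 1)
    else:
        # Bare port; assume the loopback address.
        host, port = '127.0.0.1', target
    return ('tcp', (host, int(port)), flags)


print(parse_socksport_line('unix:/run/tor/socks WorldWritable'))
print(parse_socksport_line('9050'))
```

Plain str.split() avoids a regular expression and keeps a bare SocksPort 9050 working, which is the form older configs would use.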
SocksPort 9050 (for example) should still be a valid thing in new (and old) configs, right? What do you think won't work with previous versions?
I haven't had time to try this exact setup yet -- one thing that does look odd is the "double lists" in the ListWrapper you printed .. as far as I can recall, Tor has "always" allowed multiple SocksPorts
If you want to attempt a fix, that would be great! Personally, I try to avoid regular-expressions but also sometimes they are the right option :)
Could you please show me the SocksPort value in the socks_endpoint method? I got _ListWrapper[['unix:/run/tor/socks WorldWritable', '9050']], but it would be great to know what should be there.
It's kind of just doing "dumb matching", so if you did config.socks_endpoint(..., port=9050) then it should find the existing SocksPort 9050 line (https://github.com/meejah/txtorcon/blob/main/txtorcon/torconfig.py#L629)
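The "dumb matching" can be pictured roughly like this. It is a sketch based only on the description above, not a copy of the linked code; find_socks_line is a hypothetical name, and it assumes the configured values form a flat list of strings.

```python
def find_socks_line(socks_ports, port=None):
    """Pick a SocksPort value: the first one, or the one matching `port`.

    Hypothetical helper; `socks_ports` is assumed to be a flat list of
    strings such as ['unix:/run/tor/socks WorldWritable', '9050'].
    """
    if not socks_ports:
        raise RuntimeError("no SocksPort configured")
    if port is None:
        return socks_ports[0]

    wanted = str(port)  # allow an int to be passed in
    for line in socks_ports:
        # Compare only the first token, so trailing flags are ignored.
        if line.split()[:1] == [wanted]:
            return line
    return None  # nothing matched


lines = ['unix:/run/tor/socks WorldWritable', '9050']
print(find_socks_line(lines, port=9050))
print(find_socks_line(lines))
```

Under these assumptions, passing port=9050 skips the unix socket entry and selects the plain TCP port.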
The _ListWrapper thing is a little weird; it's so that TorConfig can present a "synchronous" / attribute interface but can intercept new values (so if you .append() a thing, it can know to write that value out later).
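The interception idea can be illustrated with a small list subclass. This is a sketch of the general technique only; NotifyingList and the pending dict are hypothetical names and do not reflect how _ListWrapper is actually implemented.

```python
class NotifyingList(list):
    """A list that reports each append() to a callback (illustrative only)."""

    def __init__(self, values, on_change):
        super().__init__(values)
        self._on_change = on_change

    def append(self, item):
        super().append(item)
        # Hand the owner a plain copy of the new contents so it can
        # remember to write them out later.
        self._on_change(list(self))


pending = {}  # values waiting to be saved


def remember(values):
    pending['SocksPort'] = values


ports = NotifyingList(['9050'], remember)
ports.append('unix:/run/tor/socks WorldWritable')
print(pending)
```

Only append() is intercepted in this sketch; other mutating methods such as extend() or item assignment would need the same treatment.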
With the benefit of hindsight, that whole API might not be the best approach especially for an async thing .. but at the time I was thinking to save people from yield config.get_whatever() type of calls. With await being usable in Twisted now, that's less of a concern. But, it's the API I get to support now :)
So, I think the value should be _ListWrapper['unix:... WorldWritable', '9050'] .. that is, a single list with two strings (what you've got looks like a single list with a single entry: another list).