irc
irc copied to clipboard
Asyncio Client improvements
After thoroughly examining the module, I found multiple improvements could/should be done to asynchronous code. I'd like to open a discussion before actually implementing the code because some changes will break the current API.
- [ ] (breaks API)
AioSimpleIRCClient
currently cannot be created and started once event loop has started and running asynchronous code, becauseAioConnection.connect
callsloop.run_until_complete
. In the situation described above, aRuntimeError: This event loop is already running
exception is raised.AioConnection.connect
should be a coroutine. Waiting for it's finishing should be caller's responsibility. Changing this function to a coroutine definitely breaks the current API. - [ ] Connection reset by peer should be handled.
- [x] SSL should be supported. Current async code does not support ssl.
connection.Factory
should be adapted to async code as well. - [ ] Circular reference between
AioConnection
and its underlyingprotocol
should be removed. - [ ] Async version of
bot
should be added. - [ ] Debug usage of
print
should be removed.print
is used 3 times inclient_aio.py
. I assume the developer forgot to remove them. - [ ] Make the use of custom event loops easier. If one wants to use
AioSimpleIRCClient
with a custom event loop, subclassing bothAioReactor
andAioSimpleIRCClient
is required. It is supposed to be as simple as callingAioSimpleIRCClient(loop=loop)
- [ ]
loop
should not be accepted as a positional argument. For example,AioReactor.__init__
should be declared as
def __init__(self, on_connect=__do_nothing, on_disconnect=__do_nothing, *, loop=None)
- [ ]
asyncio.Lock
should be used overthreading.Rlock
. - [ ] Async throttler
- [ ] Required version check
- [ ] Better tests.
Those all seem like reasonable ideas. The AsyncIO support is fairly new, so as long as @MattBroach is comfortable with the changes, I'm happy to accept them (even if backward-breaking).
Sounds great. I'll go ahead and start the implementation. My own project needs it anyway :D
Do you mind if I also clean some code for the non-async part along the way if I happen to find any? I won't make big changes. If so I'd make a new PR. E.g. in connection.py
, identity
function could be replaced by lambda x: x
.
Just popping in to say this all sounds great to me. As @jaraco sad, the asyncio stuff was just added, so now is the best time for breaking changes. Thanks for the thorough review.
Guys, I start to feel that it's hard to maintain two versions in a single module. A lot of non-async code depends on raw sockets where async code uses protocol and transport extensively. Plus that non-async code needs to support both py2 and py3, which makes it harder to separate pieces and bits to implement py3-only code. I think it's easier to work on a fork and make another package that only supports async irc code and release separately. What do you think? I am willing to maintain the fork.
Do you mind if I also clean some code for the non-async part along the way if I happen to find any?
In general, this sort of contribution should be handled as a separate PR to be accepted first. You can still implement your asyncio improvements on top of these incidental changes, but then they'll be dependent on the acceptance of those changes and might require resolving of merge conflicts if not accepted exactly as presented. But this project is happy to accept all sorts of changes as long as the concerns can be kept separate. If the changes are small and non-controversial, it's fine to implement those in the same PR, just use separate commits (same guidance, just more lightweight).
E.g. in connection.py, identity function could be replaced by lambda x: x.
You'll find the linter won't accept identity = lambda x: x
(although that would have been my stylistic preference as well). You could move the change inline (i.e. wrapper=lambda x: x
), but that conveys less information.
Honestly, if I were you, I'd try to look past these stylistic changes and focus on the semantic or structural changes. But again, happy to review any proposed change.
I think it's easier to work on a fork and make another package that only supports async irc code and release separately. What do you think?
A fork, especially in a different package/module, is just asking for trouble. It means creating two lines of development on a project that already has very little development.
I do see the appeal - that maintaining simple socket compatibility is hard - hence why we have something of a fork already with two implementations. I'd rather not increase the divergence with yet another package. It seems reasonable at first until we get the next bug report that affects the common parts of the two libraries - then we have to expend extra effort to ensure that both implementations get that functionality (or otherwise manage the increasing cognitive distance between the two projects).
Ideally, we could refactor the code in such a way that any event loop / IO handler could be chosen, and the rest of the functionality lies atop that, although I suspect that's easier said than done (if possible).
If a proper loop abstraction isn't possible or achievable, I'd be more inclined to drop support for Python 2 and a select loop, and rely entirely on async IO and Python 3 for the library (a backward-incompatible release).
So in that sense, I do support a fork, one in which the select-based functionality is left behind in a maintenance branch that's unlikely to get backports of functionality from the master. If you can bring the asyncio functionality to feature parity with the select loop, then I'm happy to accept that as the sole event loop for the library.
To the extent that you can, try to implement the changes with many minimal steps with stable commits.
Thanks!
How about this, I'll put all async code in a subfolder, i.e. irc/aio/
which has a similar directory structure as irc
and start to refactor the code, separate shared code into base classes (BaseServerConnection, etc). For things that are hard to split, I'll copy some code from irc
to irc/aio/
and modify it.
Btw some of the dependencies (jaraco/*
) does not support async code as well (e.g. Throttler
). How do you want me to deal with it? Add more things to your modules (which appear not to be open source) or implement it in another file in this particular repo?
Btw, I do think wrapper=lambda x: x
makes things super clear as it tells the reader wrapper is supposed to be a function/method/closure and its default value is returning the socket untouched (with that in mind maybe wrapper=lambda sock: sock
is even better. wrapper=identity
requires me to check what is identity and what it does by searching the name in the file.
I'll put all async code in a subfolder
That seems reasonable. I'd especially like it if there were symmetry with another package that served for non-async mode. For example, there's irc.aio
and irc.select
and each implements equivalent functionality. But that's not a prerequisite.
some dependencies do not support async code
Ugh. Well, I guess that's an unfortunate situation, but not surprising. Probably best would be to make those features compatible with async as well. I'd be happy to accept those improvements as well. I would like them to be contributed back to the canonical project, all of which are open source (if they're not, that's a bug).
I see Throttler
is imported from jaraco.functools, found at GitHub.
If you wish to test with unreleased versions of jaraco.functools, it's easy to:
irc $ .tox/python/bin/pip install -e ../jaraco.functools
Then run tests or exercise the IRC library against the in-development version of jaraco.functools before committing/pushing/pull-requesting.
Btw, I do think
wrapper=lambda x: x
makes things super clear...wrapper=lambda sock: sock
is even better.
Yes, I agree. I think I was planning to move identity
to another library (as I've used that pattern in more than a dozen places). It's the kind of thing that I feel should be in the stdlib, like None
only for functions. But since it's not, it has to be explicit throughout the various implementations.
That said, there is a tension with lambda
. Many Python developers, especially those unfamiliar with functional programming paradigms, find lambda objectionable. I'm perfectly comfortable with it and my sensibilities align with yours here. Just be aware that others may come along later and wish to change it (back).
In any case, sounds like you're on the right track.
Since this discussion began, this project has dropped support for Python 2 and the async implementation has expanded, so progress is being made.