Working nicely with other implementations.

Open meh opened this issue 12 years ago • 11 comments

Hello, I sent you an email, but I'm not sure you're gonna read it, so I'm opening an issue too. At least here the discussion will be public and other parties will be able to join in.

I'm the author of ruby-torchat and I'm trying to get other implementors to work together nicely by moving to an extension driven protocol and documenting/designing extensions together.

Right now I'm only in contact with the jTorchat folks, and without getting in touch with you too everything would be useless: you could end up adding parts to the protocol that we already implemented as extensions with our own design, and we'd end up having to do ugly stuff to stay compatible.

I documented the standard protocol and two extensions and you can find the docs here.

You don't really have to change anything in your protocol implementation right now to fit with that stuff. All I'm asking is that you don't add new packets to the protocol natively but do it as extensions instead, and maybe get in touch with the other implementors to see if we have already designed something or are already working on it.

meh avatar Apr 25 '12 09:04 meh

I'm not going to add any new protocol messages any time soon because now I am working on a rewrite in Object Pascal and I will make it 100% compatible with the current Python implementation, not adding any new protocol features and also not removing anything.

I also have not yet decided how the groupchat protocol should work and whether there should be separate types of groupchat. From quickly looking at your specifications (not studying them thoroughly) it seems there is now some kind of invite-only group chat with users already on the buddy list. This seems ok to me.

I also had something in mind (this was at the time I invented the "add_me" message): that one can run an anonymous open channel (like IRC) for anybody, even people who are not on the buddy list. This is why I invented the add_me message, so that one can run an open "channel" and people can connect without appearing on the buddy list; instead of sending "add_me" they would send some kind of join message. This would go along with some kind of URL scheme to describe the names of channels, for instance welcome@trtzutzutztrrtru, so one could join the room "welcome" on trtzutzutztrrtru's server without first having to add trtzutzutztrrtru as a buddy. The client would then try to connect and join and report an error if it fails, but never attempt to add the buddy.
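A minimal sketch of how a client might split such a channel name, assuming the room@address form from the example above. The names ChannelRef and parse_channel_ref are hypothetical and not part of any TorChat implementation; no join message is defined here.

```python
from typing import NamedTuple


class ChannelRef(NamedTuple):
    room: str      # e.g. "welcome"
    address: str   # name of the peer hosting the room


def parse_channel_ref(ref: str) -> ChannelRef:
    """Split 'room@address' into its two parts (assumed format)."""
    room, sep, address = ref.partition("@")
    # Reject references with a missing part or more than one '@'.
    if not sep or not room or not address or "@" in address:
        raise ValueError(f"not a room@address reference: {ref!r}")
    return ChannelRef(room, address)


print(parse_channel_ref("welcome@trtzutzutztrrtru"))
```

The client would connect to the address part and ask to join the room part, without ever touching the buddy list.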

The existence of protocol extensions should be probed by sending the packet that would normally initiate the feature and then reacting to the not_implemented message, I would not suggest inventing a lot of these supports_xxx messages (although I don't have a compelling reason for not doing it, except for trying to keep it simple). I can vaguely remember having a discussion about this a year ago and for some special corner cases (can't remember exactly what it was) we found that it would be better and also generally be more elegant to always just probe things and react to not_implemented but I can't remember what the exact reason was.
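The probing idea can be sketched roughly as follows. This assumes the peer's first reply answers the probe, and the stand-in peer's reply format ("not_implemented" followed by the command name) is an assumption made only for the demonstration; peer_supports, send and receive are hypothetical names.

```python
from typing import Callable


def peer_supports(send: Callable[[str], None],
                  receive: Callable[[], str],
                  initiating_packet: str) -> bool:
    """Send the packet that starts a feature and report whether the peer accepted it."""
    send(initiating_packet)
    reply = receive()
    # Only the command name of the reply is inspected.
    return reply.split(" ", 1)[0] != "not_implemented"


# A stand-in peer that knows no extensions at all (purely for illustration).
outbox = []
old_peer_reply = lambda: "not_implemented " + outbox[-1].split(" ", 1)[0]

print(peer_supports(outbox.append, old_peer_reply, "groupchat_invite some-id"))
```

A real client would also have to cope with unrelated messages arriving before the answer, which this sketch ignores.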

At the moment I am working on a pidgin plugin, which will be the first result to expect from the rewrite (you can find it in the torchat2 branch, but it's still not much more than an empty skeleton). This would also allow such things as using pidgin's OTR features. A native OTR solution for TorChat (should it ever be implemented) should therefore be compatible with the way a pidgin-created OTR message would be formatted (without extending the torchat protocol).

prof7bit avatar Apr 25 '12 12:04 prof7bit

There is only one supports packet, and it simply contains a space-separated list of extension names. For instance, supports groupchat typing tells the other end we support those two extensions. This packet needs to be part of the standard protocol and is sent after client and version.

Then the extension packets just start with the extension name, like groupchat_message.
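As a rough sketch of the receiving side, assuming the packet arrives as one line of text: parse_supports and extension_of are hypothetical helper names, and the underscore after the extension name is inferred from the groupchat_message example.

```python
from typing import Optional, Set


def parse_supports(line: str) -> Set[str]:
    """Return the extension names announced in a 'supports ...' line."""
    command, *extensions = line.split()
    if command != "supports":
        raise ValueError(f"not a supports packet: {line!r}")
    return set(extensions)


def extension_of(packet_name: str, supported: Set[str]) -> Optional[str]:
    """Find which announced extension a packet name belongs to, if any."""
    for name in supported:
        if packet_name == name or packet_name.startswith(name + "_"):
            return name
    return None


supported = parse_supports("supports groupchat typing")
print(extension_of("groupchat_message", supported))
```

A packet whose name matches no announced extension would simply not be sent to that peer.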

Reacting to not_implemented doesn't look like a good idea to me, it's clunky and really bad design.

Temporary buddies are indeed needed, in fact I implemented them properly in my implementation, and it has to be kept as it is.

About those rooms that anyone can get into: as I proposed to the jTorchat folks, there will be extensions to the groupchat extension. For instance, they care about implementing a redundancy protocol to keep groupchats alive and the like.

Groupchats are designed to have modes. Right now the modes of the room are sent when inviting someone; for example, groupchat_invite id noredundancy topic would tell the invited peer that the groupchat doesn't have redundancy and that topics are supported.
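A minimal sketch of reading such an invite, assuming the fields are separated by spaces as in the example. parse_groupchat_invite is a hypothetical name, and the id used below is only borrowed from the groupchat_participants example elsewhere in this thread.

```python
from typing import Set, Tuple


def parse_groupchat_invite(line: str) -> Tuple[str, Set[str]]:
    """Split 'groupchat_invite <id> [mode ...]' into the room id and its modes."""
    command, room_id, *modes = line.split()
    if command != "groupchat_invite":
        raise ValueError(f"not a groupchat_invite packet: {line!r}")
    return room_id, set(modes)


room_id, modes = parse_groupchat_invite(
    "groupchat_invite 6f012391-883d-4f4f-8d54-52c4227b3ac9 noredundancy topic"
)
print(room_id, sorted(modes))
```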

I saw you were working on the pidgin plugin, I'm trying to push torchat everywhere I can, writing the bitlbee plugin and I'm gonna write a plugin for Instantbird, there's also a friend of mine that's gonna work on an empathy plugin. With my implementation you could also easily write bots.

As I said before, all I care about right now is working together on protocol changes and extensions and documenting everything about them, and also avoiding reinventing the wheel and becoming incompatible with each other. If you don't have plans to add or change anything, for me it's all good, and I suppose the same goes for the jTorchat guys.

Hope torchat will become really popular.

meh avatar Apr 25 '12 12:04 meh

Ok, the supports_* messages are not such a big problem, it's basically ok. There was only one theoretical corner case, in one crazy scenario that I discussed with someone but never implemented and can't remember correctly anymore, where not_implemented would have been the more robust solution.

But I have found a much bigger problem: In one of the specifications I have seen messages with ? and ! at the end. This is totally impossible. In both of my implementations the message names are directly translated into class names and only ascii letters and numbers would be allowed. There is really no reason to allow these special characters in the protocol messages.

I would have to write some ugly and totally unnecessary code to encode/decode them to and from the allowed character set, only to save maybe 3 bytes in the message name. I'm not going to implement these message names. The protocol should clearly specify the allowed characters for message names, for example like this: [a..z] or _
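To illustrate the class-name point, here is a hedged sketch in Python. The Msg_ prefix and all class names are hypothetical and not taken from either implementation; the lookup just mirrors the idea of mapping a message name straight onto a class name.

```python
class Message:
    """Base class; one subclass per protocol message (naming is hypothetical)."""

    def __init__(self, payload: str = "") -> None:
        self.payload = payload


class Msg_add_me(Message):
    pass


class Msg_groupchat_participants(Message):
    pass


def message_class_for(name: str):
    """Return the class named 'Msg_<name>', or None if there is no such class."""
    wanted = "Msg_" + name
    for cls in Message.__subclasses__():
        if cls.__name__ == wanted:
            return cls
    return None


print(message_class_for("add_me") is Msg_add_me)
# A name ending in "?" can never be a Python identifier, so no class can match it.
print(("Msg_" + "groupchat_participants?").isidentifier())
```

With this kind of direct mapping, a message name containing ? or ! has no class it could resolve to unless an extra translation step is added.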

prof7bit avatar Apr 25 '12 19:04 prof7bit

The usage of ? and ! is not for saving space but for making the packet name describe what the packet does.

If you have better names I'll change them.

meh avatar Apr 25 '12 19:04 meh

for example this:

groupchat_participants? 6f012391-883d-4f4f-8d54-52c4227b3ac9
< groupchat_participants 6f012391-883d-4f4f-8d54-52c4227b3ac9 bgboqr35plm637wp

I would suggest:

groupchat_participants_query 6f012391-883d-4f4f-8d54-52c4227b3ac9
< groupchat_participants 6f012391-883d-4f4f-8d54-52c4227b3ac9 bgboqr35plm637wp

And for the exclamation mark in the other example I would just leave it out; a command is implicitly an exclamation anyways in some abstract sense (if it is not explicitly a query).

prof7bit avatar Apr 25 '12 20:04 prof7bit

Well, wouldn't that just be a simple replace ? with _query and replace ! with _bang or something?

I think ? and ! make the protocol more readable, and that's fairly important, on your side it's a super simple thing to do and in languages that support those characters it makes everything more readable.

You already have to do weird replacements to encode/decode the data, so I don't see why encoding/decoding the packet name would cause many problems. The logic is already there, just add another encode/decode.

meh avatar Apr 25 '12 20:04 meh

I'm not going to use any of these weird characters in the protocol messages because there is no need for it, absolutely no need. And it's ugly and would destroy simplicity. Also my binary encoding/decoding is not "weird", it is extremely simple and straightforward. If you find it weird then please show me an even simpler encoding scheme with less overhead that is not "weird".

I have updated the source comments in tc_client.py to explicitly state that protocol messages may only consist of lowercase ASCII letters and underscore. It's the same set of characters that is allowed for identifier names in almost all programming languages. I did this for a reason. It is the same reason that keeps me from wanting to create file names with weird characters, path names with spaces, function or variable names with question marks, etc.
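The rule is easy to check mechanically. A minimal sketch, assuming exactly the character set stated above (lowercase ASCII letters and underscore); is_valid_message_name is a hypothetical helper and not something found in tc_client.py.

```python
import re

# Lowercase ASCII letters and underscore only, at least one character.
MESSAGE_NAME = re.compile(r"[a-z_]+")


def is_valid_message_name(name: str) -> bool:
    """Check a protocol message name against the allowed character set."""
    return MESSAGE_NAME.fullmatch(name) is not None


for candidate in ("add_me", "groupchat_participants_query", "groupchat_participants?"):
    print(candidate, is_valid_message_name(candidate))
```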

prof7bit avatar Apr 26 '12 01:04 prof7bit

> I'm not going to use any of these weird characters in the protocol messages because there is no need for it, absolutely no need.

The need is making the packet intention clear, there is no need for making the packets ugly when it's just a simple replace.

>>> "groupchat_participants?".replace("?", "_query").replace("!", "_bang")
'groupchat_participants_query'
>>> "groupchat_participanting!".replace("?", "_query").replace("!", "_bang")
'groupchat_participanting_bang'

Is it that much of a problem? Please think about it, clear packet names are important.

> Also my binary encoding/decoding is not "weird", it is extremely simple and straightforward. If you find it weird then please show me an even simpler encoding scheme with less overhead that is not "weird".

JSON, BSON, a text and binary based protocol. BSON would be the best choice for a protocol that needs to send binary data.

meh avatar Apr 26 '12 10:04 meh

Why are you insisting on the "?" and "!"?

To illustrate my point, let us first talk about the naming of functions in programming languages. Consider the following function in Python:

def groupchat_get_participants():

or maybe even more obvious something like this

class groupchat:
  def get_participants():

You are suggesting that something like this:

  def participants?():

would be more readable and understandable than

  def get_participants():

This wouldn't even compile. And nobody would ever even try to do this; even if some exotic language allowed it, nobody would make use of the "?" to abbreviate a "get" in a function name. The same reasoning applies to protocol message commands. Somebody at a later time might want to translate them into Objective-C or Objective-Pascal message names, XML tags, or HTTP REST URIs (maybe just something as simple as a wiki page documenting the protocol), or somebody might want to search Google for documentation or an error message regarding a command name, etc. I don't understand why we are still discussing something as trivial and obvious as this. Maybe it really needs 25 years of programming experience to be able to intuitively understand immediately what would really be a bad idea and what is destined to create problems later, I don't know...

"groupchat_get_participants" or "groupchat_query_participants" or maybe even "groupchat_participants_query" is by no means less readable or less intuitive than "groupchat_participants?". As far as I am concerned even the opposite of what you claim is the case: I find "groupchat_participants?" even less clear, especially in the presence of other commands that don't have the "?" or "!" attached to them.

And the "!" makes even less sense, because everything that is not a "?" is automatically and implicitly a "!" anyways. For example, what about all the other already existing commands that are not questions: should I now break the existing protocol and make all commands that do not query anything end with "!" to make it consistent? Or should the protocol only have some commands with "!" and others not? This would be even more confusing.

Why should I introduce an incompatible change in the protocol without any added functionality? Or should I (and all other implementers) now implement two versions of every "!" message, one without "!" for backwards compatibility and one with "!" to make it look consistent with your proposal for the new messages? What benefit, what added functionality would this give? Exactly none, only confusion and additional code for nothing!

These additional characters do not allow any additional functionality, so there is no need to introduce additional confusing code to support them for no reason! ASCII letters are enough to express any possible command in plain English. Apply the KISS principle!

If "groupchat_participants_query" is not understandable enough for the average programmer then I'm going to make a new proposal (you promised earlier to change yours based on my suggestions, so here is my new suggestion): how about these two messages:

groupchat_get_participants
groupchat_participants

And now let's finally stop talking about these question marks, they are not up for discussion. I would care more about how the messages should behave and interact, their structure, when to send them and how to react. This is more important. The names will be short English words or phrases that consist of [a..z,_], which can clearly express any meaning in the same way as if one would try to find a clear and expressive name for a variable, a method or a class. Semantically these commands are (remote) procedure calls anyways, so there is nothing wrong with applying the same naming conventions as one would for procedures in a programming language.

Regarding BSON: It's much more difficult to parse and it relies on low-level machine-dependent data structures, endianness, etc. The TorChat protocol was designed to be easily parsed with only a handful of code lines even in the most simple scripting languages, and it has the additional benefit (for debugging purposes) of being human readable (and even writable) in a telnet session on a terminal. It's too late to make such deep and incompatible changes anyways, and also (just like with the question marks above) I do not see any advantage over the existing implementation.

If it ain't broke, don't fix it!

We are talking about changes in an established protocol that has been in widespread use for 5 years now, changes that will become equally widespread. With this there comes responsibility. I am willing to carefully implement new features without breaking anything, but I am absolutely not willing to introduce bloat and confusion and ugliness and inconsistency into this beautifully simple and elegant protocol and its reference implementation where there is absolutely no need to do so.

I am not going to break it, I'm not going to deface and destroy my own creation! This is my last word on this!

And now let's move on to something more constructive. Let's actually design new protocol features and not argue for many days over the stupid spelling of a stupid message command name. [a..z,_] is plenty of characters to choose a meaningful name from once the general design and meaning of the new messages is ironed out.

prof7bit avatar Apr 28 '12 10:04 prof7bit

Fine, I changed them.

meh avatar Apr 28 '12 10:04 meh

Speaking of other TorChat implementations... I guess this is a good time to chime in. I wanted to create p2p, encrypted and anonymous messaging based on hidden services on top of Tor and then I found that TorChat already does exactly this. So I started studying and documenting the implementation of TorChat as there is no official spec, here: https://www.meebey.net/research/torchat_protocol/ and that will be used as reference for the implementation in Smuxi 0. The implementation is currently still WIP but the git branch can be found here: https://github.com/meebey/smuxi/tree/experiments/torchat

meebey avatar Jan 02 '14 14:01 meebey