libwebsock

Complete re-write underway

Open JonnyWhatshisface opened this issue 9 years ago • 6 comments

I started the initiative quite some time ago to completely rewrite libwebsock after Payden's passing. An unexpected international move set me back a bit, but I'm finally settled in across the globe and back in production. The majority of the libwebsock core has been entirely rewritten, and progress continues daily. I'm working on it a few hours per night and have gotten the bulk of it done. I'm anticipating a couple more weeks before a full-on beta of 2.0.0 is available for hammering.

Let me be clear: The rewrite was to add more functionality and clean up the code base to make it "production worthy." Some of the enhancements, and their status, are as follows:

  • Cleaner code base, more organized - One single header is now installed in /usr/local/include (websock.h), making only the libwebsock APIs available to the user. This reduces the overall footprint of the library, and also reduces the amount of crud loaded into the user code. [ DONE ]
  • libwebsock now has a scheduler - That's right, folks. The original libwebsock approach of spawning a thread per connection and a thread per message would not scale very well. Imagine 200k users connected to the current version of libwebsock, all sending multiple messages... That would mean a pthread_create call per message sent, which could be a little chaotic on the CPU cycles. The implementation now has three separate schedulers: a connection scheduler to handle incoming connection requests, a dispatcher scheduler to handle dispatching of messages, and a control frame scheduler to handle any incoming control frames. Each scheduler has a user-defined number of workers with a default value of 16 (at least during my testing, as my dev box has sixteen cores available to it). [ DONE ]
  • Each client connection now gets tied to an event base on one of the scheduler agent threads. Each thread has its own event base. [ DONE ]
  • Listenserver context addition - libwebsock now easily allows the user to bind the library to their own socket implementation, making integration of libwebsock into existing projects even simpler. For those who choose, the ability to have libwebsock create a non-blocking port and listen on it is still an option, and has been named a "listen server." [ DONE ]
  • Improved memory management - A GC is in the works for libwebsock to ensure proper memory allocation, tracking and cleanup, allowing future work on libwebsock to reduce the focus on memory management. If a free() is forgotten somewhere, the GC will take care of it. [ In Progress ]
  • Client "tags" - One thing I've seen the community struggle with is how to track sockets to specific clients in a useful manner. Until now, you had to track and associate connections with specific "people," if you will, in user-land. This is still a feasible option, but libwebsock2 introduces a new "tag" on the client structure. libwebsocket did something similar, but didn't include a way to do anything useful with that tag from directly within the library. libwebsock2 introduces the new libwebsock_send_text_by_tag() function, as well as a libwebsock_send_binary_by_tag() function, which allow you to send to the specific session(s) that have the specified tag. Example: libwebsock_send_text_by_tag("jonnywhatshisface", "Private message for Jonny Whatshisface!"); [ DONE ]
  • Memory used per connection greatly reduced - libwebsock pre-2.0 held a LOT of things within the client state context that weren't really required with the way libwebsock2 is laid out. For instance, the SSL state... The truth is that if SSL is enabled at the context level, you aren't going to have some clients using SSL and some not. Either you use it, or you don't. [ DONE ]
  • Geared towards threaded applications - The old libwebsock was really intended for use in single threaded applications. The new libwebsock opens up doors for implementation in threaded apps. The contexts now take assignment of the lws_listenserver struct holding the fd and port number of the listen server, be it yours or the built-in one of libwebsock. Please note that the introduction of this new structure serves multiple purposes, one of which is the new type (lws_sockfd) which eliminates issues with pointer truncation on WIN32 platforms. (did I mention more support for WIN32 is under way?) [ DONE ]
  • Non-deprecated OpenSSL code - The old version worked fine with SSL, but I'm not a fan of seeing warnings when I compile anything. The OpenSSL implementation caused an endless stream of complaints from my GCC due to deprecated functions. This is no more. [ In Progress ]
  • Support for Sec-WebSocket-Protocol - Another user added this and submitted a pull request. His addition made it into libwebsock2, and he has been given credit for it. [ DONE ]
  • URI support - Yet another contribution from another user's fork, submitted as a PR. Again, credit was given where credit is due. [ DONE ]
  • Examples - The library is only as good as the examples provided with it for its use. The new release will come with multiple examples, showing usage in the form of chats, instant messaging, file uploads and even use in multi-threaded applications. [ In Progress ]
  • MAN pages! - Yep, there are now libwebsock man pages. They look pretty, too. They outline all the goodies about libwebsock, and all the API calls anyone will ever need. [ In Progress ]
  • Documentation, documentation and more documentation - It's turning into a real project now. I really wish Payden could be here to see it, but his massive contribution to the initial development is growing into something great. [ In Progress ]
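
To make the client "tags" item above more concrete, here is a minimal sketch of how a send-by-tag lookup could work. It is not the libwebsock2 implementation: the `demo_client` struct, `demo_send_text()` and `demo_send_text_by_tag()` are hypothetical names invented for illustration, and the "send" only records the message instead of writing a WebSocket frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical client record; the real libwebsock2 client state is not shown in this thread. */
typedef struct demo_client {
    int fd;                 /* socket descriptor (illustrative only) */
    const char *tag;        /* user-assigned tag, or NULL if untagged */
    const char *last_text;  /* stand-in for "data sent": remembers the last message */
} demo_client;

/* Stand-in for an actual WebSocket text-frame write. */
static void demo_send_text(demo_client *c, const char *msg)
{
    c->last_text = msg;
}

/* Send msg to every client whose tag matches; returns how many sessions matched. */
static size_t demo_send_text_by_tag(demo_client *clients, size_t n,
                                    const char *tag, const char *msg)
{
    size_t sent = 0;

    assert(clients != NULL && tag != NULL && msg != NULL);
    for (size_t i = 0; i < n; i++) {
        if (clients[i].tag != NULL && strcmp(clients[i].tag, tag) == 0) {
            demo_send_text(&clients[i], msg);
            sent++;
        }
    }
    return sent;
}
```

Because several sessions may share one tag (the same person on two devices, say), the sketch walks the whole client list rather than stopping at the first match. A linear scan is the simplest choice; a hash table keyed by tag would be a natural swap for large client counts.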

Completion Status: Altogether, it's about 65% complete, and I'm anticipating about another three weeks or so until it's ready for beta testing.

JonnyWhatshisface avatar Dec 07 '15 16:12 JonnyWhatshisface

I've started writing prototype code that won't be in use in the initial 2.0.0 release (most likely 2.0.x), but will ship with it. The intention is to pre-allocate a slew of job wrappers and client state contexts to cut down on the number of malloc calls made per connection. I essentially used the VM that's being developed for the GC to allocate multiple stacks: one for data to be GC'd, one for client state wrappers and one for job wrappers to be sent to the scheduler agents. I retrieve them with pops from the virtual stack and push them back on once finished with them. If a wrapper is requested and none are available, 50 more are allocated each time. Those are then cleaned up by the GC after a period of not being used/needed, leaving the original dynamically allocated wrappers intact. I'm still testing this to make sure it's going to be thread safe. That number of 50 is configured with a definition in the header, so it can be changed. I don't intend to make it changeable via the API, just at compile time.
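
The pop/push scheme described above can be sketched as a simple free-list stack that grows in fixed batches. This is only an illustration under stated assumptions: `demo_wrapper`, `demo_pool` and the `DEMO_POOL_GROW` macro are made-up names, the sketch is single-threaded (no locking), and it does not model the GC reclaiming idle wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_POOL_GROW 50   /* compile-time batch size, mirroring the "50 more" above */

/* Hypothetical job wrapper; fields are placeholders. */
typedef struct demo_wrapper {
    struct demo_wrapper *next;  /* free-list link */
    int type;
    void *payload;
} demo_wrapper;

typedef struct demo_pool {
    demo_wrapper *free_list;    /* top of the stack */
    size_t available;           /* wrappers currently on the stack */
    size_t allocated;           /* wrappers allocated in total */
} demo_pool;

/* Allocate one more batch and push it onto the stack. */
static int demo_pool_grow(demo_pool *p)
{
    for (int i = 0; i < DEMO_POOL_GROW; i++) {
        demo_wrapper *w = calloc(1, sizeof *w);
        if (w == NULL)
            return -1;
        w->next = p->free_list;
        p->free_list = w;
        p->available++;
        p->allocated++;
    }
    return 0;
}

/* Pop a wrapper, growing the pool first if it is empty. */
static demo_wrapper *demo_pool_pop(demo_pool *p)
{
    assert(p != NULL);
    if (p->free_list == NULL)
        (void)demo_pool_grow(p);   /* a partial batch is still usable */
    demo_wrapper *w = p->free_list;
    if (w == NULL)
        return NULL;
    p->free_list = w->next;
    w->next = NULL;
    p->available--;
    return w;
}

/* Reset a wrapper and push it back for reuse. */
static void demo_pool_push(demo_pool *p, demo_wrapper *w)
{
    w->type = 0;
    w->payload = NULL;
    w->next = p->free_list;
    p->free_list = w;
    p->available++;
}

/* Free everything on the stack; checked-out wrappers must be pushed back first. */
static void demo_pool_destroy(demo_pool *p)
{
    while (p->free_list != NULL) {
        demo_wrapper *w = p->free_list;
        p->free_list = w->next;
        free(w);
        p->available--;
        p->allocated--;
    }
}
```

A zero-initialized `demo_pool` starts empty, so the first `demo_pool_pop()` triggers the first batch of 50. Making this safe across scheduler threads would need a mutex or a lock-free stack around pop and push, which is exactly the part the post says is still being tested.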

A test run tonight yielded a successful 600k connections without a hiccup. CPU utilisation spiked pretty decently during the connections, but levelled out at next to nothing once everything was connected. I'm still working on writing the test/benchmark code which will be released with the library to output recommended maximum connections for the implementation it's running in.

I'm hoping to have something up relatively soon. I've been hammering on these keys for days and nights now.

JonnyWhatshisface avatar Dec 09 '15 22:12 JonnyWhatshisface

Refactored the bulk of the scheduler code and reduced its footprint significantly. It's interesting how something so complex can begin to look so simple from a code-base perspective. libwebsock2 has, currently, 5 agent types running on the scheduler. One agent is the connection agent, which handles the client connections. The dispatcher agent is used for dispatching messages within libwebsock (messages, frames, connect and disconnect notifications). The control frame agent handles all aspects of control frames, so as not to interrupt the flow of frames going to or from current client sessions. This also makes it possible to inject and respond to control frames in the middle of other things taking place on the protocol, without interrupting the traffic flow to do so. The fifth (and sixth) agents will be a surprise, as I don't want to entirely give away all the secrets before the release.

Some connection testing last night showed promising results. I was able to successfully load up more than 900 connections into libwebsock2 in less than one second. The trick with libevent is not to create an event base per connection, as I've seen other multithreaded examples doing, but instead to have multiple event bases and distribute the connections across them. One event base is used to handle incoming connections, and the connections are then passed off to the scheduler, where they're assigned to whichever event base is most readily available. Each event base is technically capable of thousands of connections. The number of event bases can be customized.
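
As a rough illustration of distributing connections across a fixed set of event bases, the sketch below picks the base with the fewest connections. That reading of "most readily available" is an assumption, and `demo_evbase_slot`, `demo_pick_evbase()` and `demo_assign_connection()` are hypothetical names. To stay free of dependencies the base is an opaque pointer; in a libevent program each slot would hold a `struct event_base *` obtained from `event_base_new()`.

```c
#include <assert.h>
#include <stddef.h>

/* One slot per event base; each would be driven by its own thread. */
typedef struct demo_evbase_slot {
    void *base;           /* placeholder for a struct event_base * */
    size_t connections;   /* connections currently assigned to this base */
} demo_evbase_slot;

/* Return the index of the slot with the fewest connections (lowest index wins ties). */
static size_t demo_pick_evbase(const demo_evbase_slot *slots, size_t n)
{
    size_t best = 0;

    assert(slots != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        if (slots[i].connections < slots[best].connections)
            best = i;
    }
    return best;
}

/* Assign a new connection to the least-loaded slot and record it. */
static size_t demo_assign_connection(demo_evbase_slot *slots, size_t n)
{
    size_t i = demo_pick_evbase(slots, n);
    slots[i].connections++;
    return i;
}
```

With slots holding 2, 0 and 1 connections, successive calls to `demo_assign_connection()` return 1, 1, 2 and then 0 as the load evens out. The count must also be decremented when a connection closes, which is omitted here.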

The scheduler has been written in such a way that when passing a job wrapper to it, you only need to assign the type on the wrapper and the scheduler will determine which agent to send it to. Currently, the wrapper for one job may be reused and passed around to complete multiple jobs before being freed. I still have not implemented the dynamic pre-allocation of wrappers because I've been concentrating on refactoring the scheduler code to clean it up, simplify it and enhance functionality.
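
The "set the type, let the scheduler route it" idea can be sketched with a handler table indexed by job type. Everything here is hypothetical (`demo_job`, `demo_job_type`, the agent functions and `demo_schedule()`), and the handlers run inline; the real scheduler hands jobs to worker threads, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job types, one per agent kind named in the post. */
typedef enum {
    DEMO_JOB_CONNECT = 0,
    DEMO_JOB_DISPATCH,
    DEMO_JOB_CONTROL,
    DEMO_JOB_TYPE_COUNT
} demo_job_type;

typedef struct demo_job {
    demo_job_type type;
    int steps_done;    /* how many agents have handled this wrapper */
} demo_job;

/* An agent returns 1 if it retyped the wrapper for another pass, 0 if the job is finished. */
typedef int (*demo_agent_fn)(demo_job *job);

static int demo_connect_agent(demo_job *job)
{
    job->steps_done++;
    job->type = DEMO_JOB_DISPATCH;   /* reuse the same wrapper for the next stage */
    return 1;
}

static int demo_dispatch_agent(demo_job *job)
{
    job->steps_done++;
    return 0;
}

static int demo_control_agent(demo_job *job)
{
    job->steps_done++;
    return 0;
}

/* Routing table: the job type alone selects the agent. */
static const demo_agent_fn demo_agents[DEMO_JOB_TYPE_COUNT] = {
    [DEMO_JOB_CONNECT]  = demo_connect_agent,
    [DEMO_JOB_DISPATCH] = demo_dispatch_agent,
    [DEMO_JOB_CONTROL]  = demo_control_agent
};

/* Run the job through agents until one reports completion; -1 on an unknown type. */
static int demo_schedule(demo_job *job)
{
    assert(job != NULL);
    for (;;) {
        unsigned t = (unsigned)job->type;
        if (t >= DEMO_JOB_TYPE_COUNT)
            return -1;
        if (!demo_agents[t](job))
            return 0;
    }
}
```

A job submitted as `DEMO_JOB_CONNECT` passes through the connect agent, is retyped, and finishes in the dispatch agent, so one wrapper serves two jobs before it would be freed.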

This is just an update so people realize that I am still working on it, and it is making progress.

JonnyWhatshisface avatar Dec 23 '15 06:12 JonnyWhatshisface

I was hoping to get this out before Christmas, but I hit a few things that I really want in the release that I'm patching in and testing.

I've added the ability to tie the headers sent on the connection to each client state. One may question why, but I'll give you a good reason: I needed it. I'm currently working on a site that's using libwebsock on its back end, and it is partially PHP driven. I need to be able to pull the cookies into my websocket server, and I figured that if I have a need for that, there's no doubt someone else may too. Then I realized others may just want access to the headers, period. So, with each connection, the headers can be stored. This is optional (ctx->keepheaders = 1 to enable, and they'll be in state->headers).
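
For the cookie use case, a handler might pull a single cookie out of the stored headers along these lines. The layout of state->headers is not described in the thread, so this sketch assumes a raw, CRLF-separated header block passed in as a plain string; `demo_find_header()` and `demo_get_cookie()` are hypothetical helpers, not libwebsock2 API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive match of the first n bytes of a against b (ASCII only). */
static int demo_ieq(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Find header `name` in a CRLF-separated block; returns its value and length, or NULL. */
static const char *demo_find_header(const char *headers, const char *name, size_t *len)
{
    size_t nlen = strlen(name);
    const char *line = headers;

    while (*line != '\0') {
        const char *eol = strstr(line, "\r\n");
        if (eol == NULL)
            eol = line + strlen(line);
        if ((size_t)(eol - line) > nlen && demo_ieq(line, name, nlen) && line[nlen] == ':') {
            const char *v = line + nlen + 1;
            while (v < eol && (*v == ' ' || *v == '\t'))
                v++;                       /* skip whitespace after the colon */
            *len = (size_t)(eol - v);
            return v;
        }
        line = (*eol == '\0') ? eol : eol + 2;
    }
    return NULL;
}

/* Copy the value of cookie `name` into out; 0 on success, -1 if missing or too long. */
static int demo_get_cookie(const char *headers, const char *name, char *out, size_t outlen)
{
    size_t vlen = 0, nlen = strlen(name);
    const char *p = demo_find_header(headers, "Cookie", &vlen);

    if (p == NULL)
        return -1;
    const char *end = p + vlen;
    while (p < end) {
        while (p < end && (*p == ' ' || *p == ';'))
            p++;                           /* skip pair separators */
        const char *pair_end = memchr(p, ';', (size_t)(end - p));
        if (pair_end == NULL)
            pair_end = end;
        if ((size_t)(pair_end - p) > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t n = (size_t)(pair_end - p) - nlen - 1;
            if (n + 1 > outlen)
                return -1;
            memcpy(out, p + nlen + 1, n);
            out[n] = '\0';
            return 0;
        }
        p = pair_end;
    }
    return -1;
}
```

Header names are matched case-insensitively and cookie names case-sensitively, which follows HTTP conventions. A PHP-driven site would typically look up its session cookie this way, for example `demo_get_cookie(raw, "PHPSESSID", buf, sizeof buf)`, where `raw` stands for whatever string form the stored headers take.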

I've also added the following as of today:

  • Origin Checking - Can be enabled and configured via the libwebsock context.
  • Sec-WebSocket-Protocol - If more than one is sent, it only grabs the first one and uses it.
  • Socket close detection/cleanup - The old version tried to compensate for this, but it really wasn't complete. I can now detect any and all socket closes and I clean them up appropriately.
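
The first two items in this list can be sketched in a few lines. The code is an illustration, not the libwebsock2 code: `demo_first_protocol()` and `demo_origin_allowed()` are hypothetical helpers, the allow-list is assumed to be a plain array of strings, and the origin comparison is an exact string match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first comma-separated token of a Sec-WebSocket-Protocol value into out.
 * Surrounding spaces are trimmed. Returns 0 on success, -1 if empty or too long. */
static int demo_first_protocol(const char *value, char *out, size_t outlen)
{
    assert(value != NULL && out != NULL);
    while (*value == ' ' || *value == '\t')
        value++;
    size_t n = strcspn(value, ",");        /* length up to the first comma */
    while (n > 0 && (value[n - 1] == ' ' || value[n - 1] == '\t'))
        n--;
    if (n == 0 || n + 1 > outlen)
        return -1;
    memcpy(out, value, n);
    out[n] = '\0';
    return 0;
}

/* Return 1 if origin exactly matches one of the allowed entries, else 0. */
static int demo_origin_allowed(const char *origin, const char *const *allowed, size_t n)
{
    if (origin == NULL)
        return 0;                          /* no Origin header: reject in this sketch */
    for (size_t i = 0; i < n; i++)
        if (strcmp(origin, allowed[i]) == 0)
            return 1;
    return 0;
}
```

Given the header value `chat, superchat`, `demo_first_protocol()` yields `chat`. Rejecting requests that carry no Origin header is a policy choice made for this sketch; non-browser clients often omit it, so a server may prefer to let those through.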

Soon, my friends... Soon.

I'm trying to find someone who might be able to write wrappers for libwebsock in Go, Node and Perl. It'd be nice to have those included in the new /examples.

Please keep in mind that libwebsock2 is a complete rewrite, so a LOT has changed. The old APIs are no good anymore, but the callback method still applies as is. Along with the examples, I'll be releasing an example showing how to use libwebsock in a multi-threaded application to get the most out of the CPU. libwebsock2 itself is multi-threaded, hence the scheduler/agents.

JonnyWhatshisface avatar Dec 25 '15 12:12 JonnyWhatshisface

So, Origin checking is now a feature and working. URI support (GET /whatever) is also under way as we speak. A few important notes:

I found some issues related to libevent. I'm working with the libevent devs to get them sorted. However, I'm also looking at libev. This won't delay the release of libwebsock2, but I may change the libevent back end. libev has full libevent API emulation, so it won't be too much of a change. But be aware that it may happen, and libev may become the new recommendation.

It seems to handle things pretty darn well on my little Core i5, loading up 168 connections in less than a second... I really didn't expect it to move that fast. It's going to need some testing and optimization as time goes on to find the best way for the scheduler to handle new connections, handshakes, etc... Right now, EVERY single aspect is scheduled as an individual job. Every packet coming in... It may do some good to let the event base threads handle the incoming connections, and run with only one thread until each completes? However, as it stands, different cores are picking up the jobs, indicating they were more readily available than even the one that performed the previous job. So it may be pretty optimized now... Only time and testing will tell.

Just a few more days, guys. :)

[lws_worker_agent_function]: Worker type [3] number [1] is executing job...
[lws_add_client_to_evbase]: Assiged evbase [0x7ff7b1c06860] for connection from [127.0.0.1]...
[lws_worker_agent_function]: Worker type [0] number [1] is executing job...
[lws_worker_agent_function]: Worker type [0] number [3] is executing job...
[lws_worker_agent_function]: Job freed!
[lws_handle_accept]: Connect worker [2] handling accept for [127.0.0.1]... FD: 186 - 168 connected clients
[lws_handle_accept]: Event base at [0x7ff7b1c06860] added to job!
[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [0] number [0] is executing job...
[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [0] number [3] is executing job...
[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [0] number [1] is executing job...
[libwebsock_handle_recv]: Received the following headers:

GET ws://localhost/?test
Origin: http://localhost
Connection: Upgrade
Host: localhost:8080
Sec-WebSocket-Key: uRovscZjNol/umbTt5uKmw==
Upgrade: websocket
Sec-WebSocket-Version: 13

[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [2] number [0] is executing job...
[lws_dispatch_message]: Message dispatched successfully...
[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [2] number [1] is executing job...
[lws_dispatch_message]: Message dispatched successfully...
[lws_worker_agent_function]: Job freed!
[lws_worker_agent_function]: Worker type [2] number [2] is executing job...
[lws_dispatch_message]: Message dispatched successfully...
[lws_worker_agent_function]: Job freed!

JonnyWhatshisface avatar Dec 28 '15 14:12 JonnyWhatshisface

Any progress on this? I'm looking for a C/C++ WebSocket implementation and this one looks good so far...

snej avatar Dec 26 '16 21:12 snej

~~Oops, I just noticed the license is GPL. That means I can't use it at all; the rest of my code is Apache licensed, and the library it's part of needs to be useable by non-GPL'd apps. Too bad.~~

I was basing the above on the COPYING file, but hadn't noticed that the header comments in the source code state that the license is LGPL. It's possible I can work with LGPL code, although management might balk. But I'd still appreciate some confirmation that this project is still alive...

snej avatar Dec 26 '16 21:12 snej