Scaling Docs
I already found a lot of hints about how to scale a Feathers application (e.g. in #157) but haven't had any luck yet getting it working with Socket.io (plain REST works smoothly). Am I right that using feathers-sync does not remove the need for sticky sessions (on the load balancer and on the node processes themselves) to make Socket.io's handshake work? My current use case is simply launching the application in cluster mode with pm2, e.g. pm2 start src/index.js -i 4 -f.
What additional steps are required to get Feathers with Socket.io working "seamlessly" across several processes or physical machines? Is there a reliable solution to make session handling more stateless by using a Redis adapter for both Express' and Socket.io's sessions?
feathers-sync does not require sticky sessions. It does assume, however, that once a client has established a websocket connection it will stay connected to the same server for the entire connection. I am not aware of a load balancer that behaves otherwise, and it has been working quite well with hosting providers like Heroku and Modulus. Did you run into any problems setting feathers-sync up as documented and starting the cluster?
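For reference, the documented setup boils down to something like this (a minimal sketch assuming a recent feathers-sync release with a Redis broker; older versions used slightly different options, and the URI is a placeholder):

const feathers = require('@feathersjs/feathers');
const sync = require('feathers-sync');

const app = feathers();

// Every instance configured against the same broker republishes its
// service events to all other instances.
app.configure(sync({
  uri: 'redis://localhost:6379'
}));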
AFAICT sticky sessions (at the application level) are required to support long-polling connections and the general Socket.io handshake when running the application with Node.js' cluster features. If I got this right, feathers-sync lets you emit and receive events across several application instances for clients that are already connected. The Socket.io handshake itself, however, does not appear to be stateless: the handshake tokens/cookies are stored in the local memory of a single process.
Thus running the application process-clustered (e.g. via pm2) without a load balancer with sticky sessions leads to failed handshakes and a lack of long-polling support.
That can be solved with what is described in how to use Socket.io in a cluster though, right? Feathers doesn't do anything that should interfere with that module.
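For completeness, wiring that adapter into a Feathers app looks roughly like this (a sketch assuming the classic socket.io-redis adapter and a local Redis instance; host and port are placeholders):

const feathers = require('@feathersjs/feathers');
const socketio = require('@feathersjs/socketio');
const redisAdapter = require('socket.io-redis');

const app = feathers();

// The configuration callback receives the Socket.io server instance,
// so the Redis adapter can be attached just like in the Socket.io
// cluster documentation.
app.configure(socketio(io => {
  io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));
}));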
Yes, @daffl. However, using sticky-session is not recommended as per https://github.com/primus/primus#can-i-use-cluster
I've ended up setting up a simple haproxy with sticky-session support in front of my processes, and it is now working fine. Here is my simple config:
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets.
    # For more information, see ciphers(1SSL). This list is from:
    # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
    ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS
    ssl-default-bind-options no-sslv3

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000

    # http://blog.silverbucket.net/post/31927044856/3-ways-to-configure-haproxy-for-websockets
    option redispatch

    # very important if you have this server behind cloudflare like me! (check above link)
    option http-server-close

    # https://support.cloudflare.com/hc/en-us/articles/212794707-General-Best-Practices-for-Load-Balancing-with-CloudFlare
    option http-keep-alive

    # https://support.cloudflare.com/hc/en-us/articles/212794707-General-Best-Practices-for-Load-Balancing-with-CloudFlare
    timeout http-keep-alive 300000

    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend private_api
    bind *:3333
    default_backend api

backend api
    timeout server 600s
    balance roundrobin

    # http://blog.haproxy.com/2012/03/29/load-balancing-affinity-persistence-sticky-sessions-what-you-need-to-know/
    # this is socket.io's specific cookie name
    # double check it if you're using any other transport/framework (primus, uws, ws, etc...)
    cookie io prefix nocache

    server server-0 127.0.0.1:3344 check cookie server-0
    server server-1 127.0.0.1:3355 check cookie server-1
I've just tweaked the sample haproxy.cfg file to include what I needed. Mind the commented lines.
Also, if you have been using pm2 to manage your app in cluster mode (to make the most of your machine's CPU cores), make sure you're now launching it as N different processes (N = number of CPU cores), each with its own port, and all in 'fork' mode instead of 'cluster'. Cluster mode is what got me into problems in the first place. For the time being, pm2 does not support sticky sessions at the application level: https://github.com/Unitech/pm2/issues/389
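For illustration, a minimal ecosystem file for that fork-mode setup could look like the following (a sketch; the app names are arbitrary, the ports match the haproxy backends above, and it assumes the app reads its port from process.env.PORT):

// ecosystem.config.js -- one fork-mode process per core, each on its own port
// so haproxy can pin sticky sessions to a specific backend
module.exports = {
  apps: [
    { name: 'api-0', script: 'src/index.js', exec_mode: 'fork', env: { PORT: 3344 } },
    { name: 'api-1', script: 'src/index.js', exec_mode: 'fork', env: { PORT: 3355 } }
  ]
};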
Hoping someone finds this helpful...
Using fork mode is not always desirable. If used properly, cluster mode will create the same number of processes as CPU cores, thus maximizing resource utilization. This does not happen in fork mode!
So cluster mode is the option to go for in terms of performance and the recommended configuration for production.
Please have a look at: https://github.com/Unitech/pm2/issues/1510
Some people mention changing the port of each process created by pm2 cluster mode, which is wrong; we should not change the ports. pm2 is able to share the same port across multiple processes (thanks to Node.js' cluster module).
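For comparison, cluster mode starts every worker from the same entry point and shares a single port between them (a sketch; the process name is arbitrary):

# one worker per CPU core, all accepting connections on the same port
pm2 start src/index.js -i max --name api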
I'm still having issues with pm2 cluster mode and feathers-sync.
@averri I have solved the PM2 + Feathers combination issue by disabling long-polling as a fallback when using socket.io.
Documented here: https://docs.feathersjs.com/cookbook/general/scaling.html#cluster-configuration
The issue is that even if you can somehow configure IP-based sticky sessions in the pm2 cluster, it would only work if you are directly receiving HTTP requests on your server.
As soon as you put your server behind a load balancer (e.g. in an autoscaling setup), every request reaching your servers appears to come from the load balancer's IP.
So the only solution for now is to disable long-polling as the feathers docs suggest.
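For reference, the relevant part of that cookbook page is restricting Socket.io to the websocket transport (a sketch; the rest of the app setup is omitted):

const feathers = require('@feathersjs/feathers');
const socketio = require('@feathersjs/socketio');

const app = feathers();

// No HTTP long-polling fallback, so no handshake has to hit the same
// process twice and sticky sessions are not needed.
app.configure(socketio({
  transports: ['websocket']
}));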
Does a sticky session solution (say, running in pm2 cluster mode) account for a single user being authenticated from multiple workstations? For instance, a user logs in and is added to a channel users/{userId}, but is there any load balancing solution that could make sure all connections for that user id are assigned to the same node? (My assumption is that there is not.) Assuming not, your events would only work reliably on the workstation you initially logged in with, and may or may not work on other workstations depending on whether they happen to be connected to the same node.
Assuming the above is all working as I presume, does the Redis implementation of feathers-sync successfully address this concern?
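For reference, the per-user channel I'm describing is essentially the standard Feathers channels setup (an illustrative sketch assuming the authenticated user id is available as connection.user._id); my understanding is that feathers-sync then republishes every service event to all instances, so each instance can publish to its own connections in that channel:

// channels.js -- join every authenticated connection to a per-user channel
module.exports = function (app) {
  app.on('login', (authResult, { connection }) => {
    if (connection) {
      app.channel(`users/${connection.user._id}`).join(connection);
    }
  });
};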
TL;DR we run a very reactive app where there is an expectation that all changes are persisted to (n) environments in near realtime. We also run a high number of concurrent connections so the ability to scale the feathers server is highly desirable. Looking to test redis implementation soon.
UPDATE: Just implemented feathers-sync on a single dual-core Debian instance. My pm2 cluster runs 2 instances (1 per core) and I've verified that multiple tabs of my site connect to each of the 2 running instances. Events all work as I would expect from a single-instance setup of Feathers/Socket.io. So far no gotchas, and it was very easy to set up with a local Redis instance.