
Zevenet seems to forget sessions after a while (doesn't fit with configured timeouts)

Open mschnetz opened this issue 3 years ago • 7 comments

Hi everyone, I am using Zevenet CE 5.10.1 for a project. It's a pretty basic setup: one LB cluster with multiple farms and two backends per farm. On a few farms (with identical configuration except for the listening port and the backends) I seem to be losing my sessions. Persistence is keyed on an HTTP header. One session usually takes a few minutes, with around 10-15 messages exchanged between clients and servers. Persistence works as long as messages are exchanged regularly, but if there is no communication from the client to the server for about 75s, the next message is sent to the next machine in the round-robin queue. If that is the same machine as before, everything is fine; if it is the other one, the session dies on me.

Farm config is like this: `LogLevel 3

check timeouts:

Timeout 45
ConnTO 20
Alive 10
Client 30
ThreadModel dynamic
Control "/tmp/SLOT5030_proxy.socket"
#DHParams "/usr/local/zevenet/app/zproxy/etc/dh2048.pem"
##ECDHCurve "prime256v1"

#HTTP(S) LISTENERS
ListenHTTP
    Err414 "/usr/local/zevenet/config/SLOT5030_Err414.html"
    Err500 "/usr/local/zevenet/config/SLOT5030_Err500.html"
    Err501 "/usr/local/zevenet/config/SLOT5030_Err501.html"
    Err503 "/usr/local/zevenet/config/SLOT5030_Err503.html"
    Address 10.54.24.222
    Port 5030
    xHTTP 4
    RewriteLocation 1

    #Cert "/usr/local/zevenet/config/zencert.pem"
    #Ciphers "ALL"
    #Disable SSLv3
    #SSLHonorCipherOrder 1
    #ZWACL-INI

    Service "SLOT5030BACKEND"
            ##False##HTTPS-backend##
            #DynScale 1
            #BackendCookie "ZENSESSIONID" "domainname.com" "/" 0
            #HeadRequire "Host: "
            #Url ""
            #Redirect ""
            #StrictTransportSecurity 21600000
            Session
                    Type HEADER
                    TTL 3600
                    ID "BL-Session-ID"
            End
            #BackEnd

            BackEnd
                    Address 10.54.24.31
                    Port 5030
            End

` I thought 75s sounds like 30+45, but changing either the client request timeout (to 300s) or the server response timeout (to 450s, even though I don't believe that should affect my issue at all) changed nothing about the behaviour. Something else I noticed is on the monitoring page: [image]

There are never any sessions found. From some other threads here, I thought my sessions should show up on that page. However, if persistence didn't work at all, I would expect nearly all my sessions to fail at some point. So now I'm a bit confused by the whole thing. Some more information: the second node of the cluster is turned off at the moment for my testing.

`root@hnd-mgt-lb-21:/usr/local/zevenet/config# dpkg -l | grep zproxy
ii zproxy 0.1.61-5.11.0 amd64 Zevenet zproxy

root@hnd-mgt-lb-21:/usr/local/zevenet/config# dpkg -l | grep zevenet
ii buster-apt-key 1.0 amd64 Add public key and path of the zevenet-buster repository
ii zevenet 5.10.1 amd64 Zevenet Load Balancer Community Edition
ii zevenet-ce-cluster 1.3 amd64 Zevenet Load Balancer Community Edition Cluster Service
ii zevenet-gui-ce 2.1.2-5.10.0 all Web GUI of Zevenet Community 5.10

[master] root@hnd-mgt-lb-21:/usr/local/zevenet/config# dpkg -l | grep pound
ii pound 2.7-1.3 amd64 reverse proxy, load balancer and HTTPS front-end for Web servers

[master] root@hnd-mgt-lb-21:/usr/local/zevenet/config# ps -ef | grep zproxy
root 1096 1 0 Sep23 ? 00:02:40 /usr/local/zevenet/app/zproxy/bin/zproxy -f /usr/local/zevenet/config/HTTPS_proxy.cfg -p /var/run/HTTPS_proxy.pid
root 1147 1 0 Sep23 ? 00:00:38 /usr/local/zevenet/app/zproxy/bin/zproxy -f /usr/local/zevenet/config/SLOT5020_proxy.cfg -p /var/run/SLOT5020_proxy.pid
root 18733 1 0 20:11 ? 00:00:00 /usr/local/zevenet/app/zproxy/bin/zproxy -f /usr/local/zevenet/config/SLOT5030_proxy.cfg -p /var/run/SLOT5030_proxy.pid
root 19424 18205 0 20:21 pts/0 00:00:00 grep zproxy `

We are running HTTP. This is the relevant header (copied from tcpdump): BL-Session-ID: xyz

Hope you have some ideas!

mschnetz, Sep 25 '20 11:09

Hi!

Could you update to the latest versions and check whether the issue persists? If it does, please send a supportsave file to [email protected] so we can check and test your configuration.

abdessamad-zevenet, Sep 25 '20 13:09

Hi, I updated the load balancer in our lab and could reproduce the issue the same way. I sent a supportsave file to [email protected] with a subject identical to this topic. Hope it helps!

mschnetz, Sep 28 '20 10:09

Hi Markus,

I found the issue and it's already fixed for the next release. Meanwhile, I will email you the new version so you can test whether it works correctly for you.

Thank you for reporting!

abdessamad-zevenet, Oct 05 '20 07:10

Hi Abdessamad, thank you for your support! After a bunch of testing I concluded that the beta update did not help with the forgotten sessions. The behaviour seems identical to before the beta update: after around 75s the session is done, regardless of any timeout configuration. The .deb seems to be correctly installed:

` root@qa-sbd-lb-1:~# dpkg -l | grep zproxy
ii zproxy 0.2.3-beta2 amd64 ZEVENET zproxy

[master] root@qa-sbd-lb-1:~# ps -ef | grep zproxy
root 8774 1 0 12:37 ? 00:00:00 /usr/local/zevenet/app/zproxy/bin/zproxy -f /usr/local/zevenet/config/slot.cfg -p /var/run/slot.pid
root 8994 8922 0 12:40 pts/0 00:00:00 grep zproxy `

Maybe I just didn't understand the timeout configurations from the documentation, so I'll rephrase them in my own words:

Client request timeout [s]: If the client doesn't send any request the LB can forward within X seconds, the session persistence data is deleted.

Backend response timeout [s]: If the backend doesn't respond to a forwarded request within X seconds, an error is sent to the client.

Persistence session time to live [s]: Regardless of whether any requests and responses have been sent, the session persistence data is deleted after X seconds.

As far as I understand, these are the only parameters that should affect the session. Please let me know if that is not the case. If it is the case, however, and I understand these parameters correctly, I am afraid neither "Client request timeout" nor "Persistence session time to live" affects the process at all.

mschnetz, Oct 05 '20 11:10

The timeouts in the configuration work as follows (I'm using the data from your supportsave file):

  • Timeout 45: backend response timeout. If there is no response from the backend within 45 seconds, an error is sent to the client and the connection is closed.

  • ConnTO 20: backend connection timeout. The backend is set down if the connection takes longer than 20 seconds.

  • Alive 10: down backends are checked every 10 seconds. This interval is also used for flushing expired sessions.

  • Client 30: after the client connects to the proxy, the connection is closed if no request is made within 30 seconds.

Session
Type HEADER
TTL 120 (the session is valid for 2 minutes; the counter is reset with every request the client sends)
ID "BL-Session-ID"
End

So, when the client sends the first request, a session is created in the session table of the service. If the client doesn't send another request before the TTL expires, the session is lost. The other timeouts don't affect the session information unless the backend is considered down, in which case all of its sessions are deleted.
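To make these rules concrete, here is a minimal Python sketch of a session table that behaves as described: an entry is created on the first request, every request resets its TTL counter, expired entries are removed only when the periodic flush runs, and a backend going down drops its sessions. The `SessionTable` class and the backend labels are hypothetical illustrations, not zproxy code, and round-robin selection for new sessions is assumed from the original report.

```python
class SessionTable:
    """Toy model of a per-service persistence table (NOT zproxy's implementation)."""

    def __init__(self, ttl, backends):
        self.ttl = ttl                  # Session TTL in seconds
        self.backends = list(backends)  # hypothetical backend labels
        self.sessions = {}              # session id -> (backend, time of last request)
        self.cursor = 0                 # round-robin position (assumed policy)

    def route(self, session_id, now):
        """Return the backend for a request carrying this session id."""
        entry = self.sessions.get(session_id)
        if entry is None:
            # Unknown session: pick the next backend in round-robin order.
            backend = self.backends[self.cursor % len(self.backends)]
            self.cursor += 1
        else:
            backend = entry[0]
        # Every request resets the TTL counter of the session.
        self.sessions[session_id] = (backend, now)
        return backend

    def flush(self, now):
        """Periodic cleanup, run once per Alive interval: drop expired sessions."""
        expired = [sid for sid, (_, seen) in self.sessions.items()
                   if now - seen >= self.ttl]
        for sid in expired:
            del self.sessions[sid]

    def backend_down(self, backend):
        """A backend considered down loses all of its sessions."""
        gone = [sid for sid, (b, _) in self.sessions.items() if b == backend]
        for sid in gone:
            del self.sessions[sid]
```

In this model, with TTL 120 and a flush every 10 seconds, a session last used at t=78 survives the flush at t=190 and is removed by the flush at t=200, that is, somewhere between TTL and TTL + Alive after the last request. A request arriving 75 seconds after the previous one is still routed to the assigned backend.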

To test whether this is working as expected, open two SSH connections to the load balancer:

  • In the first one, monitor the session table using the zproxy API. For your configuration: watch -n 0.1 ' curl -s --unix-socket /tmp/HND-NH-App-Server_proxy.socket http://localhost/listener/0/service/0/sessions | python -m json.tool'

  • In the other terminal, make a curl request to the service. You should then see the new entry. Wait and count until that entry disappears; it should take at least the TTL and at most the TTL plus the Alive time.

Do the test, and make sure you have stopped and started the farm after installing the package provided.

If the issue persists, please enable the debug level in the farm configuration file (LogLevel 8) and redo the test above, then create and send me a supportsave file.
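If counting by hand is too imprecise, the same check can be scripted. Below is a rough Python sketch that polls the sessions endpoint over the control socket and reports the elapsed time whenever the response changes. Only the endpoint path and the socket path come from this thread; the helper names (`UnixHTTPConnection`, `fetch_sessions`, `watch_changes`) are made up for illustration, and nothing is assumed about the JSON layout beyond it being valid JSON.

```python
import http.client
import json
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection variant that connects to a Unix domain socket."""

    def __init__(self, socket_path, timeout=2.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def fetch_sessions(socket_path, path="/listener/0/service/0/sessions"):
    """GET the sessions endpoint and return the decoded JSON payload."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


def watch_changes(fetch, interval=0.1, clock=time.monotonic,
                  sleep=time.sleep, max_polls=None):
    """Yield (elapsed_seconds, payload) each time fetch() returns a new value."""
    start = clock()
    previous = object()  # sentinel that never equals a real payload
    polls = 0
    while max_polls is None or polls < max_polls:
        current = fetch()
        if current != previous:
            yield clock() - start, current
            previous = current
        polls += 1
        sleep(interval)


# Example use on the load balancer (socket path taken from the command above):
# for elapsed, payload in watch_changes(
#         lambda: fetch_sessions("/tmp/HND-NH-App-Server_proxy.socket")):
#     print(f"{elapsed:8.1f}s {json.dumps(payload)}")
```

If the payload contains fields that change on every poll (for example a last-seen timestamp), every poll will be reported as a change, so the comparison may need to be narrowed to the session ids once the actual layout is known.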

abdessamad-zevenet, Oct 05 '20 12:10

OK, wow, that socket watching is insanely helpful, thank you! I tested and the TTL works as intended: the session information is deleted after the configured number of seconds. So the session is indeed not forgotten, but after 75s the request is still forwarded to the backend not assigned to the session. I will now enable LogLevel 8 and send another supportsave. Thank you!

mschnetz, Oct 05 '20 12:10

You can learn more about zproxy in its own repository https://github.com/zevenet/zproxy, also check its man page.

If the session is present during the TTL time, then the request should be routed to the assigned backend.

After enabling the debug log, check what happens at exactly 75 seconds: look for backend disconnects and session error messages.
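One way to narrow the debug log down to that moment is to keep only the lines stamped around 75 seconds after the last request. The Python sketch below assumes the log lines start with a traditional syslog timestamp such as `Oct  5 12:40:01` (this depends on the syslog configuration and is not confirmed in this thread) and that month names are parsed in an English locale; `parse_stamp` and `lines_near` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' syslog timestamp; None if absent."""
    try:
        # The first 15 characters hold the traditional syslog timestamp.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, last_request, offset=75.0, window=5.0):
    """Yield log lines stamped within +/- window seconds of last_request + offset."""
    target = last_request + timedelta(seconds=offset)
    low = target - timedelta(seconds=window)
    high = target + timedelta(seconds=window)
    for line in lines:
        stamp = parse_stamp(line, last_request.year)
        if stamp is not None and low <= stamp <= high:
            yield line


# Example use (log path is an assumption; adjust to where the farm logs end up):
# with open("/var/log/syslog", errors="replace") as log:
#     for line in lines_near(log, datetime(2020, 10, 5, 12, 40, 0)):
#         print(line, end="")
```

The `last_request` argument is the wall-clock time of the last request sent before going idle, so the default window covers roughly 70 to 80 seconds after it.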

abdessamad-zevenet, Oct 05 '20 12:10