
CE 5.12 clients are not being forwarded after all nodes got offline without manually restarting L4XNAT farm

Open jinverso opened this issue 2 years ago • 3 comments

Hi,

I have two backends. Cutting connections on one backend correctly moves users to the other backend (this did not always work on 5.11). After taking the second node down, all clients get an error, which is expected since both nodes are down. However, after bringing either backend back online, clients are not moved to the running node and still get an error. The only workaround is manually restarting the L4XNAT farm.

I've attached support save file in case you need it.

Thank you,

Jorge

supportsave_zevenet50_20220510_1312.tar.gz https://github.com/zevenet/zlb/files/8663445/supportsave_zevenet50_20220510_1312.tar.gz

jinverso avatar May 10 '22 16:05 jinverso

Hi Jorge, I don't see any farmguardian configured in the EdgeLB farm. For L4xnat farms a farmguardian is mandatory if you want to detect backend status. Please configure a new one copied from check_tcp. If you are working with two ports on the backend, I would recommend something like:

check_tcp -p 80 -H HOST

You can also develop your own check_tcp_multiport that runs two check_tcp commands: first against port 80 and, if that succeeds, the same command against port 8080.
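As an illustration of the multiport idea above, here is a minimal Python sketch. It does not call check_tcp itself; it opens plain TCP connections with the standard library instead. The function name, the default ports 80 and 8080, and the Nagios-style exit codes (0 for OK, 2 for CRITICAL) are assumptions made for the example, so verify them against what your farmguardian setup expects.

```python
import socket
import sys

# Assumed Nagios-style plugin exit codes.
OK, CRITICAL = 0, 2


def check_tcp_multiport(host, ports, timeout=3.0):
    """Return OK only if every port accepts a TCP connection.

    Ports are tried in order and the check stops at the first failure,
    mirroring "first port 80 and, if it works, then port 8080".
    """
    for port in ports:
        try:
            # Open and immediately close a TCP connection to host:port.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError as exc:
            print(f"CRITICAL: {host}:{port} unreachable ({exc})")
            return CRITICAL
    print(f"OK: {host} accepts connections on ports {list(ports)}")
    return OK


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: check_tcp_multiport.py HOST [PORT ...]
    target = sys.argv[1]
    port_list = [int(p) for p in sys.argv[2:]] or [80, 8080]
    sys.exit(check_tcp_multiport(target, port_list))
```

With a check like this, a backend is reported as down as soon as any of its ports stops answering, rather than only when the first port fails.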

Confirm that cut connections is enabled in your farmguardian health check, so that any established connections and sessions to a backend are deleted when it fails.

I can confirm that it works properly with your current config: if port 8080 fails on backend1, the farmguardian stops the backend correctly. Later I stopped backend2 and farmguardian detected that backend as down too. Once both were enabled again, the system resumed forwarding traffic to the backends without problems.

I can also confirm that the persistence session table works and is managed properly:

map persist-EdgeLB {
    type ipv4_addr : mark
    size 65535
    timeout 1h
    elements = { 192.168.10.177 expires 59m51s524ms : 0x80000209 }
}
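To make the persistence entry above easier to read, here is a small Python sketch that pulls the client address, remaining timeout, and mark out of map text in the format shown. The function name and the regular expression are illustrative assumptions; it only handles entries shaped like the one quoted here and says nothing about which backend a given mark corresponds to.

```python
import re

# Matches entries such as: 192.168.10.177 expires 59m51s524ms : 0x80000209
ELEMENT_RE = re.compile(
    r"(\d{1,3}(?:\.\d{1,3}){3}) expires (\S+) : (0x[0-9a-fA-F]+)"
)


def parse_persistence_elements(map_text):
    """Return one dict per persistence entry found in the map text."""
    return [
        {"client": ip, "expires": expires, "mark": int(mark, 16)}
        for ip, expires, mark in ELEMENT_RE.findall(map_text)
    ]


sample = (
    "map persist-EdgeLB { type ipv4_addr : mark size 65535 timeout 1h "
    "elements = { 192.168.10.177 expires 59m51s524ms : 0x80000209 } }"
)

for entry in parse_persistence_elements(sample):
    print(entry["client"], entry["expires"], hex(entry["mark"]))
```

Comparing the entries before and after a backend comes back online is one way to see whether a client is still pinned to a stale mark.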


emiliocampos-zevenet avatar May 11 '22 08:05 emiliocampos-zevenet

Hi Emilio,

I've set up a farmguardian that checks only port 8080 as a proof of concept, but I'm still getting the same result: after both backend servers are cut for maintenance, once either of them is put back online users still get an error page.

This seems to be exactly the same issue reported in another email thread I've just read, named "Re: [zevenet-ce-users] Re: l4xnat persistence issue, connection refused even after backends are back up again", which you are discussing with Stefan U, aren't you?

I've attached a new supportsave file so you can check that I've already set up farmguardian ;-)

Thank you!


jinverso avatar Oct 11 '22 09:10 jinverso

Hi Jorge, I don't see any supportsave in your latest comment. Could you please share the supportsave and let us know the farm name so we can review it?

Additionally, can you please confirm that you are running ZEVENET 5.12.2 (the latest hotfix)? It seems the issue you mention from the other thread with Stefan U was fixed, and he confirmed the fix himself.

Thanks!

emiliocampos-zevenet avatar Oct 11 '22 11:10 emiliocampos-zevenet