Ejabberd is not able to serve the appropriate cert

Open lokesh411 opened this issue 2 years ago • 13 comments

Environment

  • ejabberd version: 22.05
  • Erlang version: 24
  • OS: Linux (Debian)
  • Installed from: official deb/rpm

Configuration:

certfiles:
 - abc.com # given by godaddy
 - xyz.com # given by lets encrypt for *.xyz.com

Bug description

ejabberd does not serve the appropriate certificate when multiple certificates are configured. If I make a request to ejabberd with r0.xyz.com as the SNI, it serves the certificate issued for abc.com.

lokesh411 avatar Oct 17 '23 18:10 lokesh411

Does ejabberd.log have any info?

Have you made sure ejabberd can access the cert files?

licaon-kter avatar Oct 17 '23 19:10 licaon-kter

Yeah, ejabberd is the owner of the certificates. Also, I don't find anything weird in the logs.

lokesh411 avatar Oct 17 '23 19:10 lokesh411

This may be related to this issue, since I have the same problem. Setup: Debian (10-12); ejabberd (a few versions from the Debian repo, the latest being 24.02 from the official deb, so at least all 2* versions are affected); 3 domains (JW is the first domain in hosts). There is nothing in the logs. It usually happens after a certificate update (permissions are ok): right after ejabberdctl reload_config, ejabberd SOMETIMES sends the wrong certificate for SOME domain on SOME port. Usually I can fix it with one more reload_config. As an example (same certs, nothing changed in the config file, I just ran ejabberdctl reload_config ~10 times):

C2S TLS OK:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name

S2S TLS WRONG:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info
  • First reload_config - same result:
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info
  • Second - now all ok:
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
  • 3-10 reload_config - all ok.
  • Ok, let's change something. Added a space to /etc/ejabberd/certs/jabber.name.fullchain.pem in between lines:
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----

Permissions are ok:

ls -l /etc/ejabberd/certs/jabber.name.fullchain.pem 
-rw-r----- 1 root ejabberd 5852 Mar 24 20:16 /etc/ejabberd/certs/jabber.name.fullchain.pem

Tried reload_config 7 times - all ok.

  • Ok, let's change something else. Added a "#" to the config file (to an already commented line at the beginning of the file, so now it starts with '##'). 1st reload_config - all ok (!). 2nd - got the same problem again:
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info

3..7 - all ok.

  • Removed that comment. 1st reload_config - got a problem. 2nd - ok 3rd - ok ...
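The manual check used throughout the steps above can be scripted. The following is only a minimal sketch: the function names are made up for illustration, `-servername` is passed to set the SNI explicitly, and the comparison is an exact match on the subject CN, so it would not handle wildcard or SAN-only certificates.

```shell
#!/bin/sh
# Sketch: compare the CN served for each SNI name with the name requested.
# All function names here are hypothetical helpers, not part of ejabberd.

# Extract the CN value from an "openssl s_client" subject line,
# e.g. "subject=CN = linuxoid.in" -> "linuxoid.in".
extract_cn() {
    printf '%s\n' "$1" | sed -n 's/^subject=.*CN *= *\([^,/]*\).*$/\1/p'
}

# Print the subject line of the certificate served for host $1 on port $2.
served_subject() {
    openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null \
        | grep '^subject='
}

# Report OK/WRONG for one host and port; returns non-zero on mismatch.
check_host() {
    cn=$(extract_cn "$(served_subject "$1" "$2")")
    if [ "$cn" = "$1" ]; then
        echo "$1:$2 OK"
    else
        echo "$1:$2 WRONG (got: ${cn:-none})"
        return 1
    fi
}

# When run as "script PORT HOST...", check every host on that port.
if [ "$#" -ge 2 ]; then
    port=$1; shift
    status=0
    for host in "$@"; do
        check_host "$host" "$port" || status=1
    done
    exit "$status"
fi
```

Saved under any file name, it could be run with a port followed by the domains (for example 5270 jabberworld.info linuxoid.in jabber.name) after each reload_config; a non-zero exit status means at least one domain got the wrong certificate.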

EOUpSL93Y avatar Mar 24 '24 20:03 EOUpSL93Y

@EOUpSL93Y are you the admin of linuxoid.in?

licaon-kter avatar Mar 24 '24 21:03 licaon-kter

Yes

EOUpSL93Y avatar Mar 24 '24 21:03 EOUpSL93Y

Next time you run into the issue, could you check whether running the following call in an ejabberdctl debug shell fixes the issue:

fast_tls:clear_cache().

(Including the trailing "." character. Press Ctrl+g and then q to exit the shell.)

As an alternative, you could use a script such as the following (if escript is in the path, and assuming the node name is ejabberd@localhost):

#!/usr/bin/env escript
%%! -sname fix-cert@localhost

-define(NODE, 'ejabberd@localhost').

-spec main([string()]) -> any().
main(_Args) ->
  try ok = erpc:call(?NODE, fast_tls, clear_cache, [])
  catch error:{erpc, Reason} ->
      io:fwrite(standard_error, "Cannot query ejabberd: ~p~n", [Reason]),
      halt(1)
  end.

weiss avatar Aug 02 '24 12:08 weiss

Maybe having a clear cache ejabberdctl command would be handy? It could be a hook that modules could register to. It could take a specific parameter to only clear the cache of a single module.

@weiss Is this a good idea?

mremond avatar Aug 05 '24 08:08 mremond

We already do that in reload_config (we call that clear cache function from reload_config), so there is probably no need to add a separate command just for that. Now the question is why there are sometimes stale results...

prefiks avatar Aug 05 '24 08:08 prefiks

The problem being related to fast_tls's caching was just a blind guess of mine. I suggested that test to make sure nothing else is reloaded, just to track things down.

weiss avatar Aug 06 '24 07:08 weiss

Got this problem again just after upgrading to 24.07, then tried

fast_tls:clear_cache().

and it fixed the problem!
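Until the root cause is found, the workaround could be automated. This is a rough sketch only: both command strings are placeholders (assumptions). The check is any command that exits non-zero when a wrong certificate is served, and the fix is any command that ends up calling fast_tls:clear_cache(), for instance the escript posted earlier in this thread saved under a name of your choosing.

```shell
#!/bin/sh
# Sketch: run a check; if it fails, apply the cache-clearing fix once and re-check.
# The two command strings are placeholders supplied by the caller; nothing
# here is an ejabberd feature.

fix_if_broken() {
    check_cmd=$1
    fix_cmd=$2
    if sh -c "$check_cmd"; then
        echo "certificates OK, nothing to do"
        return 0
    fi
    echo "mismatch detected, running: $fix_cmd"
    # A failing fix command is reported with status 2.
    sh -c "$fix_cmd" || return 2
    if sh -c "$check_cmd"; then
        echo "fixed after clearing cache"
        return 0
    fi
    echo "still wrong after clearing cache"
    return 1
}

# Run only when both commands are given on the command line.
if [ "$#" -eq 2 ]; then
    fix_if_broken "$1" "$2"
fi
```

Running it right after reload_config (for example from the same hook that renews the certificates) would clear the cache only when a mismatch is actually seen, which also leaves a trace of how often the problem occurs.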

EOUpSL93Y avatar Aug 11 '24 22:08 EOUpSL93Y