crowdsecurity/ssh-bf creates a ban even though failed logins are much further apart than the leakspeed
Describe the bug
crowdsecurity/ssh-bf created a ban for a user whose failed login attempts were several minutes or even hours apart. Moreover, the ban was added to crowdsec almost 2 days after the actual log entries/failed sign-in attempts happened. See the attached crowdsec alerts inspect output.
To Reproduce
Not sure how to reproduce this; it could be a glitch. The ban happened on
- Date : 2025-06-01T07:01:25Z (as per the cscli alerts inspect output)
yet the alert-causing log entries happened on 05/29 and 05/30! It would appear that some event at that time caused crowdsec to re-parse the logs, and that the ssh-bf component failed to actually verify the timestamps of the log entries?
Expected behavior
According to ssh-bf, the leakspeed is 10s and the bucket size is 5, so bans should only happen if someone tries to log in more than 5 times within 10 seconds? ssh-bf should correctly verify the log timestamps and only create a ban if the capacity of > 5 is exceeded WITHIN the leakspeed of 10s.
Note the deltas between the event timestamps in the "alerts inspect output" below:
- event 1 to event 2: ΔT ~ 14 min
- event 2 to event 3: ΔT ~ 16 min
- event 3 to event 4: ΔT ~ 56 s
- event 4 to event 5: ΔT ~ 3.5 min
- event 5 to event 6: ΔT ~ 13 h (!)
Crowdsec alerts inspect output
$ sudo docker exec -it crowdsec cscli alerts inspect -d 1332
- ID : 1332
- Date : 2025-06-01T07:01:25Z
- Machine : xxxxxx
- Simulation : false
- Remediation : true
- Reason : crowdsecurity/ssh-bf
- Events Count : 6
- Scope:Value : Ip:47.xx.xx.xx
- Country : US
- AS : CHARTER-20115
- Begin : 2025-06-01 07:00:18.274679609 +0000 UTC
- End : 2025-06-01 07:00:20.975904131 +0000 UTC
- UUID : 6a4c2296-d5ce-4c6e-87a7-6ba6564a3485
- Context :
╭──────────────────────╮
│ Key         │ Value  │
├─────────────┼────────┤
│ target_user │ xxxxxx │
╰──────────────────────╯
- Events :
- Date: 2025-05-29 20:15:27.186165 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-29T20:15:27.186165-07:00 │
╰────────────────────────────────────────────────────╯
- Date: 2025-05-29 20:29:17.53547 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-29T20:29:17.53547-07:00  │
╰────────────────────────────────────────────────────╯
- Date: 2025-05-29 20:45:43.917619 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-29T20:45:43.917619-07:00 │
╰────────────────────────────────────────────────────╯
- Date: 2025-05-29 20:46:39.960272 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-29T20:46:39.960272-07:00 │
╰────────────────────────────────────────────────────╯
- Date: 2025-05-29 20:50:12.634747 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-29T20:50:12.634747-07:00 │
╰────────────────────────────────────────────────────╯
- Date: 2025-05-30 09:50:45.130979 -0700 -0700
╭────────────────────────────────────────────────────╮
│ Key             │ Value                            │
├─────────────────┼──────────────────────────────────┤
│ ASNNumber       │ 20115                            │
│ ASNOrg          │ CHARTER-20115                    │
│ IsInEU          │ false                            │
│ IsoCode         │ US                               │
│ SourceRange     │ 47.xx.xx.0/19                    │
│ datasource_path │ /var/log/auth.log                │
│ datasource_type │ file                             │
│ log_type        │ ssh_failed-auth                  │
│ machine         │ xxxx                             │
│ service         │ ssh                              │
│ source_ip       │ 47.xx.xx.xx                      │
│ target_user     │ xxxxxx                           │
│ timestamp       │ 2025-05-30T09:50:45.130979-07:00 │
╰────────────────────────────────────────────────────╯
Additional context
Log entries in /var/log/auth.log are flagged with [preauth], meaning the connection was closed after starting the login process, most likely because the wrong passphrase was entered by the user for the SSH key (we use key authentication).
$ sudo grep preauth /var/log/auth.log.1 |grep 47.xx.xx.xx
2025-05-29T20:15:27.186165-07:00 xx sshd[89933]: Connection closed by authenticating user xx 47.xx.xx.xx port 49870 [preauth]
2025-05-29T20:29:17.535470-07:00 xx sshd[93956]: Connection closed by authenticating user xx 47.xx.xx.xx port 34956 [preauth]
2025-05-29T20:45:43.917619-07:00 xx sshd[98666]: Connection closed by authenticating user xx 47.xx.xx.xx port 43098 [preauth]
2025-05-29T20:46:39.960272-07:00 xx sshd[98910]: Connection closed by authenticating user xx 47.xx.xx.xx port 54628 [preauth]
2025-05-29T20:50:12.634747-07:00 xx sshd[99977]: Connection closed by authenticating user xx 47.xx.xx.xx port 40266 [preauth]
2025-05-30T09:50:45.130979-07:00 xx sshd[322909]: Connection closed by authenticating user xx 47.xx.xx.xx port 58162 [preauth]
And here is the crowdsec log entry for that specific event:
time="2025-06-01T07:01:23Z" level=info msg="Ip 47.xx.xxxx performed 'crowdsecurity/ssh-bf' (6 events over 2.701224522s) at 2025-06-01 07:00:20.97590354 +0000 UTC"
Clearly there is a disconnect between what crowdsec/ssh-bf thinks the event timeline is and when those events actually happened.
> According to ssh-bf, the leakspeed is 10s and the bucket size is 5, so bans should only happen if someone tries to log in more than 5 times within 10 seconds?
No, it's a leaky bucket technique; here is a diagram if my words don't do it justice, but basically: whenever an event is poured into the bucket the counter goes up, and every 10 seconds the counter decreases by 1, until the bucket either overflows or underflows (less than 1 event in the bucket).
It is mostly meant to detect bots, as they will log in really fast compared to a typical user. A minimal sketch of the mechanics follows below.
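To make that concrete, here is a minimal sketch of the logic in Go (a simplified illustration, not CrowdSec's actual implementation; capacity 5 and a 10s leakspeed mirror ssh-bf's defaults, and the arrival times are hypothetical):

package main

import (
    "fmt"
    "time"
)

// Simplified leaky bucket: capacity 5, leaking one event every 10s.
// Six events poured in within ~2.5s (e.g. old log lines re-read in a
// burst after a rotation) overflow the bucket on the sixth event.
func main() {
    const capacity = 5.0
    const leakSpeed = 10 * time.Second

    // Hypothetical arrival times of the events, relative to the first.
    arrivals := []time.Duration{
        0, 500 * time.Millisecond, time.Second,
        1500 * time.Millisecond, 2 * time.Second, 2500 * time.Millisecond,
    }

    level, last := 0.0, time.Duration(0)
    for i, t := range arrivals {
        level -= float64(t-last) / float64(leakSpeed) // leak since the last event
        if level < 0 {
            level = 0 // bucket underflowed (emptied)
        }
        level++ // pour the new event in
        last = t
        if level > capacity {
            fmt.Printf("event %d: overflow -> alert\n", i+1)
            return
        }
        fmt.Printf("event %d: level %.2f\n", i+1, level)
    }
}

Run against these arrival times, the level climbs roughly one per event (barely leaking), so the sixth event pushes it past the capacity of 5, matching the "6 events over 2.701224522s" in the alert above.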
I can see from your grep that you are looking for preauth; any chance you could just look for the IP itself, so we can get a full picture in case a log line got counted twice, for example once as a preauth and also as a failed login?
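For example, something like this, adapting the earlier command (with the masked IP as a placeholder):

sudo grep -F '47.xx.xx.xx' /var/log/auth.log.1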
Hello,
Was there a log rotation or something else around the time the decision was taken?
The only thing I can think of right now is an issue during the log rotation:
- Crowdsec detected the rotation and started to read the log from the start (when crowdsec starts, the file is read from the end, but when a rotation is detected, we read from the start to make sure we do not miss anything)
- Somehow, the rotation just wrote the same content to the file, leading to crowdsec thinking all the events happened at once.
When processing logs in real time, crowdsec will ignore the timestamp by default because, unfortunately, not all logs have timestamps, and when reading from a file you can fairly safely assume the time at which the log was written is the time at which it was processed (you can override this behavior with the use_time_machine option, but it should never be needed in your case). This can lead to some weird behaviors if you write a bunch of logs at the same time that were originally from a longer period.
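For reference, a minimal acquis.yaml sketch with that override in place (based on the file acquisition config quoted further down; use_time_machine tells crowdsec to trust the timestamps parsed from the log lines instead of the ingestion time):

filenames:
  - /var/log/auth.log
labels:
  type: syslog
use_time_machine: true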
> No, it's a leaky bucket technique; here is a diagram if my words don't do it justice, but basically: whenever an event is poured into the bucket the counter goes up, and every 10 seconds the counter decreases by 1, until the bucket either overflows or underflows (less than 1 event in the bucket).
Thank you very much for the prompt response and for the explanation; I appreciate it. I learned something new - it makes perfect sense.
> I can see from your grep that you are looking for preauth; any chance you could just look for the IP itself, so we can get a full picture in case a log line got counted twice, for example once as a preauth and also as a failed login?
As this is our own IP, there are many, many successful logins in the log, and I did not want to flood this thread with lots of irrelevant log entries. However, I went through them all, and the ones I have listed are the ones that caused the bucket to overflow, which can be seen by comparing their timestamps with the timestamps listed in the cscli alerts inspect output.
I think Blotus hit the nail on the head: the issue was caused by a logrotate run that happened at this time; see my response to his post.
Hello,
> Was there a log rotation or something else around the time the decision was taken?
First, thank you for your response. Yes, this is exactly what happened. However, see further below...
> * Somehow, the rotation just wrote the same content to the file, leading to crowdsec thinking all the events happened at once.
No, it did not: logrotate copied the old auth.log to auth.log.1 and then created a new, empty auth.log (as it is configured to do, see the sketch below). As you can see from my grep output above, the offending events that caused the ban were in auth.log.1 (I also searched the new auth.log, and it had no entries for our IP).
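For context, a logrotate stanza of that shape looks roughly like this (a sketch only; the options are assumptions based on the description above, not the poster's actual config):

/var/log/auth.log {
    weekly
    rotate 4
    missingok
    create 0640 syslog adm   # old log becomes auth.log.1; a fresh, empty auth.log is created
}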
What does not make sense in that scenario is that our acquis.yaml file for those logs states:
filenames:
  - /var/log/auth.log
  - /var/log/syslog
labels:
  type: syslog
So crowdsec should not have re-parsed the auth.log.1 file? I think we still have a potential issue here.
> When processing logs in real time, crowdsec will ignore the timestamp by default because, unfortunately, not all logs have timestamps, and when reading from a file you can fairly safely assume the time at which the log was written is the time at which it was processed (you can override this behavior with the use_time_machine option, but it should never be needed in your case). This can lead to some weird behaviors if you write a bunch of logs at the same time that were originally from a longer period.
OK, thanks, this makes perfect sense; I was not aware, however, that this is the default behavior. Thanks for the pointer on how to override it. As a suggestion: this is one of those gotchas where a more prominent mention of what the default behavior is and how to override it could prevent a headache or two (and lots of wasted time on the user's end).
Again thank you all for your replies and help.