awesome-yara

Is there a Yara daemon -- and if not, should there be?

Open GWHAYWOOD opened this issue 2 years ago • 1 comments

OPEN QUESTION:

Off and on I've been looking for a Yara daemon.

I'm processing mail, looking for spam and malware. I write milters to do the processing. The Mail Transfer Agent (MTA) -- Sendmail in my case -- hands the mail (as it arrives, and before delivery to any mailbox) to my milters for on-the-fly processing as appropriate. The milters tell Sendmail if a message is to be accepted, rejected, or whatever. Don't worry about that last one for now...

The milters use Yara to scan the mail for assorted indicators of badness. Like the Sendmail processes which use them, the milters are themselves daemons. Sendmail handles many mail messages simultaneously by forking a copy of itself for each concurrently processed mail message; likewise many milter daemons are forked so that there is one available for each concurrent mail message.

If a milter process decides that it needs to scan a mail message using Yara, then it

(1) writes a file to /tmp/ which contains the message to be scanned, then
(2) forks a new process from /usr/bin/yara to scan this file, then
(3) waits until the new process terminates, so that it can
(4) collect the output of the yara process back into the milter for decision-making etc., and finally
(5) deletes the file from /tmp/.
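For readers who want to see the five steps concretely, here is a minimal Python sketch of that workflow. It is not the milter code from this thread: the function name, the `milter-` file prefix, and the parameters are hypothetical, and it assumes the usual `yara RULES_FILE TARGET` invocation, which prints one `rule_name target` line per match.

```python
import os
import subprocess
import tempfile
from typing import List, Sequence


def scan_with_yara_cli(message: bytes, rules_path: str,
                       yara_cmd: Sequence[str] = ("/usr/bin/yara",),
                       tmp_dir: str = "/tmp") -> List[str]:
    """Scan one message by running the yara command-line tool on a temp file."""
    # (1) Write the message to a temporary file.
    fd, path = tempfile.mkstemp(prefix="milter-", dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(message)
        # (2) Start a yara process and (3) wait until it terminates.
        result = subprocess.run([*yara_cmd, rules_path, path],
                                capture_output=True, text=True, check=False)
        # (4) Collect the output; the first field of each line is the rule name.
        return [line.split()[0]
                for line in result.stdout.splitlines() if line.strip()]
    finally:
        # (5) Delete the temporary file, even if the scan failed.
        os.unlink(path)
```

Every call pays for one file creation, one process start, and one file deletion, which is the per-message cost the question is about.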

This all works fine but can be a little expensive.

I would like to be able to run a Yara daemon which could serve in place of the tool at /usr/bin/yara and which my milters could use instead. That way, they wouldn't have to write (nor delete) any files nor create a new process for each scan. I suggest /usr/sbin/yarad might be a good name for the tool.

I envisage connecting to the daemon via a socket. As I'm processing mail, of course the daemon would need to be multi-threaded, or at least to be able to process multiple connections simultaneously and independently. Ideally, for each scan, the names of the files which contain the rules to be used, any external Yara variable definitions, and the data to be scanned would all be delivered to the scanner via the socket connection. If it's difficult or expensive to do all that (and I can imagine it might well be an issue) then the daemon could be told when it starts up which rules files to use and perhaps even the values of any external variables. One could run several such daemons, each with a different ruleset and a different set of external variables, which would be less convenient but probably manageable.
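A rough Python sketch of the simpler variant (rules fixed at startup, only the data sent per scan) might look like the following. Everything here is an assumption made for illustration: the wire format (a 4-byte big-endian length followed by the message, with one rule name per line in the reply), the function names, and the use of a loopback TCP socket rather than a Unix domain socket. The scanner is passed in as a plain callable so the sketch does not depend on any particular Yara binding.

```python
import socket
import socketserver
import struct
from typing import Callable, List


def recv_exact(sock, count: int) -> bytes:
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def make_scan_server(address, scan: Callable[[bytes], List[str]]):
    """Build a server that handles each connection in its own thread."""

    class Handler(socketserver.BaseRequestHandler):
        def handle(self):
            # Hypothetical protocol: length prefix, then the data to scan.
            (length,) = struct.unpack("!I", recv_exact(self.request, 4))
            data = recv_exact(self.request, length)
            names = scan(data)
            # Reply with one rule name per line; the connection then closes.
            self.request.sendall("".join(n + "\n" for n in names).encode())

    server = socketserver.ThreadingTCPServer(address, Handler)
    server.daemon_threads = True
    return server


def request_scan(address, data: bytes) -> List[str]:
    """Client side: send one message and return the matching rule names."""
    with socket.create_connection(address) as sock:
        sock.sendall(struct.pack("!I", len(data)) + data)
        reply = b""
        while chunk := sock.recv(4096):
            reply += chunk
    return reply.decode().splitlines()
```

With the yara-python binding installed, the rules could be compiled once at startup with `yara.compile(filepath=..., externals=...)` and the callable could be `lambda data: [m.rule for m in rules.match(data=data)]`. Whether one compiled ruleset can safely be shared across threads this way should be checked against the binding's documentation. Sending rule file names and external variables per scan, as in the ideal case described above, would need a richer protocol than this one.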

Do you know if such a thing exists? After several search attempts I've found nothing which fits the bill.

If not, do you think it would make sense to implement something along these lines?

GWHAYWOOD, Aug 06 '22 12:08

Hello. I haven't heard of such a daemon, but I ran into similar problems at my old company and made a few tools based on Yara: a mail transport agent for Exchange, and a Yara library that can process files and archives, which is used by a Windows service (not open-sourced) for scanning files and archives via Yara. You can see both of them in this org, and if you wish you can use something from these projects for your daemon. If you make such a daemon, I think it would be very useful.

admiralbenbou, Aug 06 '22 18:08

Hi, I'm looking for a Yara daemon exactly like the one you describe, @GWHAYWOOD. Did you find anything?

Fneufneu, May 04 '23 14:05

Hi Fneufneu,

Sorry no, I never found anything. I only spent a few hours looking so I could have missed something.

It won't be difficult to do but unfortunately it's something like seventh or eighth on my TODO list.

Right now, instead of using a daemon, I write a file out to a RAM filesystem and scan that. It isn't hideously slow for scanning mail: on a slow machine it typically takes around 60ms to write a mail file of a few tens of kbytes.
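Anyone wanting to check the write cost on their own machine could time it with something like the Python sketch below. The helper names are hypothetical, and it assumes `/dev/shm` (or `/run/shm`) is a RAM-backed tmpfs, as it is on most Linux systems, falling back to the ordinary temp directory otherwise.

```python
import os
import tempfile
import time


def pick_ram_dir(candidates=("/dev/shm", "/run/shm")) -> str:
    """Return the first writable candidate directory, else the default temp dir."""
    for candidate in candidates:
        if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            return candidate
    return tempfile.gettempdir()


def timed_write(message: bytes, directory: str) -> float:
    """Write `message` to a temp file in `directory`; return seconds taken."""
    start = time.perf_counter()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(message)
        # Stop the clock after the file is closed, before cleanup.
        elapsed = time.perf_counter() - start
    finally:
        os.unlink(path)
    return elapsed
```

For example, `timed_write(b"x" * 30_000, pick_ram_dir())` times one write of roughly the message size mentioned above. The figure will vary with the machine and does not include the cost of starting the scanner process.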

I wouldn't want to do it this way for scanning whole filesystems, especially if they contain large files, but I very rarely scan a filesystem.

HTH

Ged.

GWHAYWOOD, May 04 '23 15:05

@GWHAYWOOD @Fneufneu https://docs.clamav.net/manual/Signatures/YaraRules.html

dkorzhevin, Jun 15 '23 02:06

@dkorzhevin

Thanks for the link. I've used ClamAV for many years.

Unfortunately the Yara implementation in ClamAV is ten years out of date (roughly Yara version 2) and absolutely riddled with faults, some of which crash the ClamAV daemon. A couple of years ago I pointed this out to the ClamAV development people but they didn't seem interested.

As far as I'm aware there's been no change in ClamAV's Yara implementation since that time.

GWHAYWOOD, Jun 15 '23 05:06

Perhaps an issue for https://github.com/VirusTotal/yara

JosiahRaySmith, Jun 23 '23 15:06

@GWHAYWOOD Could you please reach out to me regarding the ClamAV faults and crashes, so I can try to reproduce with latest master?

dkorzhevin AT gmail DOT com

dkorzhevin, Jun 27 '23 20:06