clamav
PCRE subsignature does not respect escaped /
Describe the bug
According to the documentation:
PCRE must be delimited by ’/’ and usage of ’/’ within the expression need to be escaped.
However, an escaped '/' still terminates the regex and results in a broken subsignature.
How to reproduce the problem
test.ldb: TestSig.1;Engine:81-255,Target:0;1>1;3c3f706870::i;0/var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B/si
The signature is designed to detect a PHP open tag and two or more variables defined as quoted strings of base64 text, each terminated by a semicolon.
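To make the layout of the signature line easier to follow, here is a minimal Python sketch that splits it into its semicolon-separated fields and decodes the hex subsignature. The helper name `split_ldb` and the dictionary keys are hypothetical, chosen for illustration only; this is not how ClamAV or sigtool parses signatures.

```python
# The signature line from test.ldb, kept verbatim in a raw string.
LINE = r"""TestSig.1;Engine:81-255,Target:0;1>1;3c3f706870::i;0/var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B/si"""


def split_ldb(line):
    """Split a logical signature into name, TDB, logical expression, subsigs.

    Hypothetical helper: a plain split on ';' is enough here because the
    regex writes the semicolon as \\x3B rather than a literal ';'.
    """
    name, tdb, logic, *subsigs = line.split(";")
    return {"name": name, "tdb": tdb, "logic": logic, "subsigs": subsigs}


parsed = split_ldb(LINE)

# Subsig 0 is hex with a "::i" (nocase) modifier; decode the hex part.
hex_body = parsed["subsigs"][0].split("::")[0]
decoded = bytes.fromhex(hex_body).decode("ascii")

print(parsed["name"], parsed["logic"], decoded)
```

The decoded value of subsignature 0 corresponds to the `<?php` shown in the sigtool output; subsignature 1 is the PCRE subsignature under discussion.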
sigtool output:
VIRUS NAME: TestSig.1
TDB: Engine:81-255,Target:0
LOGICAL EXPRESSION: 1>1
* SUBSIG ID 0
+-> OFFSET: ANY
+-> SIGMOD: NOCASE
+-> DECODED SUBSIGNATURE:
<?php
* SUBSIG ID 1
+-> OFFSET: ANY
+-> SIGMOD: NONE
+-> DECODED SUBSIGNATURE:
+-> TRIGGER: 0
+-> REGEX: var \$\w+[\s\=]+['"][\s\w\
+-> CFLAGS: \+=]+['"]\x3B/si
testfile
<?php
var $foo = "f4n8hvwcgidqqcohvdgsgwuwb4af4bfdcbk3y+jcrtjai5pja7xhj48x/56vk67koccswje9jqetat5s";
var $bar = "qwtsqy3fkubx/okyxq7zfpqyd+5tifhw7qxojipkxrp+xriisllgt2vydw61wdc97tpb+bmctwvnzayj";
Clam Config
Checking configuration files in /opt/clamav/etc
Config file: clamd.conf
LogFile = "/var/log//clamd.log"
LogTime = "yes"
PidFile = "/var/run/clamd/clamd.pid"
TemporaryDirectory = "/var/tmp/av/clamav"
DatabaseDirectory = "/opt/clamav/var/lib/clamav"
LocalSocket = "/tmp/clamd.socket"
TCPAddr = "127.0.0.1"
MaxThreads = "20"
User = "clamav"
PhishingScanURLs disabled
AlgorithmicDetection disabled
Config file: freshclam.conf
LogTime = "yes"
DatabaseDirectory = "/opt/clamav/var/lib/clamav"
UpdateLogFile = "/var/log/freshclam.log"
DatabaseMirror = "db.US.clamav.net", "database.clamav.net"
*** AllowSupplementaryGroups is DEPRECATED ***
clamav-milter.conf not found
Software settings
Version: 0.103.3
Optional features supported: MEMPOOL IPv6 BIGSTACK AUTOIT_EA06 BZIP2 LIBXML2 PCRE2 ICONV JSON
Database information
Database directory: /opt/clamav/var/lib/clamav
daily.cld: version 26343, sigs: 1941807, built on Thu Nov 4 03:22:31 2021
bytecode.cld: version 333, sigs: 92, built on Mon Mar 8 09:21:51 2021
main.cld: version 62, sigs: 6647427, built on Thu Sep 16 07:32:42 2021
...
Platform information
uname: Linux 4.14.129-grsec #1 SMP Mon Jun 24 20:37:52 CDT 2019 x86_64
OS: linux-gnu, ARCH: x86_64, CPU: x86_64
Full OS version: Debian GNU/Linux 8.11 (jessie)
zlib version: 1.2.8 (1.2.8), compile flags: a9
platform id: 0x0a217c7c0800000000040902
Build information
GNU C: 4.9.2 (4.9.2)
CPPFLAGS:
CFLAGS: -g -O2 -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
CXXFLAGS: -g -O2
LDFLAGS:
Configure: '--prefix=/opt/clamav/clamav' '--datarootdir=/opt/clamav/data' '--with-dbdir=/opt/clamav/var' '--with-dbdir=/opt/clamav/share' '--sysconfdir=/opt/clamav/etc' '--disable-clamonacc' '--disable-unrar' '--enable-bzip2' '--enable-xml' '--with-pcre=/opt/pcre2' '--enable-bigstack' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'
sizeof(void*) = 8
Engine flevel: 124, dconf: 124
I don't think the problem is what you've described. I tried some variations to identify the issue.
I confirmed that if you change the logical expression from 1>1 to 1>0, then:
test.ldb: TestSig.1;Engine:81-255,Target:0;1>0;3c3f706870::i;0/var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B/si
will match on:
<?php
var $foo = "f4n8hvwcgidqqcohvdgsgwuwb4af4bfdcbk3y+jcrtjai5pja7xhj48x/56vk67koccswje9jqetat5s";
and on:
<?php
var $foo = "f4n8hvwcgidqqcohvdgsgwuwb4af4bfdcbk3y+jcrtjai5pja7xhj48x/56vk67koccswje9jqetat5s";
I tested removing the \ escape from the \/ and found that
test.ldb: TestSig.1;Engine:81-255,Target:0;1>0;3c3f706870::i;0/var \$\w+[\s\=]+['"][\s\w/\+=]+['"]\x3B/si
will also match.
I think the issue is that PCRE subsignatures stop scanning after the first match and so aren't recording multiple matches.
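As a rough illustration of the difference between stopping at the first match and counting every match, the sketch below runs the same pattern over the test file with Python's `re` module. This is an assumption-laden stand-in: Python's `re` is not PCRE2 and this is not how ClamAV evaluates logical signatures; the helper `satisfies_more_than` is hypothetical. It only shows that the pattern itself matches twice in the sample, so a count-based expression such as 1>1 depends on all matches being recorded.

```python
import re

# Pattern copied from the subsignature; the "si" flags are mapped to
# re.S | re.I as an approximation of the PCRE flags.
PATTERN = re.compile(
    r"""var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B""", re.S | re.I
)

# The test file from the report.
SAMPLE = (
    '<?php\n'
    'var $foo = "f4n8hvwcgidqqcohvdgsgwuwb4af4bfdcbk3y+jcrtjai5pja7xhj48x/56vk67koccswje9jqetat5s";\n'
    'var $bar = "qwtsqy3fkubx/okyxq7zfpqyd+5tifhw7qxojipkxrp+xriisllgt2vydw61wdc97tpb+bmctwvnzayj";\n'
)

# Stop after the first hit: the count can never exceed 1.
first_only = 1 if PATTERN.search(SAMPLE) else 0

# Keep scanning after each hit: every non-overlapping match is counted.
all_matches = sum(1 for _ in PATTERN.finditer(SAMPLE))


def satisfies_more_than(count, threshold):
    """Hypothetical check for an 'ID>threshold' logical expression."""
    return count > threshold


print(first_only, satisfies_more_than(first_only, 1))
print(all_matches, satisfies_more_than(all_matches, 1))
```

With a first-match-only count, a threshold of 0 can be met but a threshold of 1 cannot, which is consistent with the behaviour described above.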
Found this while thinking about reporting a bug.
I believe the bug is in the output of sigtool --decode-sigs:
* SUBSIG ID 1
+-> OFFSET: ANY
+-> SIGMOD: NONE
+-> DECODED SUBSIGNATURE:
+-> TRIGGER: 0
+-> REGEX: var \$\w+[\s\=]+['"][\s\w\
+-> CFLAGS: \+=]+['"]\x3B/si
Should be:
* SUBSIG ID 1
+-> OFFSET: ANY
+-> SIGMOD: NONE
+-> DECODED SUBSIGNATURE:
+-> TRIGGER: 0
+-> REGEX: var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B
+-> CFLAGS: si
sigtool is splitting on the wrong forward slash (solidus), which makes the decoded output confusing.
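For reference, a split that honours escaped slashes could look something like the Python sketch below. The function `split_pcre_subsig` is hypothetical and is not taken from sigtool or libclamav; it assumes the subsignature has the form trigger, '/', regex, '/', flags, and that a backslash escapes the character that follows it.

```python
def split_pcre_subsig(subsig):
    """Split 'trigger/regex/flags', treating '\\/' as part of the regex.

    Hypothetical helper for illustration only.
    """
    start = subsig.index("/")          # delimiter that opens the regex
    i = start + 1
    while i < len(subsig):
        if subsig[i] == "\\":
            i += 2                     # skip the backslash and the escaped char
            continue
        if subsig[i] == "/":
            break                      # first unescaped '/' closes the regex
        i += 1
    else:
        raise ValueError("no unescaped closing '/' found")
    return subsig[:start], subsig[start + 1:i], subsig[i + 1:]


SUBSIG = r"""0/var \$\w+[\s\=]+['"][\s\w\/\+=]+['"]\x3B/si"""
trigger, regex, flags = split_pcre_subsig(SUBSIG)
print("TRIGGER:", trigger)
print("REGEX:", regex)
print("CFLAGS:", flags)
```

Applied to the subsignature from the report, this keeps the `\/` inside the regex and yields the flags `si`, matching the expected output above.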