libexpat icon indicating copy to clipboard operation
libexpat copied to clipboard

Limit on input amplification factor exceeded by non-attack input

Open dlangille opened this issue 1 year ago • 10 comments

This is:

  • expat-2.6.3
  • p5-XML-Parser-2.47
  • perl5-5.38.2

I'm processing the xml files from https://cgit.freebsd.org/ports/tree/security/vuxml

After a recent commit added another xml entry to the collection, I started getting this error:

limit on input amplification factor (from DTD and entities) breached at line 202, column 0, byte 6739
error in processing external entity reference at line 104, column 0, byte 3887 at /usr/local/lib/perl5/site_perl/mach/5.38/XML/Parser.pm line 187.

Reading https://libexpat.github.io/doc/api/latest/#XML_SetBillionLaughsAttackProtectionMaximumAmplification I see 'Note: If you ever need to increase this value for non-attack payload, please file a bug report.' - so here I am.

Steps to reproduce:

  • need p5-XML-DOM-XPath, p5-IO-String
  • mkdir ~/tmp
  • git clone https://git.FreeBSD.org/ports.git (this needs 2.3G sorry)
  • run:
[15:48 empty dvl ~/tmp] % perl process_vuxml2.pl --filename=/usr/home/dvl/tmp/ports/security/vuxml/vuln.xml
process_vuxml.pl starts
(there is usually a delay before further output)

limit on input amplification factor (from DTD and entities) breached at line 202, column 0, byte 6739
error in processing external entity reference at line 104, column 0, byte 3887 at /usr/local/lib/perl5/site_perl/mach/5.38/XML/Parser.pm line 187.

Example script:

[15:48 empty dvl ~/tmp] % cat process_vuxml2.pl
#!/usr/local/bin/perl -w
#
use strict;
use warnings;
use XML::DOM::XPath;
use Getopt::Long;

my $filename;

print 'process_vuxml.pl starts' . "\n";

GetOptions ('filename:s'     => \$filename);

print '(there is usually a delay before further output)' . "\n";

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ($filename);

print "'There, the parsefile has completed\n";

dlangille avatar Nov 23 '24 15:11 dlangille

FYI, I tried a few ways of patching.

This alone did not fix it:

 #define EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_MAXIMUM_AMPLIFICATION_DEFAULT   \
-  100.0f
+  200.0f

Adding this, allowed the processing to proceed.

 #define EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_ACTIVATION_THRESHOLD_DEFAULT    \
-  8388608 // 8 MiB, 2^23
+  16777216 // 8 MiB, 2^23
 /* NOTE END */

Further testing showed the first patch was not relevant.

[17:15 pkg01 dvl /usr/local/poudriere/ports/default/textproc/expat2] % cat files/patch-lib_internal.h
--- lib/internal.h.orig	2024-11-23 17:12:40 UTC
+++ lib/internal.h
@@ -144,7 +144,7 @@
 #define EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_MAXIMUM_AMPLIFICATION_DEFAULT   \
   100.0f
 #define EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_ACTIVATION_THRESHOLD_DEFAULT    \
-  8388608 // 8 MiB, 2^23
+  16777216 // 8 MiB, 2^23
 /* NOTE END */
 
 #include "expat.h" // so we can use type XML_Parser below

dlangille avatar Nov 23 '24 16:11 dlangille

Hi @dlangille,

thanks for the report — very interesting!

Based on shell session…

# cd "$(mktemp -d)"

# git clone --depth 1 https://git.FreeBSD.org/ports.git

# git --git-dir ports/.git rev-parse HEAD
68a559f5b8bb7b46c20b24ee5aea72d3a6249293

# EXPAT_ACCOUNTING_DEBUG=1 xmlwf -x ports/security/vuxml/vuln.xml ; echo $?
expat: Accounting(0x56144ad502a0): Direct       3898, indirect    8384714, amplification  2152.03 ABORTING
ports/security/vuxml/vuln/2003.xml:121:23: limit on input amplification factor (from DTD and entities) breached
ports/security/vuxml/vuln.xml:104:0: error in processing external entity reference
2

# EXPAT_ACCOUNTING_DEBUG=1 xmlwf -x -b $((1024 ** 2 * 9)) ports/security/vuxml/vuln.xml ; echo $?
0

…with Expat 2.6.4 I confirm that for the given input threshold 8 MiB is too low, the file exceeds it by about ~5% and setting a threshold of >=9 MiB works around the issue.

This will need careful consideration.

hartwork avatar Nov 23 '24 21:11 hartwork

@dlangille my impression is that…

  • adjusting the amplification limit from 100 to 2200 would defeat the purpose of the protection (to be explicit about that)
  • because the document is exceeding the amplification factor, we would need to set the threshold to something bigger than (simplified) the sum of all involved documents
  • the file is likely to grow over time and so the threshold would need to move from 8 MiB to say 9 MiB and then to 10 MiB next month or next year and so on — the only variable would be the rate of change
  • so that approach would not have an end and also delay detection of malicious input for everyone as a side effect
  • the setup is special in the sense that it has knowledge that something that would be dangerous with untrusted input — including use of external entities — is safe in your particular context
  • if something is safe in a particular case but not in the general case, one could argue the tooling — the environment — would need unlock that behavior and the default would be to block to stay safe
  • in that view, the Perl tool would need tell the Perl bindings to Expat that the file is considered fully trusted
  • API function XML_SetBillionLaughsAttackProtectionActivationThreshold of Expat >=2.4.0 would be the way to do that
  • but if https://github.com/cpan-authors/XML-Parser is the code behind it, they do not expose that functionality to the user as of today.
  • potentially the fix that works best for the ecosystem is to…
    • extend the API of the Perl bindings to set a custom threshold
    • make process_vuxml2.pl use that bindings
    • update the Perl bindings to recent enough on all systems supposed to run process_vuxml2.pl

What do you think?

hartwork avatar Nov 24 '24 17:11 hartwork

How complex would setting EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_ACTIVATION_THRESHOLD_DEFAULT at run time? It's a constant now.

The reason I ask: if we trust/know the the file, and know that it's of size X, we tell expat, this is good up to size X * 1.5, for example. That would preserve the safety check at the expense of changing expat considerably.

I am also OK with your suggestion. Sounds good to me. For myself, I have a short term fix (patching expat). I am happy to test any changes.

I am willing to open an issue with XML-Parser asking them to expose that functionality.

dlangille avatar Nov 24 '24 17:11 dlangille

How complex would setting EXPAT_BILLION_LAUGHS_ATTACK_PROTECTION_ACTIVATION_THRESHOLD_DEFAULT at run time? It's a constant now.

The reason I ask: if we trust/know the the file, and know that it's of size X, we tell expat, this is good up to size X * 1.5, for example. That would preserve the safety check at the expense of changing expat considerably.

@dlangille API function XML_SetBillionLaughsAttackProtectionActivationThreshold (and sibling XML_SetBillionLaughsAttackProtectionMaximumAmplification) of Expat >=2.4.0 is the official mean to change the value at runtime (even to different values for different parsers depending on — say — the source of parse input). One alternative would be a new environment variable but I'd have a hard time to support a magic side channel when there is stable official API for the same feature that it would also be competing with.

I am also OK with your suggestion. Sounds good to me. For myself, I have a short term fix (patching expat). I am happy to test any changes.

I am willing to open an issue with XML-Parser asking them to expose that functionality.

That would be greatly appreciated :pray:

hartwork avatar Nov 24 '24 23:11 hartwork

Does https://github.com/cpan-authors/XML-Parser/issues/102 read well enough. I'm concerned about expressing the issue correctly after a good feed of lamb rogan josh.

dlangille avatar Nov 25 '24 00:11 dlangille

@dlangille :smile: Thanks! Let's see what we get.

hartwork avatar Nov 25 '24 20:11 hartwork

Should FreeBSD be changing their XML to not trip this, presumably by putting everything in a single file instead of relying on external entities?

DemiMarie avatar Mar 15 '25 18:03 DemiMarie

@DemiMarie my personal favorite would be the FreeBSD Foundation sponsoring a related pull request to Perl XML::Parser: not to me but to someone at home with Perl. XML::Parser needs to expose the related API sooner or later and it would help the whole community not just FreeBSD, while stretching a very FreeBSD-specific itch at the same time. I would love to see that happening. I'm not sure it's possible.

Related:

  • The list of API functions missing can be found at: https://github.com/cpan-authors/XML-Parser/issues/102#issuecomment-2614684325 .
  • Being open to pull requests was voiced at https://github.com/cpan-authors/XML-Parser/issues/102#issuecomment-2568285667 .

CC @dag-erling @emaste

hartwork avatar Mar 15 '25 21:03 hartwork

This is not a request for action. Just FYI.

textproc/expat2 was updated to 2.7.2 yesterday - that started breaking my local builds, because my local patch for this problem no longer cleanly applied. No worries, I regenerated the patch and it builds again.

dlangille avatar Sep 25 '25 12:09 dlangille