PHPWord
TemplateProcessor::cloneBlock() doesn't work with PHP v7.3+
Describe the Bug
The regex used to find the block does not match on one of our templates. The problem appears to be related to lazy quantifiers on large strings under PHP 7.3 and later; the same template works with PHP 7.2.
Steps to Reproduce
I can't share the template because it's confidential, but:
- When I test lazy quantifiers on simple tests nothing is wrong (PHP v7.3 + v7.4).
- When I test with the exact same data minus the unneeded tags, it works too (PHP v7.3 + v7.4).
- When I test with the full file content there is no match with PHP v7.3+ but a match with v7.2.
Expected Behavior
The blocks have to be cloned...
Current Behavior
...but the tags remain visible.
Context
- PHP Version: 7.3+
- PHPWord Version: 0.17.0
We experienced similar issues and dug a bit deeper. It seems to be a problem not with the PHP version but with the underlying PCRE version: running on PCRE 10.33 works, PCRE 10.34 fails.
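To compare environments, it can help to print the PCRE version PHP is linked against alongside the relevant ini settings. This is a minimal sketch; it relies only on the built-in PCRE_VERSION and PHP_VERSION constants and ini_get(), and the output format is just an illustration.

```php
<?php
// PCRE_VERSION is a built-in constant holding the version followed by a release date.
// Keep only the leading version number (the part before the first space).
$pcreVersion = explode(' ', PCRE_VERSION)[0];

// Print everything that is likely to matter when comparing two servers.
printf(
    "PHP %s, PCRE %s, pcre.backtrack_limit=%s, pcre.jit=%s\n",
    PHP_VERSION,
    $pcreVersion,
    var_export(ini_get('pcre.backtrack_limit'), true),
    var_export(ini_get('pcre.jit'), true)
);
```

Running this on a working and a failing machine shows whether they differ in PCRE version, backtrack limit, or JIT setting.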
I ran into other problematic cases, probably when the matches returned by preg_match() are too big (that is a guess, but a likely one). I'm new to the Word file format, but I've rewritten this function based on what I've seen in my cases. The function to replace is in /src/PhpWord/TemplateProcessor.php, and here is my alternative, which uses mb_*str*() functions and works in all of my cases so far:
public function cloneBlock($blockname, $clones = 1, $replace = true, $indexVariables = false, $variableReplacements = null)
{
    // Locate the opening ${blockname} tag.
    $idx_tag = mb_strpos($this->tempDocumentMainPart, '${'.$blockname.'}');
    if ( $idx_tag === false )
        return null;

    // Start of the paragraph holding the opening tag, and position of the closing ${/blockname} tag.
    $idx_start = mb_strrpos(mb_substr($this->tempDocumentMainPart, 0, $idx_tag), '<w:p ' );
    $idx_end = mb_strpos( $this->tempDocumentMainPart, '${/'.$blockname.'}', $idx_tag);
    if ( $idx_start === false || $idx_end === false )
        return null;

    // Extend the end to just past the </w:p> (6 characters) that closes the closing tag's paragraph.
    $idx_end = mb_strpos($this->tempDocumentMainPart, '</w:p>', $idx_end);
    if ( $idx_end === false )
        return null;
    $idx_end += 6;

    // $what is the whole block, including both tag paragraphs.
    $what = mb_substr($this->tempDocumentMainPart, $idx_start, $idx_end - $idx_start);

    // The content runs from the end of the first paragraph to the start of the last one.
    $idx_content_start = mb_strpos($what, 'p>');
    $idx_content_end = mb_strrpos($what, '<w:p ');
    if ( $idx_content_start === false || $idx_content_end === false )
        return null;
    $idx_content_start += 2;
    $xmlBlock = mb_substr($what, $idx_content_start, $idx_content_end - $idx_content_start);

    if ( $replace )
    {
        // Build the cloned content and swap it in for the whole block.
        $by = array();
        if ( $indexVariables )
            $by = $this->indexClonedVariables($clones, $xmlBlock);
        elseif ( $variableReplacements !== null && is_array($variableReplacements) )
            $by = $this->replaceClonedVariables($variableReplacements, $xmlBlock);
        else
            for ( $i = 1 ; $i <= $clones ; $i++ )
                $by[] = $xmlBlock;
        $by = implode('', $by);
        $this->tempDocumentMainPart = str_replace($what, $by, $this->tempDocumentMainPart);
    }

    return $xmlBlock;
}
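For anyone who wants to try the string-search idea outside of PHPWord, here is a minimal standalone sketch on a toy document. The helper name extractBlock and the sample XML are hypothetical and not part of PHPWord. It also differs from the function above in two ways that are my own assumptions: it uses byte-based strpos()/substr(), which is safe here because the delimiters searched for are plain ASCII, and it accepts both `<w:p>` and `<w:p ` as a paragraph start.

```php
<?php
// Hypothetical standalone helper (not part of PHPWord): return the XML that sits
// between the paragraph holding ${name} and the paragraph holding ${/name}.
function extractBlock(string $xml, string $blockname): ?string
{
    $open = strpos($xml, '${' . $blockname . '}');
    if ($open === false) {
        return null;
    }
    $close = strpos($xml, '${/' . $blockname . '}', $open);
    if ($close === false) {
        return null;
    }

    // Content starts right after the </w:p> that ends the opening tag's paragraph.
    $contentStart = strpos($xml, '</w:p>', $open);
    if ($contentStart === false) {
        return null;
    }
    $contentStart += strlen('</w:p>');

    // Content ends where the paragraph holding the closing tag begins.
    // Paragraphs may start with "<w:p>" or "<w:p " (with attributes).
    $head = substr($xml, 0, $close);
    $plain = strrpos($head, '<w:p>');
    $withAttrs = strrpos($head, '<w:p ');
    $contentEnd = max($plain === false ? -1 : $plain, $withAttrs === false ? -1 : $withAttrs);
    if ($contentEnd < $contentStart) {
        return null;
    }

    return substr($xml, $contentStart, $contentEnd - $contentStart);
}

// Toy document: three paragraphs, the middle one is the block content.
$xml = '<w:body>'
    . '<w:p><w:r><w:t>${block}</w:t></w:r></w:p>'
    . '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>'
    . '<w:p w:rsidR="00A1"><w:r><w:t>${/block}</w:t></w:r></w:p>'
    . '</w:body>';

$block = extractBlock($xml, 'block');
```

Because no regular expression is involved, this approach cannot hit the PCRE backtrack limit regardless of how large the document is.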
@cedrictailly's function worked for me, thank you 👍
@cedrictailly This solution worked in my case. Thank you
cloneBlock and other functions using preg_match ignore the return result from preg_match and therefore any errors. Checking preg_last_error after a failed cloneBlock shows the PREG_BACKTRACK_LIMIT_ERROR error.
Increasing pcre.backtrack_limit fixes this.
Adding ini_set("pcre.backtrack_limit", "2000000");
(double the default) to your script before calling TemplateProcessor::cloneBlock() works for me. For some large documents you may need to increase it further or set the limit to unlimited (-1).
PR [#1354] looks like the cleanest replacement, since it uses XML parsing rather than regular expressions, but in the meantime increasing the backtrack limit saves you from having to keep a patched TemplateProcessor.
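The check described above can be sketched as a small wrapper that treats a false return from preg_match() as an error instead of "no match". The helper name pregMatchOrFail is hypothetical and not part of PHPWord; preg_last_error() and the PREG_BACKTRACK_LIMIT_ERROR constant are standard PHP. The pattern in the usage lines is only an illustration, not the regex PHPWord uses.

```php
<?php
// Hypothetical helper (not part of PHPWord): run preg_match() and surface PCRE
// failures instead of silently treating them as "no match".
function pregMatchOrFail(string $pattern, string $subject): ?array
{
    $result = preg_match($pattern, $subject, $matches);

    // preg_match() returns false on failure, 0 for no match and 1 for a match.
    if ($result === false) {
        $code = preg_last_error();
        $reason = $code === PREG_BACKTRACK_LIMIT_ERROR
            ? 'PREG_BACKTRACK_LIMIT_ERROR (consider raising pcre.backtrack_limit)'
            : 'PCRE error code ' . $code;
        throw new RuntimeException($reason);
    }

    return $result === 1 ? $matches : null;
}

// Raise the limit before matching; "2000000" is double the default of 1000000.
ini_set('pcre.backtrack_limit', '2000000');

// Illustrative pattern only: capture the name inside a ${...} macro.
$found = pregMatchOrFail('/\$\{(.*?)\}/', '<w:t>${block_name}</w:t>');
```

With a wrapper like this, a template that exhausts the backtrack limit fails loudly with the error name rather than leaving the tags in the output.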
This worked, although I had to keep increasing the limit... I might also need to check for a newer version of PHPOffice. But thanks for the quick fix in my case!
Your code works perfectly! Thank you very much cedrictailly
You're welcome 😉
This works fine for me on PHP 5.6+. I added ini_set("pcre.backtrack_limit", "2000000"); at the beginning of the cloneBlock function and voilà. P.S. I still had to max out the limit for larger docx files.
I got this too (after changing from PHP 7.3 to 7.4, I think). Changing pcre.backtrack_limit did not work, so I ended up replacing the cloneBlock function, which does work.
@cedrictailly fantastic!
Love the solution @cedrictailly! Worked like a charm. I just made a small fix to cater for custom tags:
public function cloneBlock($blockname, $clones = 1, $replace = true, $indexVariables = false, $variableReplacements = null)
{
    // Use the configured macro delimiters instead of the hard-coded '${' and '}'.
    $escapedMacroOpeningChars = self::$macroOpeningChars;
    $escapedMacroClosingChars = self::$macroClosingChars;

    // Locate the opening block tag.
    $idx_tag = mb_strpos($this->tempDocumentMainPart, $escapedMacroOpeningChars.$blockname.$escapedMacroClosingChars);
    if ( $idx_tag === false )
        return null;

    // Start of the paragraph holding the opening tag, and position of the closing block tag.
    $idx_start = mb_strrpos(mb_substr($this->tempDocumentMainPart, 0, $idx_tag), '<w:p ' );
    $idx_end = mb_strpos( $this->tempDocumentMainPart, $escapedMacroOpeningChars.'/'.$blockname.$escapedMacroClosingChars, $idx_tag);
    if ( $idx_start === false || $idx_end === false )
        return null;

    // Extend the end to just past the </w:p> (6 characters) that closes the closing tag's paragraph.
    $idx_end = mb_strpos($this->tempDocumentMainPart, '</w:p>', $idx_end);
    if ( $idx_end === false )
        return null;
    $idx_end += 6;

    // $what is the whole block, including both tag paragraphs.
    $what = mb_substr($this->tempDocumentMainPart, $idx_start, $idx_end - $idx_start);

    // The content runs from the end of the first paragraph to the start of the last one.
    $idx_content_start = mb_strpos($what, 'p>');
    $idx_content_end = mb_strrpos($what, '<w:p ');
    if ( $idx_content_start === false || $idx_content_end === false )
        return null;
    $idx_content_start += 2;
    $xmlBlock = mb_substr($what, $idx_content_start, $idx_content_end - $idx_content_start);

    if ( $replace )
    {
        // Build the cloned content and swap it in for the whole block.
        $by = array();
        if ( $indexVariables )
            $by = $this->indexClonedVariables($clones, $xmlBlock);
        elseif ( $variableReplacements !== null && is_array($variableReplacements) )
            $by = $this->replaceClonedVariables($variableReplacements, $xmlBlock);
        else
            for ( $i = 1 ; $i <= $clones ; $i++ )
                $by[] = $xmlBlock;
        $by = implode('', $by);
        $this->tempDocumentMainPart = str_replace($what, $by, $this->tempDocumentMainPart);
    }

    return $xmlBlock;
}