TemplateProcessor::cloneBlock() doesn't work with PHP v7.3+

Open cedrictailly opened this issue 4 years ago • 12 comments

Describe the Bug

The regex used to find the block doesn't match on a template we use. The problem seems related to lazy quantifiers, large strings, and PHP v7.3 or later; it works fine with PHP v7.2.

Steps to Reproduce

I can't send the template because it's confidential, but:

  • When I test lazy quantifiers on simple tests nothing is wrong (PHP v7.3 + v7.4).
  • When I test with the exact same data minus the unneeded tags, it works too (PHP v7.3 + v7.4).
  • When I test with the full file content there is no match with PHP v7.3+ but a match with v7.2.

Expected Behavior

The blocks should be cloned...

Current Behavior

...but the tags remain visible.

Context

  • PHP Version: 7.3+
  • PHPWord Version: 0.17.0

cedrictailly, Mar 10 '20 19:03

We experienced similar issues and dug a bit deeper. It seems to be a problem with the underlying PCRE version rather than with the PHP version: running on PCRE 10.33 works, PCRE 10.34 fails.
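To see which PCRE build a given PHP installation links against, a quick check along these lines may help. It is only a sketch: the `10.34` threshold comes from the single data point reported above, and the variable names are illustrative.

```php
<?php
// PCRE_VERSION looks like "10.34 2019-11-21"; keep only the numeric part.
$pcreNumber = explode(' ', PCRE_VERSION)[0];

// Assumption: treat 10.34 and later as "reported as failing", based only on
// the observation in this thread (10.33 works, 10.34 fails).
$reportedAsFailing = version_compare($pcreNumber, '10.34', '>=');

echo 'PHP ', PHP_VERSION, ' / PCRE ', $pcreNumber, PHP_EOL;
echo $reportedAsFailing
    ? 'This PCRE version is in the range reported as failing.'
    : 'This PCRE version is older than the one reported as failing.';
echo PHP_EOL;
```

The same information is shown in the "pcre" section of phpinfo().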

cHeeSaW, Apr 22 '20 05:04

I hit other problematic cases, probably when the matches returned by preg_match() are too big. I'm a complete newbie with the Word file format, but I rewrote this function based on what I saw in my cases. The function to replace is in /src/PhpWord/TemplateProcessor.php, and here is my alternative, which uses mb_*str*() functions and works in all of my cases at least:

public function cloneBlock($blockname, $clones = 1, $replace = true, $indexVariables = false, $variableReplacements = null)
{
    // Locate the opening tag ${blockname}.
    $idx_tag = mb_strpos($this->tempDocumentMainPart, '${'.$blockname.'}');

    if ( $idx_tag === false )
        return null;

    // Start of the paragraph holding the opening tag (assumes "<w:p " with attributes),
    // and position of the closing tag ${/blockname}.
    $idx_start = mb_strrpos(mb_substr($this->tempDocumentMainPart, 0, $idx_tag), '<w:p ');
    $idx_end   = mb_strpos($this->tempDocumentMainPart, '${/'.$blockname.'}', $idx_tag);

    if ( $idx_start === false || $idx_end === false )
        return null;

    // End of the paragraph holding the closing tag.
    $idx_end = mb_strpos($this->tempDocumentMainPart, '</w:p>', $idx_end);

    if ( $idx_end === false )
        return null;

    $idx_end += 6; // length of '</w:p>'

    // $what spans both tag paragraphs and everything between them.
    $what = mb_substr($this->tempDocumentMainPart, $idx_start, $idx_end - $idx_start);

    // --- //

    // The block content runs from just after the first "p>" in $what (expected to close
    // the opening tag's paragraph) to the start of the last paragraph (the closing tag's).
    $idx_content_start = mb_strpos($what, 'p>');
    $idx_content_end   = mb_strrpos($what, '<w:p ');

    if ( $idx_content_start === false || $idx_content_end === false )
        return null;

    $idx_content_start += 2; // length of 'p>'

    $xmlBlock = mb_substr($what, $idx_content_start, $idx_content_end - $idx_content_start);

    // --- //

    if ( $replace )
    {
        $by = array();

        if ( $indexVariables )
            $by = $this->indexClonedVariables($clones, $xmlBlock);
        elseif ( $variableReplacements !== null && is_array($variableReplacements) )
            $by = $this->replaceClonedVariables($variableReplacements, $xmlBlock);
        else
            for ( $i = 1 ; $i <= $clones ; $i++ )
                $by[] = $xmlBlock;

        $by = implode('', $by);

        // Replace the tagged region (both tag paragraphs included) with the clones.
        $this->tempDocumentMainPart = str_replace($what, $by, $this->tempDocumentMainPart);
    }

    return $xmlBlock;
}
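To try the string-search idea outside PHPWord, here is a minimal standalone sketch. The function name `extractBlock` and the sample XML are hypothetical, and it swaps in the byte-based strpos()/substr() where the function above uses mb_*; like the function above, it assumes paragraphs open with `<w:p ` followed by attributes.

```php
<?php
// Hypothetical helper: return the XML between the paragraph that holds
// ${name} and the paragraph that holds ${/name}, or null if not found.
function extractBlock(string $xml, string $blockname): ?string
{
    $open = strpos($xml, '${' . $blockname . '}');
    if ($open === false) {
        return null;
    }

    $close = strpos($xml, '${/' . $blockname . '}', $open);
    if ($close === false) {
        return null;
    }

    // Content starts right after the paragraph holding the opening tag...
    $contentStart = strpos($xml, '</w:p>', $open);
    // ...and ends where the paragraph holding the closing tag begins.
    $contentEnd = strrpos(substr($xml, 0, $close), '<w:p ');
    if ($contentStart === false || $contentEnd === false) {
        return null;
    }

    $contentStart += strlen('</w:p>');
    if ($contentEnd < $contentStart) {
        return null; // Both tags sit in the same paragraph: nothing between them.
    }

    return substr($xml, $contentStart, $contentEnd - $contentStart);
}

$xml = '<w:p w:a="1"><w:r><w:t>${block}</w:t></w:r></w:p>'
     . '<w:p w:a="2"><w:r><w:t>Hello</w:t></w:r></w:p>'
     . '<w:p w:a="3"><w:r><w:t>${/block}</w:t></w:r></w:p>';

$block = extractBlock($xml, 'block');
echo $block, PHP_EOL; // prints the middle paragraph only
```

Because no regex is involved, the PCRE backtrack limit never comes into play, whatever the document size.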

cedrictailly, Apr 24 '20 15:04

@cedrictailly's function worked for me, thank you 👍

burhanudinr, Aug 24 '20 02:08

@cedrictailly This solution worked in my case. Thank you

Stender89, Sep 14 '20 12:09

cloneBlock() and other functions using preg_match() ignore its return value and therefore miss any errors. Calling preg_last_error() after a failed cloneBlock() returns PREG_BACKTRACK_LIMIT_ERROR.
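A minimal sketch of what checking the return value could look like. The wrapper `safePregMatch` is hypothetical (it is not part of PHPWord), and the tiny limit of 1000 with JIT disabled is only an assumption to make the failure easy to reproduce on a small input:

```php
<?php
// Hypothetical wrapper: run preg_match() and turn a PCRE failure into an exception.
function safePregMatch(string $pattern, string $subject, ?array &$matches = null): int
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === false) {
        $code = preg_last_error();
        $reason = $code === PREG_BACKTRACK_LIMIT_ERROR
            ? 'backtrack limit exhausted'
            : 'PCRE error code ' . $code;
        throw new RuntimeException('preg_match failed: ' . $reason, $code);
    }
    return $result;
}

// Assumptions for a reproducible demo: interpreter only, very low limit.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// A lazy quantifier that has to walk across a long run of XML.
$subject = '${block}' . str_repeat('<w:t>x</w:t>', 5000) . '${/block}';
$pattern = '/\$\{block\}.*?\$\{\/block\}/s';

try {
    safePregMatch($pattern, $subject);
    $outcome = 'matched';
} catch (RuntimeException $e) {
    $outcome = $e->getMessage();
}
echo $outcome, PHP_EOL;

// With a larger limit the same pattern matches.
ini_set('pcre.backtrack_limit', '1000000');
$matchedAfterRaise = safePregMatch($pattern, $subject);
echo $matchedAfterRaise, PHP_EOL;
```

Without such a check, a false return is indistinguishable from "no match", which is consistent with the tags silently staying in the document.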

Increasing pcre.backtrack_limit fixes this.

Adding ini_set("pcre.backtrack_limit", "2000000"); (double the default) to your script before calling TemplateProcessor::cloneBlock() works for me. For some large documents you may need to increase it further or set the limit to unlimited (-1).
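If you would rather not change the limit for the whole script, one option is to raise it only around the call and restore it afterwards. This is a sketch: `withBacktrackLimit` is a hypothetical helper, and the commented-out usage assumes a template file named `template.docx` containing a block called `block`.

```php
<?php
// Hypothetical helper: run $task with a temporary pcre.backtrack_limit.
function withBacktrackLimit(string $limit, callable $task)
{
    // ini_set() returns the previous value as a string, or false on failure.
    $previous = ini_set('pcre.backtrack_limit', $limit);
    try {
        return $task();
    } finally {
        if ($previous !== false) {
            ini_set('pcre.backtrack_limit', $previous);
        }
    }
}

// Self-contained check that the limit is raised inside and restored after.
$before = ini_get('pcre.backtrack_limit');
$inside = withBacktrackLimit('2000000', function () {
    return ini_get('pcre.backtrack_limit');
});
$after = ini_get('pcre.backtrack_limit');
echo "before=$before inside=$inside after=$after", PHP_EOL;

// Assumed usage with PHPWord (file and block names are placeholders):
// $processor = new \PhpOffice\PhpWord\TemplateProcessor('template.docx');
// withBacktrackLimit('2000000', function () use ($processor) {
//     return $processor->cloneBlock('block', 3);
// });
```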

PR [#1354] does look to be the cleanest replacement, using XML parsing rather than regular expressions, but in the meantime increasing the backtrack limit saves you from having to maintain a patched TemplateProcessor.

lucasnetau, Oct 01 '20 06:10

Increasing the backtrack limit worked, although I had to keep increasing it... I might also need to check for a newer version of PHPOffice. But thanks for the quick fix in my case!

Klownicle, Feb 06 '21 04:02

Your code works perfectly! Thank you very much, cedrictailly

Sergi17138, Sep 22 '21 06:09

You're welcome 😉

cedrictailly, Sep 22 '21 08:09

Increasing the limit works fine for me on PHP 5.6+: I added ini_set("pcre.backtrack_limit", "2000000"); at the beginning of the cloneBlock function and voilà. P.S. I still had to max out the limit for larger docx files.

glcprm91, Oct 14 '21 10:10

I ran into this too (after changing from PHP 7.3 to 7.4, I think). Changing pcre.backtrack_limit did not work, so I ended up replacing the cloneBlock function, which does work.

web-assistant, Dec 21 '21 15:12

@cedrictailly fantastic!

crazywhalecc, Nov 08 '23 07:11

Love the solution, @cedrictailly! Worked like a charm. I just made a small fix to cater for custom tags:

public function cloneBlock($blockname, $clones = 1, $replace = true, $indexVariables = false, $variableReplacements = null)
{
    // Use the configured macro delimiters instead of the hard-coded "${" and "}".
    // No escaping is needed because they are used in plain string searches, not in a regex.
    $macroOpeningChars = self::$macroOpeningChars;
    $macroClosingChars = self::$macroClosingChars;

    // Locate the opening tag.
    $idx_tag = mb_strpos($this->tempDocumentMainPart, $macroOpeningChars.$blockname.$macroClosingChars);

    if ( $idx_tag === false )
        return null;

    // Start of the paragraph holding the opening tag (assumes "<w:p " with attributes),
    // and position of the closing tag.
    $idx_start = mb_strrpos(mb_substr($this->tempDocumentMainPart, 0, $idx_tag), '<w:p ');
    $idx_end   = mb_strpos($this->tempDocumentMainPart, $macroOpeningChars.'/'.$blockname.$macroClosingChars, $idx_tag);

    if ( $idx_start === false || $idx_end === false )
        return null;

    // End of the paragraph holding the closing tag.
    $idx_end = mb_strpos($this->tempDocumentMainPart, '</w:p>', $idx_end);

    if ( $idx_end === false )
        return null;

    $idx_end += 6; // length of '</w:p>'

    // $what spans both tag paragraphs and everything between them.
    $what = mb_substr($this->tempDocumentMainPart, $idx_start, $idx_end - $idx_start);

    // --- //

    // The block content runs from just after the first "p>" in $what (expected to close
    // the opening tag's paragraph) to the start of the last paragraph (the closing tag's).
    $idx_content_start = mb_strpos($what, 'p>');
    $idx_content_end   = mb_strrpos($what, '<w:p ');

    if ( $idx_content_start === false || $idx_content_end === false )
        return null;

    $idx_content_start += 2; // length of 'p>'

    $xmlBlock = mb_substr($what, $idx_content_start, $idx_content_end - $idx_content_start);

    // --- //

    if ( $replace )
    {
        $by = array();

        if ( $indexVariables )
            $by = $this->indexClonedVariables($clones, $xmlBlock);
        elseif ( $variableReplacements !== null && is_array($variableReplacements) )
            $by = $this->replaceClonedVariables($variableReplacements, $xmlBlock);
        else
            for ( $i = 1 ; $i <= $clones ; $i++ )
                $by[] = $xmlBlock;

        $by = implode('', $by);

        // Replace the tagged region (both tag paragraphs included) with the clones.
        $this->tempDocumentMainPart = str_replace($what, $by, $this->tempDocumentMainPart);
    }

    return $xmlBlock;
}

MichaelB2018, Mar 01 '24 22:03