PHPWord
fix block handling in template processor
- more relaxed but reliable regexp to detect blocks
- support multiple blocks with same name
Maybe fixes #2444 and #2410. Related: #1836, #1354.
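To illustrate the two points above, here is a minimal sketch of detecting several blocks that share a name with one lazy regexp. It is not the regexp from this PR: `findBlocks` is a hypothetical standalone helper, and it assumes the `${name}` ... `${/name}` macro syntax with unbroken macros, ignoring the surrounding `w:p` handling that the template processor needs.

```php
<?php
// Hypothetical helper, not part of PHPWord: returns every ${name}...${/name}
// pair found in $xml, in document order.
function findBlocks(string $xml, string $name): array
{
    $quoted = preg_quote($name, '/');
    // Lazy (.*?) pairs each opening macro with the nearest closing macro,
    // so two blocks with the same name are reported separately.
    // The "s" flag lets "." match newlines inside a block.
    $pattern = '/\$\{' . $quoted . '\}(.*?)\$\{\/' . $quoted . '\}/s';
    preg_match_all($pattern, $xml, $matches, PREG_OFFSET_CAPTURE);

    $blocks = [];
    foreach ($matches[0] as $i => [$text, $offset]) {
        $blocks[] = [
            'start' => $offset,                  // byte offset of the opening macro
            'end'   => $offset + strlen($text),  // byte offset just past the closing macro
            'inner' => $matches[1][$i][0],       // content between the two macros
        ];
    }

    return $blocks;
}
```

Calling `findBlocks($xml, 'row')` on a document with two `row` blocks yields two entries, which a caller could then clone or replace one at a time.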
Checklist:
- [x] I have run composer run-script check --timeout=0 and no errors were reported
- [x] The new code is covered by unit tests (check build/coverage for coverage report)
- [x] I have updated the documentation to describe the changes
coverage: 81.467% (-0.005%) from 81.472% when pulling 50e23b26ac37d1df23cd3cfcfa24c24c05d41106 on glaszig:template-processor-blocks into c17c4c7659cbf7fc8a36a39332e9e2deb5d47b59 on PHPOffice:master.
@coveralls don't be that pedantic.
Why don't you use the findContainingXmlBlockForMacro function?
It follows much the same logic as your regexp, but your regexp looks expensive for big files.
By contrast, findContainingXmlBlockForMacro first uses findMacro to locate the macro directly, then searches from that position for the nearest opening w:p.
I have thought about this problem, and my conclusion is that not everything should be done with a regexp if complexity is to stay manageable (otherwise the regexp would need lookahead or lookbehind, which can cost far more than O(n)).
I also customized this function a little, like this:
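The two-step idea described above (locate the macro with a plain string search, then look for the enclosing paragraph) could be sketched as follows. This is not PHPWord's implementation: `findContainingParagraph` is a hypothetical standalone function, and it assumes the macro sits inside a single, non-nested `w:p` element.

```php
<?php
// Hypothetical standalone function; PHPWord's own method has a different signature.
// Returns the byte offsets of the <w:p>...</w:p> element that contains $macro,
// or null when the macro or its paragraph cannot be found.
function findContainingParagraph(string $xml, string $macro): ?array
{
    // Step 1: plain substring search, linear in the document size.
    $pos = strpos($xml, $macro);
    if ($pos === false) {
        return null;
    }

    // Step 2: search backwards for the nearest paragraph opening.
    // Both "<w:p>" and "<w:p " are checked so that "<w:pPr>" is not matched.
    $before = substr($xml, 0, $pos);
    $candidates = array_filter(
        [strrpos($before, '<w:p>'), strrpos($before, '<w:p ')],
        'is_int'
    );
    if ($candidates === []) {
        return null;
    }
    $start = max($candidates);

    // Step 3: search forwards for the paragraph end.
    $close = strpos($xml, '</w:p>', $pos);
    if ($close === false) {
        return null;
    }

    return ['start' => $start, 'end' => $close + strlen('</w:p>')];
}
```

The slice `substr($xml, $where['start'], $where['end'] - $where['start'])` then gives the whole paragraph, without running a regexp over the full document.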
/**
 * Clone a block.
 *
 * @param string $blockname
 * @param int $clones How many times the block should be cloned
 * @param bool $replace
 * @param string $separator 'pagebreak', 'linebreak', or anything else for no separator
 * @return array|false Position of the block found, or false
 */
public function cloneBlock($blockname, $clones = 1, $replace = true, $separator = 'linebreak')
{
    // Separator inserted between the clones
    switch ($separator) {
        case "pagebreak":
            // Backslashes are doubled so the last one does not escape the closing quote
            $objectClass = 'PhpOffice\\PhpWord\\Element\\' . 'PageBreak';
            $xml_obj = new $objectClass();
            $xml_separator = $this->getXmlObject($xml_obj);
            break;
        case "linebreak":
            $objectClass = 'PhpOffice\\PhpWord\\Element\\' . 'TextBreak';
            $xml_obj = new $objectClass();
            $xml_separator = $this->getXmlObject($xml_obj);
            break;
        default:
            $xml_separator = '';
            break;
    }

    // Get the block to be cloned and generate the cloned data
    $rawClonedData = '';
    $xmlBlock = null;
    $where = $this->findContainingXmlBlockForMacro($blockname, 'w:p', 'Block', 1, 'regexp');
    if (false !== $where) {
        $xmlBlock = $this->getSlice($where['start'], $where['end']);
        $cloned = array_fill(0, $clones, $xmlBlock);
        $rawClonedData = implode($xml_separator, $cloned);
        // Strip the opening and closing block macros from the cloned data
        $rawClonedData = str_replace(self::$macroBlockOpeningChars.$blockname.self::$macroBlockClosingChars, '', $rawClonedData);
        $rawClonedData = str_replace(self::$macroBlockOpeningChars.self::$close_block_char.$blockname.self::$macroBlockClosingChars, '', $rawClonedData);
    }

    // Insert the cloned data
    if ($replace and (strlen($rawClonedData) > 0)) {
        $this->replaceXmlBlock($blockname, $rawClonedData, 'w:p', 'Block', 1);
    }

    return $where;
}
I then customized findMacro to be able to compare different search algorithms:

protected function findMacro($search, $offset = 0, $macro_type = 'Block', $macro_index = 1, $search_algorithm = 'regexp')
{
    $search = static::ensureMacroCompleted($search, $macro_type);

    if ('regexp' == $search_algorithm) {
        // Case-insensitive regexp search for the $macro_index-th occurrence after $offset
        $pattern_macro = $this->regexp_delimiter.preg_quote($search, $this->regexp_delimiter).$this->regexp_delimiter.'i';
        preg_match_all($pattern_macro, substr($this->tempDocumentMainPart, $offset), $matches, PREG_OFFSET_CAPTURE);
        $pos = (($macro_index >= 1) and (count($matches[0]) >= $macro_index)) ? ($offset + $matches[0][$macro_index-1][1]) : -1;
    } elseif ('recursive' == $search_algorithm) {
        // Repeated strpos calls, each starting just past the previous match
        $pos = ($macro_index >= 1) ? ($offset - 1*strlen($search)) : false;
        for ($i = 1; $i <= $macro_index; $i++) {
            $pos = strpos($this->tempDocumentMainPart, $search, $pos + strlen($search));
            if ($pos === false) {
                break;
            }
        }
    } else {
        // Plain search for the first occurrence at or after $offset
        $pos = strpos($this->tempDocumentMainPart, $search, $offset);
    }

    return ($pos === false) ? -1 : $pos;
}
Of course, all of this depends on fixBrokenMacros, which I also modified so that it merges all the w:t elements containing the macro without breaking styles or other data.
So, sorry for the long post, but the summary is that breaking the problem into functions seems a good way to organise the TemplateProcessor processing. What do you think?
Any update on this PR?
BUMP
This doesn't always work. It didn't work for me when the block wrapped a table.
Any update on this fix, please?