fix block handling in template processor

Open glaszig opened this issue 2 years ago • 7 comments

  • more relaxed but reliable regexp to detect blocks
  • support multiple blocks with same name
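
As a rough illustration of the two points above (not the code in this PR): a relaxed pattern with a lazy body pairs each opening macro with its nearest closing macro, so repeated blocks with the same name are found separately. `findBlocks` is a hypothetical helper, and the `${name}` / `${/name}` delimiters are assumed to be the defaults.

```php
<?php
// Hypothetical helper, not part of PHPWord: returns every ${name}...${/name}
// pair found in $xml, in document order.
function findBlocks(string $xml, string $name): array
{
    $quoted = preg_quote($name, '/');
    // Lazy (.*?) so each opening macro pairs with the nearest closing macro;
    // the "s" flag lets "." span line breaks inside the block body.
    $pattern = '/\$\{' . $quoted . '\}(.*?)\$\{\/' . $quoted . '\}/s';
    preg_match_all($pattern, $xml, $matches, PREG_OFFSET_CAPTURE);

    $blocks = [];
    foreach ($matches[1] as [$body, $offset]) {
        $blocks[] = ['body' => $body, 'offset' => $offset];
    }
    return $blocks;
}

// Two blocks sharing the name "item".
$xml = '<w:p>${item}</w:p><w:p>A</w:p><w:p>${/item}</w:p>'
     . '<w:p>${item}</w:p><w:p>B</w:p><w:p>${/item}</w:p>';
$blocks = findBlocks($xml, 'item');
```

Note that this simple pattern does not handle nested blocks with the same name, and it assumes the macros are not split across several `w:t` runs.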

Maybe fixes #2444, #2410. Related: #1836, #1354.

Checklist:

  • [x] I have run composer run-script check --timeout=0 and no errors were reported
  • [x] The new code is covered by unit tests (check build/coverage for coverage report)
  • [x] I have updated the documentation to describe the changes

glaszig avatar Aug 29 '23 02:08 glaszig

Coverage Status

coverage: 81.467% (-0.005%) from 81.472% when pulling 50e23b26ac37d1df23cd3cfcfa24c24c05d41106 on glaszig:template-processor-blocks into c17c4c7659cbf7fc8a36a39332e9e2deb5d47b59 on PHPOffice:master.

coveralls avatar Aug 31 '23 00:08 coveralls

@coveralls don't be that pedantic.

glaszig avatar Sep 03 '23 02:09 glaszig

Why don't you use the findContainingXmlBlockForMacro function?

In a way it implements the same logic as your regexp, but your regexp looks expensive for big files.

findContainingXmlBlockForMacro, by contrast, first uses findMacro to locate the macro directly, then searches from that position for the nearest opening w:p.
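
A minimal sketch of that idea, using plain string functions only. `findParagraphAround` is a hypothetical stand-in written for illustration; it is not the actual findContainingXmlBlockForMacro implementation and it ignores nested paragraphs (for example inside text boxes).

```php
<?php
// Hypothetical helper: returns ['start' => int, 'end' => int] for the <w:p>
// element that contains $macro, or null when nothing suitable is found.
function findParagraphAround(string $xml, string $macro, int $offset = 0): ?array
{
    // Step 1: locate the macro directly.
    $pos = strpos($xml, $macro, $offset);
    if ($pos === false) {
        return null;
    }

    // Step 2: walk backwards to the nearest paragraph opening. Matching
    // "<w:p>" and "<w:p " separately avoids false hits on "<w:pPr>".
    $before = substr($xml, 0, $pos);
    $candidates = array_filter(
        [strrpos($before, '<w:p>'), strrpos($before, '<w:p ')],
        fn($p) => $p !== false
    );
    if ($candidates === []) {
        return null;
    }
    $start = max($candidates);

    // Step 3: walk forwards to the paragraph's closing tag.
    $close = strpos($xml, '</w:p>', $pos);
    if ($close === false) {
        return null;
    }

    return ['start' => $start, 'end' => $close + strlen('</w:p>')];
}

$xml = '<w:body><w:p><w:pPr/><w:r><w:t>intro</w:t></w:r></w:p>'
     . '<w:p w:rsidR="1"><w:r><w:t>${block}</w:t></w:r></w:p></w:body>';
$where = findParagraphAround($xml, '${block}');
$slice = substr($xml, $where['start'], $where['end'] - $where['start']);
```

Each step is a single linear scan, which is the reason for preferring it over one large pattern.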

I have thought about this problem, and my conclusion is that not everything should be done by regexp if we want to keep complexity under control (otherwise the regexp would need lookahead or lookbehind features, whose complexity is far worse than O(n)).

I also customized this function a little, like so:

    /**
     * Clone a block.
     *
     * @param string $blockname
     * @param int $clones How many times the block should be cloned
     * @param bool $replace
     * @param string $separator 'pagebreak', 'linebreak', or any other value for no separator
     * @return array|false Result of findContainingXmlBlockForMacro
     */
    public function cloneBlock($blockname, $clones = 1, $replace = true, $separator = 'linebreak')
    {
        // Separator inserted between the cloned copies
        switch ($separator) {
            case 'pagebreak':
                $objectClass = 'PhpOffice\\PhpWord\\Element\\' . 'PageBreak';
                $xml_obj = new $objectClass();
                $xml_separator = $this->getXmlObject($xml_obj);
                break;
            case 'linebreak':
                $objectClass = 'PhpOffice\\PhpWord\\Element\\' . 'TextBreak';
                $xml_obj = new $objectClass();
                $xml_separator = $this->getXmlObject($xml_obj);
                break;
            default:
                $xml_separator = '';
                break;
        }

        // Get the block to be cloned and generate the cloned data
        $rawClonedData = '';
        $xmlBlock = null;
        $where = $this->findContainingXmlBlockForMacro($blockname, 'w:p', 'Block', 1, 'regexp');
        if (false !== $where) {
            $xmlBlock = $this->getSlice($where['start'], $where['end']);
            $cloned = array_fill(0, $clones, $xmlBlock);
            $rawClonedData = implode($xml_separator, $cloned);
            // Strip the opening and closing block macros from the clones
            $rawClonedData = str_replace(self::$macroBlockOpeningChars . $blockname . self::$macroBlockClosingChars, '', $rawClonedData);
            $rawClonedData = str_replace(self::$macroBlockOpeningChars . self::$close_block_char . $blockname . self::$macroBlockClosingChars, '', $rawClonedData);
        }

        // Insert the cloned data
        if ($replace && (strlen($rawClonedData) > 0)) {
            $this->replaceXmlBlock($blockname, $rawClonedData, 'w:p', 'Block', 1);
        }

        return $where;
    }

And then I customized findMacro to be able to compare different algorithms:

    protected function findMacro($search, $offset = 0, $macro_type = 'Block', $macro_index = 1, $search_algorithm = 'regexp')
    {
        $search = static::ensureMacroCompleted($search, $macro_type);

        if ('regexp' == $search_algorithm) {
            // Collect all case-insensitive matches after $offset, then pick the $macro_index-th one
            $pattern_macro = $this->regexp_delimiter . preg_quote($search, $this->regexp_delimiter) . $this->regexp_delimiter . 'i';
            preg_match_all($pattern_macro, substr($this->tempDocumentMainPart, $offset), $matches, PREG_OFFSET_CAPTURE);
            $pos = (($macro_index >= 1) && (count($matches[0]) >= $macro_index))
                ? ($offset + $matches[0][$macro_index - 1][1])
                : -1;
        } elseif ('recursive' == $search_algorithm) {
            // Start one macro length before $offset so the first strpos() call begins exactly at $offset
            $pos = ($macro_index >= 1) ? ($offset - 1 * strlen($search)) : false;
            for ($i = 1; $i <= $macro_index; $i++) {
                $pos = strpos($this->tempDocumentMainPart, $search, $pos + strlen($search));
                if ($pos === false) {
                    break;
                }
            }
        } else {
            // Plain first-occurrence search
            $pos = strpos($this->tempDocumentMainPart, $search, $offset);
        }

        return ($pos === false) ? -1 : $pos;
    }

Of course, all of this depends on fixBrokenMacros, which I also modified so that it merges all the w:t elements containing the macro without breaking styles and other data.

Sorry for the long post. In summary, breaking the problem down into functions seems to be a good way to organise the TemplateProcessor processing. What do you think?

thomasb88 avatar Sep 10 '23 09:09 thomasb88

Any update on this PR?

fotrino avatar Oct 10 '23 14:10 fotrino

BUMP

kovalovme avatar Oct 24 '23 10:10 kovalovme

This doesn't always work. It didn't work for me when the block was wrapped around a table.

chinmayshah24 avatar Feb 26 '24 19:02 chinmayshah24

Any update on this fix, please?

MrLexisDev avatar Apr 22 '24 14:04 MrLexisDev