PHPWord
List item values are missing when converting a docx file to HTML
/* Here is my code*/
$PHPWord = new \PhpOffice\PhpWord\PhpWord();
$PHPWordLoad = \PhpOffice\PhpWord\IOFactory::load($file);
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($PHPWordLoad, 'HTML');
$tmpfname = public_path('doczipfiles/temp.html');
$objWriter->save($tmpfname);
Bumping this. I encountered the same issue as well.
same here
I also faced the same issue and found that "ListItemRun.php" is missing from the path "src\PhpWord\Writer\HTML\Element", which is what causes it.
I added the file and made the changes. I now get the list values, but the bullet icons are missing. I am currently trying to fix that. In the meantime, if anyone wants the file, please let me know.
+1
@vineetagarwal1981 I could use the ListItemRun.php file if you could make it available, please. Just the data without the bullets is sufficient for my needs. Thanks.
@bozzit Below is the file that you require. Place this file at the PATH "src\PhpWord\Writer\HTML\Element"
Let me know if it's working for you
@vschavala If you just want to convert to HTML, I find it's better to use more specialized tools. This is my solution, and it works better than PHPWord: https://gist.github.com/lubobill1990/701df4becce20af43e9122a26dc52a05
The main purpose of PHPWord is to compose Word documents with PHP, not to convert between formats.
Hi, yes, thank you. If I have time I will attempt to make it output unordered lists instead of the text within
elements.
At least I'm not losing the text within the lists by adding this file.
Same here.
If I generate a Word document like here, I get this file structure:
- _rels
- theme
- document.xml
- fontTable.xml
- footnotes.xml
- numbering.xml
- settings.xml
- styles.xml
- webSettings.xml
With HTML-generated lists I get this:
- _rels
- theme
- endnotes.xml
- fontTable.xml
- footer1.xml
- header1.xml
- settings.xml
- styles.xml
- stylesWithEffects.xml
- webSettings.xml
So you can see there is no numbering.xml. Also, if you try to use LibreOffice to generate a PDF, all lists are empty.
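The comparison above can be automated. The following is a minimal sketch, assuming the standard docx layout in which the numbering definitions live at word/numbering.xml inside the zip archive, and assuming the PHP zip extension is available. The function name docxHasNumbering is made up for illustration and is not part of PHPWord.

```php
<?php
declare(strict_types=1);

/**
 * Report whether a .docx archive contains word/numbering.xml.
 * Hypothetical helper for illustration; not part of PHPWord.
 */
function docxHasNumbering(string $docxPath): bool
{
    $zip = new ZipArchive();
    // open() returns true on success and an integer error code otherwise.
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open {$docxPath} as a zip archive");
    }
    // locateName() returns the entry index, or false when the entry is absent.
    $found = $zip->locateName('word/numbering.xml') !== false;
    $zip->close();

    return $found;
}
```

Calling docxHasNumbering() on a generated file should show quickly whether the list definitions were written at all, before opening the document in Word or LibreOffice.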
@kristl78 https://github.com/PHPOffice/PHPWord/issues/1462#issuecomment-438691752 I'm using this solution and it works well. Hope it can help.
@lubobill1990 Thank you, but unfortunately this is not enough.
I've used the solution from @vineetagarwal1981 but modified it a bit. The list items were not parsed as li tags, which I need for my project.
public function write()
{
    if (!$this->element instanceof \PhpOffice\PhpWord\Element\ListItemRun) {
        return '';
    }
    $content = '<li>';
    $content .= $this->element->getElement(0)->getText();
    $content .= '</li>';
    return $content;
}
I just installed PHPWord using Composer (6/8/2020) and I also see the above problem (loss of list text) when attempting to convert a .docx to .html.
The version of PHPWord I installed did have the ListItemRun.php file mentioned above in the proper directory. However, I still had the error.
I also attempted to copy the ListItemRun.php file provided by @vineetagarwal1981 above into the Element directory, overwriting the installed copy of ListItemRun.php, and that generated several exceptions. Therefore I backed that change out.
Has there been any resolution on how to convert a .docx list to HTML without losing the text?
Hi, could you provide a few more specifics? For example, in which PHP file did you place this code?
Hello @PhoenixRising2015! I have the same problem. Did you find a solution?
@Hector1567XD
Just create a file called "ListItemRun.php" in the path "src\PhpWord\Writer\HTML\Element" with that code in it, or look further up in this thread: there is a link to a ZIP file with the "ListItemRun.php" in it.
Same here: I have the following list with numbers in docx file:
1. a
2. b
3. c
After converting to HTML file:
a
b
c
For anyone having this issue, the solution by @tikumo works:
public function write()
{
    if (!$this->element instanceof \PhpOffice\PhpWord\Element\ListItemRun) {
        return '';
    }
    $content = '';
    $content .= '<ul><li>';
    $content .= $this->element->getElement(0)->getText();
    $content .= '</li></ul>';
    $content .= "\n";
    return $content;
}
Replace the write() function in src\PhpWord\Writer\HTML\Element\ListItemRun.php with the code above and it will turn every ListItemRun into an li element. As far as I know there is no way to create the parent ul for the list, so as a temporary workaround I modified the function to make every list item a separate list. If anyone has a solution for generating the ul elements, please let me know.
What I ended up doing is: I modified phpoffice/phpword/vendor/phpoffice/phpword/src/PhpWord/Writer/HTML/Element/ListItemRun.php
protected function writeOpening()
{
$content = sprintf('<li data-depth="%s" data-liststyle="%s" data-numId="%s">', $this->element->getDepth(),
$this->element->getListFormat($this->element->getDepth()),
$this->element->getListId());
return $content;
}
Then I created my own writer that extends AbstractWriter:
class MyHtmlWriter extends AbstractWriter implements WriterInterface
{
.
.
.
/**
* Get content
*
* @return string
*/
public function getContent()
{
$content = $this->getWriterPart('Body')->write();
$lines = explode(PHP_EOL, $content);
$newcontent = '';
foreach ($lines as $line)
{
if (preg_match('/( |^)<li data-depth/', $line))
{
/* Use data-depth, data-liststyle and data-numId to add
 * <ul></ul> / <ol></ol> where needed.
 */
}
else
{
$newcontent .= $line;
}
}
$content = $newcontent;
.
.
.
return $content;
}
Hope this points you, @Lurtz963, in the right direction.
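The placeholder comment in getContent() leaves the wrapping step open. Below is a minimal sketch of one way to fill it in. It assumes that each list item sits on its own line in exactly the format produced by writeOpening() above and is closed by </li> on the same line, that a data-liststyle of "bullet" (or an empty value) means an unordered list and anything else means an ordered one, and that data-numId can be ignored. The function name wrapListItems is hypothetical and not part of PHPWord.

```php
<?php
declare(strict_types=1);

/**
 * Wrap flat <li data-depth="..." data-liststyle="..."> lines in nested
 * <ul>/<ol> elements. Hypothetical helper; not part of PHPWord.
 */
function wrapListItems(string $html): string
{
    $pattern = '/^\s*<li data-depth="(\d+)" data-liststyle="([^"]*)"[^>]*>(.*)<\/li>\s*$/';
    $open = []; // stack of open list tags; each open list also has one open <li>
    $out = [];

    // Close the innermost lists (and their open <li>) until $keep levels remain.
    $closeTo = function (int $keep) use (&$open, &$out): void {
        while (count($open) > $keep) {
            $out[] = '</li></' . array_pop($open) . '>';
        }
    };

    foreach (preg_split('/\R/', $html) as $line) {
        if (!preg_match($pattern, $line, $m)) {
            $closeTo(0); // any non-list line ends all open lists
            if ($line !== '') {
                $out[] = $line;
            }
            continue;
        }

        $depth = (int) $m[1];
        // Assumption: "bullet" or an empty style means <ul>, anything else <ol>.
        $tag = ($m[2] === 'bullet' || $m[2] === '') ? 'ul' : 'ol';

        $closeTo($depth + 1);
        if (count($open) === $depth + 1) {
            if (end($open) === $tag) {
                $out[] = '</li>';   // close the previous sibling item
            } else {
                $closeTo($depth);   // list type changed: start a new list
            }
        }
        while (count($open) < $depth + 1) {
            $open[] = $tag;
            $out[] = '<' . $tag . '>';
            if (count($open) < $depth + 1) {
                $out[] = '<li>';    // empty item for a skipped depth level
            }
        }
        $out[] = '<li>' . $m[3];    // left open so a nested list can go inside
    }
    $closeTo(0);

    return implode("\n", $out);
}
```

Inside a custom getContent(), the body HTML could be passed through this function instead of being looped over by hand. Nested lists are placed inside the parent li so the result stays valid HTML.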
I tried this solution, but data-depth is always 0 and the rest of the attributes are empty.
My bad, I forgot that I had to implement some of those methods for the other attributes. And 0 is normal for depth if you don't have nested lists: the top-level list is always 0.
index 6e48a69..ed83162 100644
--- a/3rdparty/phpoffice/phpword/vendor/phpoffice/phpword/src/PhpWord/Element/ListItemRun.php
+++ b/3rdparty/phpoffice/phpword/vendor/phpoffice/phpword/src/PhpWord/Element/ListItemRun.php
@@ -73,6 +73,24 @@ class ListItemRun extends TextRun
return $this->style;
}
+ public function getListFormat($depth)
+ {
+ if (isset($this->style->bulletListType[$depth]->format))
+ {
+ return $this->style->bulletListType[$depth]->format;
+ }
+ else
+ {
+ return 'bullet';
+ }
+
+ }
+
+ public function getListId()
+ {
+ return $this->style->numId;
+ }
+
After a bit of a struggle I was able to implement a similar solution, @bozzit. For some reason I couldn't use a custom writer (it throws an error saying it is not a valid writer), so I modified the HTML writer instead. I'm leaving the files here in case someone wants to use them or make a better version. ListItemRun.php goes in phpoffice/phpword/src/PhpWord/Writer/HTML/Element and HTML.php goes in phpoffice/phpword/src/PhpWord/Writer.
Thanks! Your code helped me)) I just added a loop to the function write in ListItemRun.php
public function write()
{
if (!$this->element instanceof \PhpOffice\PhpWord\Element\ListItemRun) {
return '';
}
$content = '';
$content .= sprintf('<li data-depth="%s" data-liststyle="%s" data-numId="%s">', $this->element->getDepth(),
$this->getListFormat($this->element->getDepth()),$this->getListId());
$size_content = $this->element->countElements();
for ($i=0; $i < $size_content; $i++){
$content .= $this->element->getElement($i)->getText();
}
$content .= '</li>';
$content .= "\n";
return $content;
}
It's been 6 years and ListItemRun.php is still not implemented. Pretty crazy.
In any case, I took @CaptBarbarossa's code and extended it to handle all types of elements in the li, since there is no guarantee that a li only contains text:
<?php
namespace PhpOffice\PhpWord\Writer\HTML\Element;
/**
* ListItemRun element HTML writer
*
* @since 0.10.0
*/
class ListItemRun extends TextRun
{
public function write()
{
if (!$this->element instanceof \PhpOffice\PhpWord\Element\ListItemRun) {
return '';
}
$content = '';
$content .= sprintf(
'<li data-depth="%s" data-liststyle="%s" data-numId="%s">',
$this->element->getDepth(),
$this->getListFormat($this->element->getDepth()),
$this->getListId()
);
$namespace = 'PhpOffice\\PhpWord\\Writer\\HTML\\Element';
$container = $this->element;
$elements = $container->getElements();
foreach ($elements as $element) {
$elementClass = get_class($element);
$writerClass = str_replace('PhpOffice\\PhpWord\\Element', $namespace, $elementClass);
if (class_exists($writerClass)) {
/** @var \PhpOffice\PhpWord\Writer\HTML\Element\AbstractElement $writer Type hint */
$writer = new $writerClass($this->parentWriter, $element, true);
$content .= $writer->write();
}
}
$content .= '</li>';
$content .= "\n";
return $content;
}
public function getListFormat($depth)
{
return $this->element->getStyle()->getNumStyle();
}
public function getListId()
{
return $this->element->getStyle()->getNumId();
}
}
The true as the last argument to new $writerClass($this->parentWriter, $element, true); prevents text from being wrapped in <p> tags so that everything inside the li is displayed inline.
If you're installing this package with Composer (like I am), you can use the post-install-cmd hook in your composer.json file to copy this file into ./vendor/phpoffice/phpword/src/PhpWord/Writer/HTML/Element/ListItemRun.php every time the package is installed.
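The copy step could be done by a small PHP script that the hook runs. This is only a sketch: the function name installPatchedWriter and the idea of keeping the patched file somewhere in your own project are assumptions for illustration, and the only path taken from this thread is the target inside the vendor directory.

```php
<?php
declare(strict_types=1);

/**
 * Copy a patched ListItemRun.php over the installed PHPWord copy.
 * Hypothetical helper for illustration; not part of PHPWord or Composer.
 *
 * @param string $source    path to the patched file kept in your project
 * @param string $vendorDir path to Composer's vendor directory
 */
function installPatchedWriter(string $source, string $vendorDir): bool
{
    $target = $vendorDir
        . '/phpoffice/phpword/src/PhpWord/Writer/HTML/Element/ListItemRun.php';

    // Do nothing if the patch is missing or PHPWord is not installed.
    if (!is_file($source) || !is_dir(dirname($target))) {
        return false;
    }

    return copy($source, $target);
}

// Example call (both paths are assumptions about your project layout):
// installPatchedWriter(__DIR__ . '/patches/ListItemRun.php', __DIR__ . '/vendor');
```

If the script is saved as, say, scripts/patch-phpword.php with the example call enabled, a post-install-cmd entry such as "@php scripts/patch-phpword.php" in the scripts section of composer.json should run it after each install. Adding the same entry under post-update-cmd would cover composer update as well.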