Object orientation
Wrapping ophir in a class and giving it a namespace would fit more modern workflows.
I am bringing this up as a ticket, because it will change how Ophir works. The code in the readme would then look something like this:
use \lovasoa\Ophir
$ophir = new Ophir();
$ophir->setFootnote(0);
$ophir->setAnnotation(0);
$ophir->setList(1);
$ophir->setLink(1);
echo $ophir->odt2html("/path/to/file.odt");
This would make handling Ophir a lot easier, since it is a object and not a loose Bunch of functions. It is a rather big step to take but i think it is worth it (plus you get super cool Autoloading by Composer)
In my mind it only brings advantages (let alone breaking things, that rely on the old version)
Yes, it's a very good idea! Breaking things is not that terrible, since we are the only two known users of ophir.php ;)
Well, lets do it then. I'll create a dev branch to develop on, also i'll tag the current version as 0.1, so we can tag the object oriented one as 1.0 (to be sematically correct with the Version numbers)
And I think it would be even more readable and easy to extend if we had something like:
$ophir->setConfig(Ophir::FOOTNOTE, Ophir::DISCARD);
$ophir->setConfig(Ophir::LINK, Ophir::IMPORT_TEXT);
Although not committing, this project has been in my mind a lot lately. In particular i thought about having tag objects. It would add a lot more structure to everything. We could have a tag object and objects for images, links, bold ... etc. that inherit from the tag object. And we have a Tag factory producing those tags.
(content here really should be something like content (String,Tag)[], but i don't think UML has the notion of multiple types)
I don't know if this is overkill for Ophir but it would be a lot nicer than the large main method we currently have
Johannes, I think this is a very clever idea. It would make the code much more readable and structured, and thus easier to maintain.
It would also make support for styles easier to add.
But we would need to see if we really have to create our own tag class, or if we can use PHP DOM support.
Do you think this would fit our needs: http://il1.php.net/manual/en/class.domnode.php ?
I think that would be a great base. Although it has everything we need i would extend from that class to multiple other classes like Link, Bold, Italic, etc.
It is not needed per se, but it will make everything a lot easier in the future
@lovasoa I decided to try using a DOMDocument Structure (DOMDocument with DOMNodes). It works well so far. Maybe you could have a look at the code on the dev-experimental branch?
Your work is very interesting, and cleans up the code, which is a good thing.
However, I'm a bit concerned about the memory impact of such a change. The input file is parsed using xmlreader, which means it doesn't have to fit in memory. And using a domdocument for storing the output means that we have to store the full output in memory. The current implementation uses a string, which is also stored in memory, but I think it would be more interesting to have a way to stream the output, than to create a domdocument. The library users could then connect the output stream to an html parser if they want a domdocument.
Reading our previous discussion, I realize that I am the one who proposed to use DOMNodes!
But I can't remember why I said it would make it easier to add support for styles...