OpenGothic icon indicating copy to clipboard operation
OpenGothic copied to clipboard

Replacing ZenLib with lmichaelis/phoenix

Open lmichaelis opened this issue 3 years ago • 8 comments

This is a followup to #267.

The Premise

ZenLib is quite an old project with a lot of broken code, unused features, inconsistent naming and legacy programming patterns. Therefore it might be a good idea to migrate to something more modern. phoenix (written and maintained by myself) is intended to be a rewrite of ZenLib which tries to address these issues. It is currently about 90% completed but has not been fully tested in the real world, though it has a test suite which covers a large part of the codebase.

Switching to phoenix would have at least some value for OpenGothic it seems, however there are some issues which I have laid out below. Before I start looking into migrating, you should consider the advantages and drawbacks I've listed below to make an educated decision as to whether you would like to use phoenix or not.

Improvements over ZenLib

In addition to the test suite, phoenix properly vets all data being parsed and is able to report any inconsistencies it finds with the file formats. This greatly increases our ability to detect and properly implement unknown features should any inconsistencies be found. Similarly, phoenix adds extra safety measures for Daedalus scripts, intended to mitigate any memory corruption or script interface issues. These include, for example, a validator for external function registrations as well as rigorous checks on memory writes from Daedalus into C++-classes.

General Architecture

phoenix uses its own implementation of the VDF file format, removing the need to ship PhysicsFS. mio is used to map files being loaded into memory, greatly improving memory usage. The main C++-class used to read from binary files is the buffer. Instead of converting raw bytes to full structures, primitive types are read one-by-one from the binary file which allows for easily supporting other platforms in the future.

Parsing ZEN archives is done using the archive class. It provides a uniform, type-checked interface for reading all three kinds of archives (ASCII, BINARY and BIN_SAFE).

All structures parsed by phoenix are contained within their own classes and files. The parser for MSH files for example, can be found in mesh.hh while the parser for materials can be found in material.hh. All structures which can be parsed from a file have a static parse(buffer&) or parse(std::unique_ptr<archive>&) function which will parse that structure from the given input.

Current Issues with phoenix

~~phoenix currently does not have the ability to parse MDS and MSB files (as stated in the readme). I will add support for these formats in the coming weeks.~~

phoenix has also not really been tested in the real world yet so there might be serious issues with usability and bugs. I can not really comment on the usability part yet since I don't fully understand the use-cases for these structures in OpenGothic. It would be ideal if you could comment on what annoys you about ZenLib and which features (if any) you would like to see implemented.

phoenix does not have proper documentation at the moment. To be fair, neither has ZenLib, but documentation is an important part of any software. While some of the more complicated API have been documented, a general overview and guide to the library does not exist.

Known Incompatibilites

Since phoenix is a complete rewrite of ZenLib, it has vastly different ideas on the naming scheme and how to expose data to the user. Switching from ZenLib would require a lot of rewriting of OpenGothic code to adaped to the new class names and access patterns. It might be helpful to look into the tools section of phoenix to get an understanding as to how it is used in basic applications (pxmdl converts various model types to .obj files, pxtex converts .TEX files to TGA, pxinfo can dump information about various files, pxscrdump can print out details about Daedalus scripts [I'm even working on an actual decompiler :>] and pxvdfs can extract VDF files).

phoenix also uses glm which conflicts with Tempest's vector and matrix types. I'm not sure how we would solve this

phoenix uses ~~at least one~~ some C++20 features like std::span and requires. This leads to phoenix not properly compiling with C++17 (which you are using). A solution would be to update to C++20.

Current Status

I'm, working on this at lmichaelis/OpenGothic:zenlib-migration. Here's a rough outline of what needs to be done:

  • [x] Make all file paths in phoenix use std::filesystem for platform independent file access
  • [x] Migrate from ZenLib's FileIndex to phoenix's vdf_file
  • [x] Replace ZenLib's zC*-classes to their *phoenix` counterparts
    • [x] zCMesh -> mesh
    • [x] zCFont -> font
    • [x] zCMeshSoftSkin -> softskin_mesh
    • [x] zCCSLib -> messages
    • [x] zCMaterial -> material
    • [x] zCModelAni animation
    • [x] zCProgMeshProto -> proto_mesh
    • [x] zCVob -> vob_tree and vob::*
    • [x] ztex2dds -> texture (=> requires changes in phoenix due to DXT)
    • [x] zCMorphMesh -> morph_mesh
    • [x] MdsParserTxt -> model_script
    • [x] MdsParserBin -> model_script
  • [x] Migrate from ZenLib's DATFile and DaedalusVM to phoenix's script (subject to be renamed to daedalus_script) and daedalus_interpreter
    • [x] Fixup order of execution ambiguity in exec()
    • [x] Add capability to override Daedalus functions

lmichaelis avatar May 27 '22 20:05 lmichaelis

phoenix uses at least one C++20 feature, std::span. This leads to phoenix not properly compiling with C++17 (which you are using). A solution would be to update to C++20.

Hmm, can't say that I like the idea of bumping compiler version only for the sake of std::span. How difficult it would be to remove span from dependencies?

Note about zCMesh/zCMeshSoftSkin - Packed* versions of those classes are no longer in use since b642641b - so those can be dropped completely.

daedalus_interpreter

daedalus_vm is probably still OK name.

validator for external function registrations

There is a Ikarus/LeGo ticket #231 - basically we need a way to override user-defined script functions (to trap MEM_Write*) and some way to handle out-of-bounds access. Something, like callback would be great.

phoenix also uses glm which conflicts with Tempest's vector and matrix types. I'm not sure how we would solve this

Should just work. At this point in project, there is Tempest + Bullet + zMath libraries duplicating each-other. One more, or one less - all the same.

Nitpick: https://github.com/lmichaelis/phoenix/blob/c1bea35a6f9fc2d65571052d7cdbc4c71360dcf2/source/daedalus/interpreter.cc#L56

push_int(pop_int() - pop_int());

Will it work correct? Strictly speaking both calls of pop_int are in same sequence point and can be called in any order.

Try avatar May 29 '22 18:05 Try

There is a Ikarus/LeGo ticket https://github.com/Try/OpenGothic/issues/231 - basically we need a way to override user-defined script functions (to trap MEM_Write*) and some way to handle out-of-bounds access. Something, like callback would be great.

Hm I see. This is their repo, right?. Currently, overwriting functions is not supported but I think I can make it work. I mean the only thing we'd have to do is remove the original function, say MEM_WriteInt, from the symbol table and replace it with an external function definition which we can then register a callback for. I can even make it check that the callback being set has the proper arguments.

This should take care of reading/writing memory too, right? Unless these mods actually try to write into memory of the original interpreter or engine we should just be able to provide a large vector<uint8_t> for the scripts to write to.

Now if they actually overwrite memory used for anything else (like altering the symbol table from within the script) it'll be a lot harder to deal with but there might be a way.

Nitpick: Will it work correct? Strictly speaking both calls of pop_int are in same sequence point and can be called in any order.

I've not had any issues with it during my limited testing but you have a point. As they say, if you make use of undefined behavior, the compiler is allowed to burn your house down :D

Hmm, can't say that I like the idea of bumping compiler version only for the sake of std::span. How difficult it would be to remove span from dependencies?

It should be pretty straightforward. I was kinda hesitant because of the performance implications of using a std::vector instead but now I realize that I convert the span into a vector regardless :)


Should just work. At this point in project, there is Tempest + Bullet + zMath libraries duplicating each-other. One more, or one less - all the same.

Yeah it should be fine for now. It might be a good idea to use the same library though since introducing fragmentation leads to weird issues.

That being said, I'm making good progress. I've been able to replace a lot of the ZenLib usage and have it compile. It's actually able to launch but loading into a new game makes it crash so I'll have to investigate that. Looks promising tho :>

lmichaelis avatar May 29 '22 19:05 lmichaelis

phoenix uses at least one C++20 feature, std::span. This leads to phoenix not properly compiling with C++17 (which you are using). A solution would be to update to C++20.

While checking to downgrade I realized that I'm using part of the new concepts library for the VM implementation (example). While it is possible to downgrade that as well (using the old-style template <typename T, typename = std::enable_if<...>::value> syntax), I'd like to keep it as-is for readability's sake.

@Try If you really don't want to upgrade to C++20, I will make it work, but since basically all compilers support it and no major API changes have been introduced there is probably no downside to upgrading.

lmichaelis avatar May 30 '22 05:05 lmichaelis

@lmichaelis ok, if this is major part of VM - than it's fine. Basically I'm lazy on doing followup work to C++ version bump (like rewriting stuff, like (T*,size) to std::span :D

Btw, how your binding framework will work for npc's and items? At this point OpenGothic pops ID and then searches for it on it's own. Will it be same or npc will be passed as c_npc ?

Some more nitpicks:

void override_function(const std::string& name, const T& cb)

This can be good opportunity to use std::string_view, so string literals wont be promoted to heap memory.

Another performance-like stuff is Daedalus strings. In OpenGothic at one point I has to update them to CoW strings. That allowed to store them in AiQueue as-is, without copying.

Since it's C++20, why use fmt library, but not std::format?

Try avatar May 30 '22 19:05 Try

Basically I'm lazy on doing followup work to C++ version bump (like rewriting stuff, like (T*,size) to std::span :D

Well you don't have to change that .. It's not like it becomes illegal in C++20 to use raw pointers :D

But I was gonna propose some changes to OpenGothic anyways since I've found a whole bunch of issues while working on this migration (I'll open a PR for those tho) ;)

Btw, how your binding framework will work for npc's and items? At this point OpenGothic pops ID and then searches for it on it's own. Will it be same or npc will be passed as c_npc ?

So right now, instances work like this: You write a class like C_NPC which inherits from instance (this is important for memory-safety). This class then contains all fields of its Daedalus counterpart which have be registered to the symbols in the script using script::register_member. Then, when you want to initialize any given instance, you call daedalus_interpreter::init_instance which will then in turn call the appropriate script function and give you a std::shared_ptr<C_NPC> back, saving a copy with the script symbol. You then deal with the instance however you need to (like saving it in a std::unordered_map).

When returning from an external, you then just return that shared_ptr again and the VM will know what to do.

This can be good opportunity to use std::string_view, so string literals wont be promoted to heap memory.

Generally I use std::string_view wherever possible. In this case it isn't because the underlying script::find_symbol_by_name needs a std::string to index into a std::unordered_map<std::string, symbol*>. In this case, I would have to convert the string_view back into a string so passing it directly as a string actually avoids unnecessary copies.

I've seen this pattern a lot in OpenGothic too. You'd pass a std::string_view but then you would immediately convert it back into a std::string. That is actually less performant since it has to allocate a new std::string every time you do that so you end up throwing away a perfectly good std::string which you could have gotten from the caller most of the time.

Since it's C++20, why use fmt library, but not std::format?

I would like to use std::format but when I started writing phoenix, it was actually not supported by GCC or Clang. I don't know how the situation is by now but if it has changed, I'd happily switch :>

lmichaelis avatar May 30 '22 20:05 lmichaelis

script::find_symbol_by_name needs a std::string to index into a std::unordered_map<std::string, symbol*>

This was actually fixed in C++20: now there is 2 new overload to ::find, so it can take else type, unrelated to Key. This should allow to avoid unnecessary string_view to string casts.

I've seen this pattern a lot in OpenGothic too.

Yes, that was in resources.cpp due to C++17 limitation with unordered_map. If there are else cases - let me know. string_view to string cast is generaly not intended and should be fixed.

I would like to use std::format but when I started writing phoenix, it was actually not supported by GCC or Clang.

Can you double check it please? If there are compiler issues with C++20 std library, that can be a blocker. Otherwise - can't wait to drop std:snprintf's :D

Try avatar May 31 '22 16:05 Try

Can you double check it please? If there are compiler issues with C++20 std library, that can be a blocker. Otherwise - can't wait to drop std:snprintf's :D

Hm so it looks like libc++ (Clang) does support it since version 14 but libstdc++ (GCC) still doesn't. They're taking a really long time to implement it :| Should be fine though. As soon as it arrives in libstdc++ I will get rid of fmtlib.

Until then, you could just use fmtlib tho ;)

This was actually fixed in C++20: now there is 2 new overload to ::find, so it can take else type, unrelated to Key. This should allow to avoid unnecessary string_view to string casts.

Ooo I didn't know this! I shall fix my unordered_map accesses then!

lmichaelis avatar May 31 '22 17:05 lmichaelis

Heyo @Try, I've opened a PR since I'm done moving everything except the VM over to phoenix. I need help to weed out some bugs though (see the PR). I'm hoping you'll have an easier time finding the cause since you're more familiar with the codebase :>

lmichaelis avatar Jun 04 '22 15:06 lmichaelis