Document .dat specification

Open heinezen opened this issue 3 years ago • 7 comments

Required skills: Reading code

Difficulty: Easy

The Age of Empires .dat format is AoE's messy way of storing game data and unit stats. There is no official specification, which makes writing and testing parsers very difficult. The format also changes with every game or expansion (and even with some DE2 updates). It is not a great situation overall.

We should therefore create an unofficial specification of the several .dat file versions that are floating around, with added descriptions and comments where available. Our converter code already includes descriptions for some of the entries. More information about the differences between the versions can be found in the genieutils repository.

The specification document could be created as a table that looks roughly like this:

| Entry name | Data type | Description |
| --- | --- | --- |
| hp | int16 | Hit points |
| line_of_sight | float | Line of sight |

Extra: If we want to go further, we could also add a nice HTML doc generated from the markdown. The agescx format doc is a good example of how this could look.

heinezen avatar Aug 23 '20 16:08 heinezen

fwiw, I ran doxygen on genieutils and uploaded the generated API doc to my server, in case it's helpful. Not everything is as well-documented as I'd like, but in case someone wants to avoid reading messy C++:

I'll link directly to the Unit class because that actually has some documentation (and you can click around for the rest): https://iskrembilen.com/genieutils-api/classgenie_1_1Unit.html

sandsmark avatar Aug 29 '20 18:08 sandsmark

@sandsmark Thanks :) Although for me it's mostly the rapidly increasing number of .dat versions that makes reading the genieutils C++ code and openage converter Python code confusing. I think there are over 20 different file versions nowadays.

Also, I didn't remember at the time I wrote the issue description, but the agescx format doc is the best shot at a format doc so far. It would be a good example for what we should aim for. The only thing that we would need to add to it is a format version selector, so one can look at the structure of one specific version. Edit: I've added the suggestion to the issue description.
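
A rough sketch of how such a version selector could work on the data side, assuming each documented entry records the range of format versions it appears in. `VersionedEntry`, `select_version`, the entry names and the version numbers are all hypothetical placeholders, not real .dat fields or versions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionedEntry:
    """A documented entry plus the range of format versions containing it.

    Hypothetical structure for illustration; versions are placeholder integers.
    """
    name: str
    data_type: str
    since: int                   # first format version that contains the entry
    until: Optional[int] = None  # last version containing it; None means "still present"


def select_version(entries, version):
    """Return the entries that exist in the given format version, in order."""
    return [
        e for e in entries
        if e.since <= version and (e.until is None or version <= e.until)
    ]


# Placeholder entries and version numbers, not real .dat fields.
ENTRIES = [
    VersionedEntry("entry_a", "int16", since=1),
    VersionedEntry("entry_b", "float", since=1, until=2),
    VersionedEntry("entry_c", "int32", since=3),
]

if __name__ == "__main__":
    for v in (1, 2, 3):
        print(v, [e.name for e in select_version(ENTRIES, v)])
```

Keeping the version range on each entry means one source list could produce the page for every version, instead of maintaining a separate document per version.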

heinezen avatar Sep 02 '20 01:09 heinezen

> I think there are over 20 different file versions nowadays.

Depends on how you count, I guess, but this is the (now probably outdated) list of versions genieutils supports (in some form) at least: https://github.com/sandsmark/genieutils/blob/master/src/dat/DatFile.cpp#L80-L118

> Also, I didn't remember at the time I wrote the issue description, but the agescx format doc is the best shot at a format doc so far. It would be a good example for what we should aim for. The only thing that we would need to add to it is a format version selector, so one can look at the structure of one specific version. Edit: I've added the suggestion to the issue description.

That looks pretty nice, actually, but a format selector would be nice indeed (even more so for the scenario files, tbh., they're a complete mess).

sandsmark avatar Sep 04 '20 17:09 sandsmark

Hello! I have stumbled upon this issue and really want to help, however I don't understand how to read .dat files. I have cloned and compiled genieutils, but the problem is that I don't know what to do with the .dll file and how to use it to convert .dat into readable text.

Thank you very much in advance!

svyshlov avatar Nov 17 '20 01:11 svyshlov

@svyshlov Hey! :)

genieutils can only read and write in the .dat format and cannot output readable text. I linked it because the source code contains a lot of descriptions and version checks. Sorry for the confusion.

The best way (I think) to get a readable format from the .dat is to call the get_data_format() methods in the openage converter.

They output human-readable function names and data types, among other things. One could, for example, write a custom Python script that calls the methods and generates Markdown output from that. The descriptions have to be added by hand, although a lot can be found in the code comments.
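
A minimal sketch of the Markdown-generating half of such a script. It does not call the converter; it assumes the member list has already been collected into simple (name, data type, description) records. `Member` and `members_to_markdown` are hypothetical names, and the exact shape of what get_data_format() returns is not assumed here:

```python
from typing import NamedTuple


class Member(NamedTuple):
    """One entry of a .dat struct (hypothetical shape, not the converter's real type)."""
    name: str
    data_type: str
    description: str = ""


def members_to_markdown(members):
    """Render members as a Markdown table with the columns proposed in the issue."""
    lines = [
        "| Entry name | Data type | Description |",
        "| --- | --- | --- |",
    ]
    for m in members:
        # Escape pipes so a description cannot break the table layout.
        desc = m.description.replace("|", "\\|")
        lines.append(f"| `{m.name}` | `{m.data_type}` | {desc} |")
    return "\n".join(lines)


# Sample data taken from the example table in the issue description.
SAMPLE_MEMBERS = [
    Member("hp", "int16", "Hit points"),
    Member("line_of_sight", "float", "Line of sight"),
]

if __name__ == "__main__":
    print(members_to_markdown(SAMPLE_MEMBERS))
```

The remaining work would be a small adapter that maps whatever the converter methods return onto these records, with the hand-written descriptions merged in from a separate file.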

heinezen avatar Nov 17 '20 21:11 heinezen

@sandsmark Hello! :) Thanks for the reply. Do I need to compile the whole of openage to use the converter methods? I have tried to run openage as a whole, but got a 'UnionPath' object has no attribute 'cfg_dir' error.

If I only use the converter, I get:

bin/run convert --source-dir /media/sf_AoE2DE/ --force
INFO [py] Game edition detected:
INFO [py]  * Age of Empires 2: Definitive Edition
INFO [py] converting metadata
INFO [py] [0] palette
INFO [py] [1] empires.dat
INFO [py] using cached wrapper: /tmp/empires2_x2_p1.dat.pickle
INFO [py] Starting conversion...
INFO [py] Extracting Genie data...
INFO [py] Creating API-like objects...
INFO [py] Linking API-like objects...
INFO [py] Generating auxiliary objects...
INFO [py] Creating nyan objects...
Traceback (most recent call last):
  File "run.py", line 15, in init run
    main()
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/__main__.py", line 132, in main
    return args.entrypoint(args, cli.error)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/main.py", line 195, in main
    if not convert_assets(outdir, args, srcdir):
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/main.py", line 77, in convert_assets
    for current_item in convert(args):
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/tool/driver.py", line 45, in convert
    yield from convert_metadata(args)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/tool/driver.py", line 90, in convert_metadata
    modpacks = args.converter.convert(gamespec,
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/processor.py", line 55, in convert
    modpacks = cls._post_processor(data_set)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/processor.py", line 145, in _post_processor
    DE2NyanSubprocessor.convert(full_data_set)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/nyan_subprocessor.py", line 38, in convert
    cls._process_game_entities(gamedata)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/nyan_subprocessor.py", line 120, in _process_game_entities
    cls.tech_group_to_tech(tech_group)
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/nyan_subprocessor.py", line 596, in tech_group_to_tech
    patches.extend(DE2TechSubprocessor.get_patches(tech_group))
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/tech_subprocessor.py", line 147, in get_patches
    patches.extend(cls.resource_modify_effect(converter_group,
  File "/home/alexander/git/openage/openage/.bin/g++-debug-Oauto-sanitize-none/openage/convert/processor/conversion/de2/tech_subprocessor.py", line 276, in resource_modify_effect
    upgrade_func = DE2TechSubprocessor.upgrade_resource_funcs[resource_id]
KeyError: 208

I run Linux in a virtual machine, with the AoE files installed on Windows and mounted in Linux as an external drive. Maybe that is causing the issues?

P.S. I hope that I am at least going in the right direction :)

svyshlov avatar Nov 21 '20 02:11 svyshlov

@svyshlov You don't need to compile the openage converter for this issue (only a few of the media parsers use Cython for extra speed). Also, the error you encounter is probably caused by the fact that AoE2:DE just got an update that I haven't looked into yet :D However, you don't have to worry about this since the .dat parser runs through without errors (see INFO [py] [1] empires.dat).

You are certainly going in the right direction. The easiest way to approach this is to write a Python script that includes the .dat parsers as modules. This script doesn't need to be part of the converter (yet). Then you can export the member list from the .dat structs and find a way to get them into a documentation format :)

heinezen avatar Nov 21 '20 12:11 heinezen