Kuriimu
Media.Vision Archive (MVGL)
I'm not going to specify the game as this format is common in Media.Vision games:
- Wild Arms
- Chaos Rings
- Digimon Story
Format extension(s): .(platform).MVGL, where (platform) can be psp2, steam or nx64 (Switch)
Type: archive
First 8 bytes of the file(s): 0c92 8d60 bafc 6a9c - ...`..j. (As the files are compressed, encoded or encrypted, this will probably not help at all)
More details: Fortunately, the Vita format is already known: its compression appears to be an implementation of DOBOZ. I know most of us don't like this, but you can find a BMS script here: https://forum.xentax.com/viewtopic.php?f=10&t=12666 This will help a lot with the PSP2 variant, but the real challenge is the compression/encoding of the Steam variant, so be careful with that one. I have some clues about it, but they may just be CRIWARE SDK holdovers in the executable binary that I mistook for clues. I'll continue my investigation and take a look at the Switch variant.
Sample files (if possible):
- https://mega.nz/#!NqxQQKgS!VJPkcQ3adBB1jUmgTtoP8r9C3TuVFMMNlyl7YWYenqQ (PC - encrypted)
- https://mega.nz/#!siwH2ALa!TN98qL2E3jrG8cWXOU3zuEzCYpi934YfeSXIjPMLdzM (Switch - decrypted)
My research uncovered that the Switch version uses the same format as the Vita version. This makes everything easier, as we can also compare the obfuscated PC version with the unobfuscated Switch version. Why do I say obfuscated instead of compressed? Because both files have the same size, so I don't think it's another layer of compression; instead, they tried to obfuscate the files. Or encrypt them... The Switch version of the file is attached to the issue.
UPDATED
[Figure: histograms of the encrypted and decrypted files]
~~High probability of XOR encryption found in the research, actually looking at that.~~ It's probably an in-house encryption scheme. The Steam version needs research into the executable; in the meantime we can start working with Vita and Switch.
Just found this via google because I was curious whether someone else worked on the format.
Since I did some research on it (and wrote this tool for Digimon Story) I figured I could provide some insights. Everything I describe is for Digimon Story Cyber Sleuth: Complete Edition (Steam). I suspect it applies to other instances of MediaVision games.
1. Encryption on Steam
The .mvgl files are all encrypted by XORing the data twice using two keys I was able to extract. (Keys) (Code)
This process is reversible, i.e. using the function on encrypted data will decrypt it and vice versa.
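A minimal sketch of what such a double XOR could look like. The function name, the way each key is indexed (position modulo key length) and the key contents are assumptions for illustration; the real keys and code are the ones linked above. Because XOR is its own inverse, calling the function twice restores the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: XORs every byte with one byte from each of two keys.
   Indexing each key by (position % key length) is an assumption. */
void mvgl_xor_crypt(uint8_t *data, size_t size,
                    const uint8_t *key1, size_t key1_len,
                    const uint8_t *key2, size_t key2_len)
{
    assert(key1_len > 0 && key2_len > 0);
    for (size_t i = 0; i < size; i++) {
        data[i] ^= (uint8_t)(key1[i % key1_len] ^ key2[i % key2_len]);
    }
}
```

Using keys of different lengths means the combined key stream only repeats after the least common multiple of both lengths, which would also explain why no short XOR pattern shows up in the file.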
2. MVGL types
The decrypted files can be classified into three types.
2.1. AFS2 (bgm, PDSEbgm, vo, vous, Pvo)
These are actually not encrypted in the first place and most likely contain audio files. I haven't done any work on this format yet.
2.2. OTTO (media/M10X)
These are simple OpenType font files. Rename them to .otf and you can import them into your OS.
2.3. MDB1 (DSDB, A, P, Pse, S, SE and SP)
This is the actually interesting part, since it contains the text. It's an archive format in effectively 5 parts, in sequence:
2.3.1. Header
The header is 20 bytes.
struct MDB1Header {
    uint32_t magicValue;
    uint16_t fileEntryCount;
    uint16_t fileNameCount;
    uint32_t dataEntryCount;
    uint32_t dataStart;
    uint32_t totalSize;
};
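A small sketch of reading those 20 bytes field by field instead of casting the buffer, which avoids depending on struct packing. Little-endian byte order is an assumption here, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t magicValue;
    uint16_t fileEntryCount;
    uint16_t fileNameCount;
    uint32_t dataEntryCount;
    uint32_t dataStart;
    uint32_t totalSize;
} MDB1Header;

/* Assumption: all fields are stored little-endian. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer is shorter than the 20-byte header. */
int mdb1_read_header(const uint8_t *buf, size_t len, MDB1Header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20) return 0;
    out->magicValue     = read_u32le(buf + 0);
    out->fileEntryCount = read_u16le(buf + 4);
    out->fileNameCount  = read_u16le(buf + 6);
    out->dataEntryCount = read_u32le(buf + 8);
    out->dataStart      = read_u32le(buf + 12);
    out->totalSize      = read_u32le(buf + 16);
    return 1;
}
```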
2.3.2. File Entry Table
Each file in the archive has an 8-byte file entry; together the entries form a bitwise binary search tree based on the file path.
struct FileEntry {
    int16_t compareBit;
    uint16_t dataId;
    uint16_t left;
    uint16_t right;
};
When searching the archive for a given path, the lookup starts at the root and checks whether the bit of the path denoted by compareBit is set. If it is, the search continues with the FileEntry pointed to by right, otherwise with the one pointed to by left. If the next entry has a compareBit smaller than or equal to the current one, it is a leaf and its dataId value points to the DataEntry of the file.
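The lookup described above can be sketched as follows. Several details are assumptions made for the example and are not confirmed by the format notes: entry 0 is treated as the root, bit n is taken from byte n / 8 counting from the least significant bit, and bits outside the path (or a negative compareBit) read as 0. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int16_t  compareBit;
    uint16_t dataId;
    uint16_t left;
    uint16_t right;
} FileEntry;

/* Assumption: bit n lives in byte n / 8, counted from the least significant
   bit; negative or out-of-range bit numbers read as 0. */
static int path_bit(const char *path, size_t len, int bit)
{
    if (bit < 0 || (size_t)bit / 8 >= len) return 0;
    return ((unsigned char)path[bit / 8] >> (bit % 8)) & 1;
}

/* Returns the index of the entry reached for `path`, or -1 if the walk
   leaves the table or does not end within `count` steps. */
long mdb1_find_entry(const FileEntry *entries, size_t count, const char *path)
{
    assert(entries != NULL && path != NULL);
    size_t len = strlen(path);
    size_t cur = 0; /* assumption: entry 0 is the root */
    for (size_t step = 0; step < count; step++) {
        const FileEntry *e = &entries[cur];
        size_t next = path_bit(path, len, e->compareBit) ? e->right : e->left;
        if (next >= count) return -1;
        /* A compareBit that does not increase marks a leaf. */
        if (entries[next].compareBit <= e->compareBit) return (long)next;
        cur = next;
    }
    return -1;
}
```

Like any bitwise trie lookup, this only finds the closest candidate; a caller would still compare the requested path with the matching entry in the File Name Entry Table before using dataId.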
Reconstructing the tree is a bit more difficult and my code isn't re-creating it in 100% the same way, but should fulfil the same constraints. The code can be found here. (I didn't do a good job documenting it :( )
2.3.3. File Name Entry Table
This is quite straightforward: 64 bytes per name, 4 containing the extension (padded with ' ') and 60 for the path (null terminated). The name entries correspond by position to the File Entries.
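As an illustration, a name entry could be turned back into a readable file name like this. The field order (extension first, then path), the struct and function names, and joining both parts with a '.' are assumptions for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the 4-byte extension precedes the 60-byte path. */
typedef struct {
    char extension[4]; /* padded with ' ' */
    char path[60];     /* null terminated */
} FileNameEntry;

/* Writes "path.ext" into out (at least 66 bytes) and returns its length. */
size_t mdb1_full_name(const FileNameEntry *e, char *out)
{
    assert(e != NULL && out != NULL);
    size_t n = 0;
    while (n < sizeof e->path && e->path[n] != '\0') {
        out[n] = e->path[n];
        n++;
    }
    size_t ext_len = sizeof e->extension;
    while (ext_len > 0 && e->extension[ext_len - 1] == ' ') ext_len--;
    if (ext_len > 0) {
        out[n++] = '.';
        memcpy(out + n, e->extension, ext_len);
        n += ext_len;
    }
    out[n] = '\0';
    return n;
}
```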
2.3.4. Data Entry Table
12 bytes for each file, referenced by the File Entry Table.
struct DataEntry {
    uint32_t offset;
    uint32_t size;
    uint32_t compressedSize;
};
2.3.5. Actual Data
As already identified this is compressed data using the doboz compression algorithm. The games may be able to handle uncompressed data as well, but since the MDB1 uses 32-bit pointers the file can't grow larger than 4GiB in total.
E.g. the Cyber Sleuth main archive is 12.7 GiB extracted but 2.6 GiB compressed, making compression necessary.
3. Archive contents, i.e. .mbe/EXPA (may only apply to Cyber Sleuth!)
Inside the archive there are a number of files with different formats (that may vary between platforms/games), but since this seems to be a translation tool, text files are the most relevant.
The game uses a relatively simple table format for its data files, including text. They all have a .mbe file extension and an EXPA magic value.
3.1. EXPA Header
The header is 2x4 bytes: the magic value and the number of sheets/tables in the file.
3.2. EXPA Table Header
struct EXPATableHeader {
    uint32_t nameSize;
    char* name;
    uint32_t entrySize;
    uint32_t entryCount;
};
The nameSize and name are padded to 4 bytes after adding two terminator bytes (so "Meramon" would have a nameSize of 0xC instead of 0x8). entrySize is the number of bytes per entry.
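One reading of that rule, written out: take the string length, add two terminator bytes and round up to the next multiple of 4. This reproduces the "Meramon" example, but other formulas would too, so treat the function below (and its name) as an assumption until it is checked against more table names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed rule: (length + 2 terminator bytes), rounded up to a multiple of 4. */
uint32_t expa_name_size(const char *name)
{
    assert(name != NULL);
    size_t raw = strlen(name) + 2;
    return (uint32_t)((raw + 3) & ~(size_t)3);
}
```

For "Meramon" this gives 7 + 2 = 9, rounded up to 12 (0xC).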
3.3. EXPA Table
There is nothing in the format to define its internal data structure, so it's up to the parser.
So far I've been able to identify 3 different data types:
- int
  - 32-bit in the table
  - aligned to 4 bytes
- string
  - 64-bit in the table, pointer (always 0 on disk)
  - aligned to 8 bytes
- int array
  - 64-bit in the table, pointer (always 0 on disk)
  - aligned to 8 bytes
For elements of variable size (strings, int arrays) the 64-bit values are placeholders for pointers that get replaced when the file is loaded in-engine from the CHNK section.
When bytes are skipped for alignment purposes they're filled with 0xCC.
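To make the alignment rules concrete, here is a sketch that computes field offsets for a given column list. The enum and function names are hypothetical, and trailing padding after the last field is not modeled, so the return value is only the end of the last field.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EXPA_INT, EXPA_STRING, EXPA_INT_ARRAY } ExpaType;

/* Fills offsets[i] with the byte offset of field i inside one entry and
   returns the end of the last field. Gaps between fields are the bytes
   that would be filled with 0xCC on disk. */
size_t expa_layout(const ExpaType *fields, size_t count, size_t *offsets)
{
    assert(fields != NULL && offsets != NULL);
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        /* ints are 4 bytes, pointer placeholders 8; alignment equals size */
        size_t size = (fields[i] == EXPA_INT) ? 4 : 8;
        pos = (pos + size - 1) & ~(size - 1);
        offsets[i] = pos;
        pos += size;
    }
    return pos;
}
```

With two ints followed by strings, the first string starts at offset 8 without any gap. With a single int followed by a string, the string is still pushed to offset 8, leaving 4 padding bytes at offset 4.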
3.4. CHNK
After all the EXPA tables the CHNK section begins, which contains all the variable size data.
struct CHNKHeader {
    uint32_t magicValue;
    uint32_t numEntry;
};
When loading the file, the game writes a pointer to each CHNKEntry at the offset specified by that entry.
struct CHNKEntry {
    uint32_t offset;
    uint32_t size;
    char* string;
};
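A sketch of walking the CHNK entries, assuming they are stored back to back after the CHNKHeader, each as offset, size and then size bytes of inline data, all little-endian. Whether extra padding sits between entries is not known here. ChnkEntryView and chnk_next are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t offset;     /* where the placeholder to patch is located */
    uint32_t size;       /* number of data bytes that follow */
    const uint8_t *data; /* points into the caller's buffer */
} ChnkEntryView;

static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads the entry at *pos and advances *pos past it.
   Returns 1 on success, 0 if the buffer ends before the entry does. */
int chnk_next(const uint8_t *buf, size_t len, size_t *pos, ChnkEntryView *out)
{
    assert(buf != NULL && pos != NULL && out != NULL);
    if (*pos > len || len - *pos < 8) return 0;
    out->offset = read_u32le(buf + *pos);
    out->size   = read_u32le(buf + *pos + 4);
    if (len - *pos - 8 < out->size) return 0;
    out->data = buf + *pos + 8;
    *pos += 8 + (size_t)out->size;
    return 1;
}
```

A loader would call this numEntry times, starting with pos just after the CHNKHeader, and use each entry's offset to find the 64-bit placeholder it belongs to.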
3.5. Structure of text files
In Cyber Sleuth the relevant text files are located in the /message/ and /text/ folders.
The /message/*.mbe files all contain a single table named "Sheet1" with the structure:
struct Message {
    uint32_t messageId;
    uint32_t speakerId; // see text/charnames.mbe
    string unknownLanguage1;
    string english;
    string chinese;
    string unknownLanguage2;
    string korean;
    string german;
};
The /text/*.mbe files are the same except that they don't have a speakerId (it's 0xCCCCCCCC due to alignment) and that in some files the table name is "para" or "event".
4. End
I hope that information is without errors; some assumptions I made may turn out to be wrong and some elements are not fully developed yet (e.g. the EXPA data types). Given that I've been contacted by people who may want to create translations for this game, I hope all of this is useful to someone. :)
Sorry for the lack of an answer, I've continued the investigation. Apart from @SydMontague's investigation, I took a look at the image and video formats.
Image
An easy one: a DDS header and ABGR8 channels in the PC version, probably a direct conversion from the Vita version. Due to endianness, the Switch version possibly uses RGBA8 encoding instead.
You can open the files with the Intel Texture Works plugin, and saving them without compression (None 32bpp) outputs a file with the same filesize.
Video
This one is a bit more tricky. The USM format is common, but this container holds VP9-encoded video that is probably encrypted.
I've found in the specification that encryption is possible in this kind of format.
When you demux the USM format with VGM Toolbox, the output is an IVF container.
Unfortunately, this "encryption" (not confirmed) is also found in the Switch version.
Metadata: vp9 (libvpx-vp9), yuv420p, 960x540, q=-1--1, 29.97fps, 1k tbn, 29.97 tbc
About the possibility of encryption in the USM files, I found tools related to that, and they say that...
CRID USM extractor
v1 by nyaga (https://github.com/Nyagamon)
v2 by bnnm (added new options, xorkey files)
v3 by bnnm (added audio stream selection)
Some .USM are encrypted using a 64b key (like HCA) so when demuxing the .adx you get bad audio.
How to decrypt:
- unzip CRID(.usm)Demux Tool v1.01-mod
- find the game's USM key, often the same as the HCA key (see hca_keys.h); example, in hex: 006CCC569EB1668D
- call crid_mod.exe with the key divided in two to extract .adx: crid_mod.exe -b 006CCC56 -a 9EB1668D -x Opening.usm
- or use flags to extract video, info and stuff: crid_mod.exe -b 006CCC56 -a 9EB1668D -i -v Opening.usm
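Going by the example in the steps above, -b takes the upper 32 bits of the 64-bit key and -a the lower 32 bits. A tiny helper (the name is made up for this example) shows the split:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Splits a 64-bit USM key into the two halves passed as -b (upper 32 bits)
   and -a (lower 32 bits), following the example key in the steps above. */
void usm_split_key(uint64_t key, uint32_t *b, uint32_t *a)
{
    assert(b != NULL && a != NULL);
    *b = (uint32_t)(key >> 32);
    *a = (uint32_t)(key & 0xFFFFFFFFu);
}
```

For 0x006CCC569EB1668D this yields b = 006CCC56 and a = 9EB1668D.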
If your .usm has multiple audio streams don't forget to use -s N to extract one by one (manually ATM) or you'll get a garbled .adx
If you don't have the USM key you may still be able to decrypt audio. The key is used to get a 0x20 xor, and .adx often have long silent (blank) parts, meaning easy keys.
- open .usm in hex editor, look for "@SFA" chunks
- if chunk is bigger than 0x140 try to find repeating 0x20 patterns, that's the key (encryption is only applied after 0x140, some chunks are smaller than that)
- if no patterns try other videos, silence/blank is often at the beginning or end of files
- save 0x20 pattern key into key.bin
- call CRID: crid_mod.exe -m key.bin Opening.usm
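The manual pattern search in the steps above could be automated roughly like this. It is only a sketch: it assumes the buffer starts at the beginning of an "@SFA" chunk so that 0x140 is measured from there, and it accepts a candidate only if the same 0x20 bytes repeat for a caller-chosen number of periods. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Checks whether the bytes from 0x140 onward repeat with a period of 0x20
   for `periods` consecutive periods. On success copies the 0x20-byte pattern
   into key and returns 1; otherwise returns 0 and leaves key untouched. */
int usm_guess_audio_key(const uint8_t *chunk, size_t len, size_t periods,
                        uint8_t key[0x20])
{
    const size_t start = 0x140, klen = 0x20;
    assert(chunk != NULL && key != NULL);
    if (periods < 2 || periods > (len / klen)) return 0;
    if (len < start + klen * periods) return 0;
    for (size_t i = klen; i < klen * periods; i++) {
        if (chunk[start + i] != chunk[start + i % klen]) return 0;
    }
    memcpy(key, chunk + start, klen);
    return 1;
}
```

The copied bytes are what would be saved as key.bin. Requiring more periods lowers the chance of a false positive in real audio data.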
Video uses a slightly different method I don't think you can get as easily.
Needless to say, you need the original .usm; you can't use vgmtoolbox to demux first.
-- bnnm
We are searching for a key in the executable right now.
HCA key for USM files: -b 283553DC -a E3FD5FB9
@Megaflan can you describe how you got the key from the executable? And does the key work with video?