
Chunked archive readers for large files

yretenai opened this issue 1 year ago • 6 comments

Allows large files to be read. The issue occurs because these files (usually .umaps) are over 2 GB, which exceeds the maximum size of a .NET array. This new reader loads the file in chunks held in a 128 MB buffer.
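A rough sketch of the approach (illustrative names, not the PR's actual classes): instead of materializing the whole file into one byte[], the reader keeps a fixed 128 MB window over the underlying stream and refills it whenever a read falls outside the buffered range.

    using System;
    using System.IO;

    // Illustrative sketch, not CUE4Parse's actual reader: a fixed-size window
    // buffer is refilled from the stream whenever a read falls outside the
    // currently buffered range, so the file never has to fit into one array.
    public sealed class ChunkedFileReader : IDisposable
    {
        private const int WindowSize = 128 * 1024 * 1024; // 128 MB buffer

        private readonly FileStream _stream;
        private readonly byte[] _window = new byte[WindowSize];
        private long _windowStart = -1; // file offset of _window[0]
        private int _windowLength;      // number of valid bytes in _window

        public ChunkedFileReader(string path) =>
            _stream = new FileStream(path, FileMode.Open, FileAccess.Read);

        public long Length => _stream.Length;

        public byte ReadByteAt(long offset)
        {
            if (_windowStart < 0 || offset < _windowStart || offset >= _windowStart + _windowLength)
                Refill(offset);
            return _window[offset - _windowStart];
        }

        private void Refill(long offset)
        {
            _stream.Seek(offset, SeekOrigin.Begin);
            _windowStart = offset;
            _windowLength = _stream.Read(_window, 0, WindowSize);
        }

        public void Dispose() => _stream.Dispose();
    }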

This might prevent the issue described in #95.

I briefly tested this and it seems to work, but due to the minor refactor in PakFileReader, I'm not sure whether I've introduced side effects.

yretenai • Feb 09 '24 09:02

https://github.com/FabianFG/CUE4Parse/pull/127/commits/7bfb62aa7fe7ebfb0c8fb8c8e580792b1cf5e9af adds two new options in Globals:

  • AllowLargeFiles, defaulting to false, which prevents large ubulk and uptnl files from loading. This will reasonably error when processing embedded vertex streams (landscape proxies) and textures (usually heightmaps and weightmaps) in maps. Would it be better to just return null?
  • LargeFileLimit, which controls what is considered a large file (currently set to 2 GB; should it be lower?)

Large files, as determined by LargeFileLimit, will now also always use the chunked reader.
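In rough terms, the gate behaves like this (a sketch only; the Globals field names come from the commit above, while ReaderFactory, OpenChunked, and OpenBuffered are hypothetical stand-ins for the PR's actual reader construction):

    using System;
    using System.IO;

    public static class Globals
    {
        public static bool AllowLargeFiles = false;                  // reject large ubulk/uptnl files by default
        public static long LargeFileLimit = 2L * 1024 * 1024 * 1024; // 2 GB threshold
    }

    public static class ReaderFactory
    {
        public static Stream Open(string path, long size)
        {
            if (size >= Globals.LargeFileLimit)
            {
                if (!Globals.AllowLargeFiles)
                    throw new InvalidOperationException($"{path} exceeds LargeFileLimit");
                return OpenChunked(path); // large files always go through the chunked reader
            }
            return OpenBuffered(path);    // small files can still be read into a single array
        }

        private static Stream OpenChunked(string path) => new FileStream(path, FileMode.Open, FileAccess.Read);
        private static Stream OpenBuffered(string path) => new MemoryStream(File.ReadAllBytes(path));
    }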

yretenai • Feb 10 '24 09:02

added HasValidSize, which validates the size.
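Presumably something along these lines, building on the Globals sketch above (a hypothetical shape; the real signature may differ):

    // Hypothetical sketch: a size is valid when it is non-negative and, unless
    // large files are allowed, stays under the configured limit.
    public static bool HasValidSize(long size) =>
        size >= 0 && (Globals.AllowLargeFiles || size < Globals.LargeFileLimit);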

yretenai • Feb 11 '24 15:02

I got an out-of-memory error with this patch when extracting Palworld's umap file "Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5":

    var allExports = provider.LoadAllObjects(path);
    var fullJson = JsonConvert.SerializeObject(allExports, Formatting.Indented);
    File.WriteAllText(mapJsonPath, fullJson);

I also get out of memory when extracting "Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5/_Generated_/"; the original version is able to extract this directory.

Chuanhsing • Feb 20 '24 08:02

> I got an out-of-memory error with this patch when extracting Palworld's umap file "Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5"

this PR only enables reading of large files; you would still end up using a large amount of memory if you serialize everything to a single JSON string.
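if peak memory is the problem, one option (a suggestion, not part of this PR) is to stream the JSON straight to disk with Newtonsoft's JsonSerializer instead of building the whole string first, reusing allExports and mapJsonPath from your snippet:

    using System.IO;
    using Newtonsoft.Json;

    // Streams the serialized output directly to the file, so the complete JSON
    // string never has to exist in memory at once.
    using var file = File.CreateText(mapJsonPath);
    using var jsonWriter = new JsonTextWriter(file);
    var serializer = new JsonSerializer { Formatting = Formatting.Indented };
    serializer.Serialize(jsonWriter, allExports);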

> I also get out of memory when extracting "Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5/_Generated_/"; the original version is able to extract this directory.

the new code shouldn't run on any of the Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5/_Generated_/ umaps, as they're all under 2 GB. I'll check again later; it could be that my refactors to decompression broke something.

yretenai • Feb 21 '24 01:02

> the new code shouldn't run on any of the Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5/_Generated_/ umaps, as they're all under 2 GB. I'll check again later; it could be that my refactors to decompression broke something.

To provide more information: it stops at Pal/Content/Pal/Maps/MainWorld_5/PL_MainWorld5/_Generated_/MainGrid_L0_X-15_Y-3_DL0, which normally generates a 7.53 MB JSON file. The computer has 60 GB of free memory before running the program. In Globals.cs, both AlwaysUseChunkedReader and AllowLargeFiles are set to true.

Chuanhsing • Feb 22 '24 23:02