godot icon indicating copy to clipboard operation
godot copied to clipboard

--convert-3to4 hangs. Also fails on files >500kb

Open TokisanGames opened this issue 1 year ago • 3 comments

Godot version

5c6744ae54189869580d713a6c31caa90492838a

System information

Win 10/64 NVIDIA GeForce GTX 1060/PCIe/SSE2

Issue description

6GB repo, 5100 files (assets, code, scenes). Now that #63377 is fixed, I ran the conversion again on my project. It identified 2011 files to process.

First I started with --validate-conversion-3to4, then went on to --convert-3to4. Both had the same behavior.

1 - 500kb Limit

There should be no such limit. If the editor can read such files, the conversion tool should be able to. Other conversion scripts can process them fine. I have plenty of large, text based assets and scenes. Binaries are no good for frequently changing files stored in git.

Checking for conversion - 1001/2011 file - "assets/models/rocks/mountain_pack/meshes/Mountain_rock_big_02_1.tscn" with size - 4153 KB
    Checking file took 0 ms.
        ERROR: File has exceeded the maximum size allowed - 500 KB

Also the check isn't even consistent. It's ok with this file >500kb, but not others:

Trying to convert       568/2011 file - "assets/models/village/fireplaces/FireplaceTop.tscn" with size - 1227 KB
    File was changed, conversion took 11848 ms.

2 - Hanging

It was chugging away fine validating everything, skipping anything above 500kb. Then it hit some file that it should have skipped due to the above limit and is hanging at 100% CPU on one thread. 220mb used.

Checking for conversion - 1000/2011 file - "assets/models/rocks/mountain_pack/meshes/Mountain_rock_big_01_5.tscn" with size - 1440 KB
    Checking file took 4557 ms.
                Line (75) CollisionShape -> CollisionShape3D  -  LINE """ [node name="CollisionShape" type="CollisionShape" parent="."] """
                Line (54) ConcavePolygonShape -> ConcavePolygonShape3D  -  LINE """ [sub_resource type="ConcavePolygonShape" id=4] """
                Line (61) ImmediateGeometry -> ImmediateMesh  -  LINE """ [node name="RootNode" type="ImmediateGeometry" parent="."] """
                Line (64) MeshInstance -> MeshInstance3D  -  LINE """ [node name="mountain_rock_big_01_5_LOD0" type="MeshInstance" parent="RootNode"] """
                Line (67) MeshInstance -> MeshInstance3D  -  LINE """ [node name="mountain_rock_big_01_5_LOD1" type="MeshInstance" parent="RootNode"] """
                Line (71) MeshInstance -> MeshInstance3D  -  LINE """ [node name="mountain_rock_big_01_5_LOD2" type="MeshInstance" parent="RootNode"] """
                Line (57) StaticBody -> StaticBody3D  -  LINE """ [node name="Mountain_rock_big_01_5" type="StaticBody"] """
                Line (10) PoolByteArray -> PackedByteArray  -  LINE """ "array_data": PoolByteArray( 196, 23, 214, 190, 58, 231, 20, 190, 84, 38, 94, 190, 146, 199, 25, 0, 213, 107, 51, 127, 255, 255, 255, 255, 4, 44, 69, 53, 94, 46, 154, 46, 43, 116, 195, 190, 215, 155, 90, 190, 252, 89, 89, 190, 146, 199, 25, 0, 213, 107, 51, 127, 255, 255, 255, 255, 48, 41, 118, 53, 57, 44, 192, 47, 143, 146, 216, 190, 35, 153, 32, 190, 57, 13, 135, 190, 146, 199, 25, 0, 213, 10... """
                Line (11) PoolByteArray -> PackedByteArray  -  LINE """ "array_index_data": PoolByteArray( 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41, 0, 42, 0, 43, 0, 44, 0, 45, 0, 46, 0, 47, 0, 48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 5... """
                Line (26) PoolByteArray -> PackedByteArray  -  LINE """ "array_data": PoolByteArray( 47, 205, 140, 190, 125, 132, 86, 190, 60, 3, 181, 190, 31, 2, 134, 0, 118, 223, 29, 127, 255, 255, 255, 255, 205, 54, 65, 53, 103, 57, 100, 53, 151, 152, 83, 190, 71, 145, 11, 190, 194, 29, 171, 190, 31, 2, 134, 0, 118, 223, 29, 127, 255, 255, 255, 255, 10, 55, 182, 53, 191, 57, 9, 53, 214, 216, 156, 190, 211, 21, 32, 190, 55, 143, 184, 190, 31, 2, 134, 0, 118, 223,... """
                Line (27) PoolByteArray -> PackedByteArray  -  LINE """ "array_index_data": PoolByteArray( 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41, 0, 42, 0, 43, 0, 44, 0, 45, 0, 46, 0, 47, 0, 48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 5... """
                Line (42) PoolByteArray -> PackedByteArray  -  LINE """ "array_data": PoolByteArray( 175, 165, 209, 58, 138, 35, 212, 188, 6, 9, 143, 190, 23, 2, 132, 0, 120, 222, 21, 127, 255, 255, 255, 255, 236, 55, 160, 54, 112, 58, 103, 51, 58, 107, 146, 62, 146, 64, 25, 190, 91, 244, 105, 190, 23, 2, 132, 0, 120, 222, 21, 127, 255, 255, 255, 255, 200, 56, 104, 54, 69, 58, 229, 44, 160, 110, 131, 62, 33, 237, 249, 60, 80, 97, 107, 190, 23, 2, 132, 0, 120, 222, ... """
                Line (43) PoolByteArray -> PackedByteArray  -  LINE """ "array_index_data": PoolByteArray( 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41, 0, 42, 0, 43, 0, 44, 0, 45, 0, 46, 0, 47, 0, 48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 5... """
                Line (55) PoolVector3Array -> PackedVector3Array  -  LINE """ data = PoolVector3Array( 0.0016, -0.0259, -0.2794, 0.286, -0.1497, -0.2285, 0.2567, 0.0305, -0.2299, 0.1979, -0.3036, 0.366, -0.3022, -0.2952, -0.252, -0.3018, -0.3007, 0.0349, -0.4746, -0.0009, 0.0372, -0.4267, -0.0882, 0.1183, -0.4991, 0.0595, 0.017, 0.1411, 0.1554, -0.2544, 0.1983, 0.0884, -0.2179, 0.2583, 0.1204, -0.0577, -0.4107, 0.1271, 0.0184, -0.4746, -0.0009, 0.0372, -0.4991, 0.0595, 0... """
Checking for conversion - 1001/2011 file - "assets/models/rocks/mountain_pack/meshes/Mountain_rock_big_02_1.tscn" with size - 4153 KB
    Checking file took 0 ms.
        ERROR: File has exceeded the maximum size allowed - 500 KB
Checking for conversion - 1002/2011 file - "assets/models/rocks/mountain_pack/meshes/Mountain_rock_big_02_2.tscn" with size - 3878 KB

After about 30+ minutes I finally hit ctrl+c and saw this:

ERROR: BUG: Unreferenced static string to 0: ShaderCompilation
   at: unref (core/string/string_name.cpp:129)
ERROR: BUG: Unreferenced static string to 0: interface_added
   at: unref (core/string/string_name.cpp:131)
ERROR: BUG: Unreferenced static string to 0: bus_layout_changed
   at: unref (core/string/string_name.cpp:131)

Structurally, there's no difference between Mountain_rock_big_02_1.tscn and Mountain_rock_big_02_2.tscn, and the next one _3 in case it started that before printing. I diffed them and they only differ by names or array data. They are static bodies as the root, an immediate geometry w/ a LOD script, three LOD mesh instances as arraymeshes, and a collision shape w/ a ConcavePolygonShape (trimesh generated off of LOD1 or 2. About the same number of vertices at each LOD level. As you see above the second file is smaller. _3 is between 1 and 2 at 4mb.

Here's the scene. Mountain_rock_big_02_2.zip

Here is the full conversion log so far. convert_log.zip

Testing what was converted

Finally, I opened up the half converted project to see if it is any better than my current conversion process. I'm especially interested in anything that might fix #63550.

After a couple hours with my own conversion process I got my error count on open under 100. After this half finished conversion, my error count is 2862. It's not a fair comparison, just a data point.

I looked at some files that were reported as converted and here's what I found:

  • Shaders - not converted properly, opened a separate issue here #63673

  • Mesh conversion - No different than the corrupted footbridge and support beam meshes shown in #63550

@qarmin

TokisanGames avatar Jul 30 '22 10:07 TokisanGames

  1. Message was wrong, For now there is a hard limit 4MB for file, because converting bigger files can take a lot of more time than expected(several minutes or hours) Message is fixed in new PR(not merged yet)

Such files like provided one, was a reason why I added such limit. In this file there is one line with more than 2 millions of characters, which probably causes problems with some regexes To fix issue, probably extracting arraymesh to .res files should help. From converter perspective, I think that lines greater than 10K lines could be probably ignored in converting(of course with appropriate message)

qarmin avatar Aug 03 '22 20:08 qarmin

Can you add an option to not care about file size?

fire avatar Aug 04 '22 13:08 fire

For now there is a hard limit 4MB for file, because converting bigger files can take a lot of more time than expected(several minutes or hours)

Godot can read and load much, much larger files just fine. My player character scene is 360mb with several meshes, 200 animations, and a complex animation tree, all stored as text. Only materials and textures are separate. IMO, you should not set these arbitrary limits. If Godot can handle it, so should the conversion tool.

From converter perspective, I think that lines greater than 10K lines could be probably ignored in converting(of course with appropriate message)

I don't think arbitrarily converting some lines and skipping others is a good idea for scene files or resources. Many people do not edit text scene or resource files. I'd say either pass or reject the whole file, unless it is a shader or gdscript.

For me individually, I've made significant process manually converting (aided by a script), so I'm just here to give feedback based on a real and large project to help improve the tool for future conversions once GD4 is stable.

TokisanGames avatar Aug 06 '22 20:08 TokisanGames

In #64396 I added option to change default line length and file size limits and also everything should work faster, so freezes should happen much less frequently(if any occur, of course).

I don't think arbitrarily converting some lines and skipping others is a good idea for scene files or resources. The only place where I saw lines bigger than 10K characters, were lines with PoolByteArrays which are automatically converted by Godot during loading/saving resources, so I don't think that this will be a problem for users.

Converter cannot work on same amount of data like Godot, so this limits are needed(now user can set how many files want to convert and how much can wait for results), because when loading tres/tscn files, Godot needs only to parse data and create objects basing on this data but converter needs to run on each line hundreds/thousands of rules so it is expected that converter won't handle as big files as Godot.

Additionally storing such big amount of data in text files(like tres or tscn) is highly discouraged due e.g. lower speed of loading them(I remember PR which showed dialog when user tried to save to text file, with advice to save it with e.g. res extension, but I can't find it now).

qarmin avatar Aug 15 '22 18:08 qarmin

Additionally storing such big amount of data in text files(like tres or tscn) is highly discouraged due e.g. lower speed of loading them(I remember PR which showed dialog when user tried to save to text file, with advice to save it with e.g. res extension, but I can't find it now).

I appreciate the concern, however this is not good advice for teams using git, like ours. Git does not do binary diffs, so working with frequently changing binary files in git is a poor choice. Godot does not behave well with developer's files, and constantly changes scenes and resources even when no change is made by users. Since the files are binary, we can't see what the actual changes are with git, so we have to either blindly upload them or blindly discard them. Neither is good.

Godot changes files if you have resource A, which is tied to resource B (eg a material) and you move or rename resource B. Or the smallest change of scene format, say a new line added or removed, which has happened continuously across almost all minor versions of Godot over the last few years (even in the "stable" 3.x branch).

I've seen adjustments to binary materials, animations, resources and scenes show up in my modified list every day. Godot constantly changes files even when we do not. I blew through one repository limited to 10gb, had to purge a lot of binary files and move my team over to a new repot. It got so bad, I had enough and converted every resource and scene to text except textures. Now I can see what changes are actually being made, and can make a conscious decision as to whether they are needed in the repo. When Godot decides to change things, uploads now are a few 100 bytes difference instead of 10-100MB.

Godot has a convert text resources to binary option on export, so text loading speed is not a concern except during development, and only with the very largest of files.

So from the experience of a game developer, not an engine developer, I discourage you and other engine devs from discouraging the use of text files for development. You should be encouraging every developer to use Git. And anyone using Git should be using text files for everything except textures, binary 3D objects that are inherited (likely rarer than the engine devs think) or other special files. I would only consider using binary resources when Godot stops changing scene and resource formats in every release, and is better designed for Git based projects.

TokisanGames avatar Aug 15 '22 20:08 TokisanGames

To wit, other version control mechanism like SVN also have problems with binary files.

Zireael07 avatar Aug 16 '22 10:08 Zireael07

To wit, other version control mechanism like SVN also have problems with binary files.

Yes. The only one I know of that does binary diffs is perforce, which is why it's popular among UE teams. UE has the same problem Godot does: Index information and asset content are in the same file. Godot has the benefit of a text based option. It might be better for revision control for both engines if index information (materials, external resources, etc) was in a text import file, then content could be saved in a binary file that never changes. But Godot/text + convert to binary on export is adequate.

TokisanGames avatar Aug 16 '22 10:08 TokisanGames