openage icon indicating copy to clipboard operation
openage copied to clipboard

Python 3.8 Multiprocessing issue / converting media stuck on graphics/35.slp

Open ericdude4 opened this issue 5 years ago • 29 comments
trafficstars

MacOS Catalina 10.15.1 Assets: Age of Empires 2 Conquerors

The conversion seems to work until it gets to graphics/35.slp and then it just hangs

ericdude4 avatar Mar 23 '20 18:03 ericdude4

Huh, that's weird. For clarification: Does 35.slp still get converted (i.e. there's a 35.slp.png file in assets/converted/graphics/) or does it only convert up to that point?

heinezen avatar Mar 24 '20 10:03 heinezen

@heinezen Oddly, although the conversion output seems to imply that some .slp files are being successfully imported, there doesn't seem to be a assets/converted/graphics/ directory.

Here's the output:

...
INFO [py] [1331/3605] sounds/6442.wav
INFO [py] [1332/3605] sounds/6443.wav
INFO [py] [1333/3605] sounds/6444.wav
INFO [py] [1334/3605] sounds/6445.wav
INFO [py] [1335/3605] graphics/2.slp
INFO [py] [1336/3605] graphics/5.slp
INFO [py] [1337/3605] graphics/8.slp
INFO [py] [1338/3605] graphics/9.slp
INFO [py] [1339/3605] graphics/12.slp
INFO [py] [1340/3605] graphics/21.slp
INFO [py] [1341/3605] graphics/22.slp
INFO [py] [1342/3605] graphics/23.slp
INFO [py] [1343/3605] graphics/24.slp
INFO [py] [1344/3605] graphics/33.slp
INFO [py] [1345/3605] graphics/34.slp
INFO [py] [1346/3605] graphics/35.slp

Heres the contents of the assets/converted directory:

blending_modes.docx   gamedata              player_palette.docx   string_resources.docx
blendomatic           gamespec_version      sounds                termcolors.docx

I have a 6 core machine, and I see that the conversion happens in parallel. I am seeing that there are 12 .slp files here which begin conversion. With hyperthreading on 12 cores that makes sense. I'm guessing that none of the .spl files are converting.

I should add that I am using an old portable version of AoE2:Conquerors that I used to LAN with friends in high school. However, looking in the data directory - everything seems to be there.

ericdude4 avatar Mar 24 '20 11:03 ericdude4

Can you invoke the conversion with more verbosity (-v or -vv) please? Then we should see where the process gets stuck.

TheJJ avatar Mar 24 '20 11:03 TheJJ

If there's no graphics directory, then this is probably caused by the multi-core conversion routine. ~~Does your processor have 6 cores/12 threads?~~ Edit: Nevermind, you already mentioned that

heinezen avatar Mar 24 '20 12:03 heinezen

Wow! I just noticed that the interface.drs - in my game's data directory is spelled interfac.drs 🤦‍♂

Edit: Tried this, seems that interfac.drs is the correct spelling

ericdude4 avatar Mar 24 '20 12:03 ericdude4

@TheJJ Is there a convert command which I can append those flags to? I am using make run and the -vv flags just show me version info

ericdude4 avatar Mar 24 '20 12:03 ericdude4

Use bin/run convert --help for a list of commands, e.g. bin/run convert -v

heinezen avatar Mar 24 '20 12:03 heinezen

This issue might be related to #1091 (on Windows).

heinezen avatar Mar 24 '20 12:03 heinezen

With -v:

INFO [py] [1332/3605] sounds/6443.wav
 DBG [T3] [py] Wavefile: Total length: 54392	Header length: 44	Trailer length: 76
 DBG [T3] [py] Resampling necessary: input rate: 22050	coding rate: 24000
 DBG [T3] [py] Resampled data length: 59070
 DBG [T3] [py] Starting encoding loop.
INFO [py] [1333/3605] sounds/6444.wav
 DBG [T12] [py] Wavefile: Total length: 14834	Header length: 44	Trailer length: 76
 DBG [T12] [py] Resampling necessary: input rate: 22050	coding rate: 24000
 DBG [T12] [py] Resampled data length: 16014
 DBG [T12] [py] Starting encoding loop.
INFO [py] [1334/3605] sounds/6445.wav
INFO [py] [1335/3605] graphics/2.slp
INFO [py] [1336/3605] graphics/5.slp
INFO [py] [1337/3605] graphics/8.slp

Looks like that last 6445.wav file doesn't have any debug info following it. 🤷‍♂

ericdude4 avatar Mar 24 '20 12:03 ericdude4

Can you try the workaround from the top post #1091 and see if that changes anything? The file mentioned should be bin/openage/convert/slp_converter_pool.py

heinezen avatar Mar 24 '20 12:03 heinezen

Here is the contents of my data directory:

BlkEdge.Dat       TileEdge.Dat      gamedata_COPY.drs list.crx
FilterMaps.dat    blendomatic.dat   gamedata_x1.drs   shadow.col
LoQMaps.dat       closedpw.exe      graphics.drs      sounds.drs
Load              empires2.dat      interfac.drs      sounds_x1.drs
PatternMasks.dat  empires2_x1.dat   lightMaps.dat     terrain.drs
STemplet.dat      gamedata.drs      list.cr           view_icm.dat

ericdude4 avatar Mar 24 '20 12:03 ericdude4

@heinezen Sorry, I don't completely follow. I see that the has_x1_p1 function was merged into master a while ago. There is no suggestion in the top post. Did you send a link to the correct thread?

ericdude4 avatar Mar 24 '20 12:03 ericdude4

ups sry, I had the numbers of the issue flipped. i wanted to mention #1091 :D

heinezen avatar Mar 24 '20 12:03 heinezen

Cool! Progress. Making that change gave me a stack trace:

Traceback (most recent call last):
  File "run.py", line 15, in init run
    main()
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/__main__.py", line 129, in main
    return args.entrypoint(args, cli.error)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/main.py", line 644, in main
    if not convert_assets(outdir, args, srcdir):
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/main.py", line 192, in convert_assets
    for current_item in convert(args):
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 120, in convert
    yield from convert_media(args)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 291, in convert_media
    yield from concurrent_chain(
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/util/threading.py", line 48, in concurrent_chain
    raise value
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/util/threading.py", line 106, in generator_to_queue
    for item in generator:
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 419, in convert_mediafile
    convert_slp(filepath, dirname, names_map, converter_pool, args)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 360, in convert_slp
    texture = converter_pool.convert(indata, cutter)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/slp_converter_pool.py", line 102, in convert
    return Texture(SLP(slpdata), self.palette, custom_cutter=custom_cutter)
  File "openage/convert/slp.pyx", line 138, in openage.convert.slp.SLP.__init__
    spam(frame_info)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 101, in spam
    logging.spam(msg, *args, **kwargs)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 68, in _spam
    self._log(SPAM, msg, args, **kwargs)  # pylint: disable=W0212
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1577, in _log
    self.handle(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1587, in handle
    self.callHandlers(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
    hdlr.handle(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 950, in handle
    self.emit(record)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 45, in emit
    log_cpp(cpp_level, record.getMessage(), record.filename, record.funcName, record.lineno)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "openage/convert/slp.pyx", line 208, in openage.convert.slp.FrameInfo.__repr__
    "% 4d" % self.version,
TypeError: %d format: a number is required, not bytes

In the concurrent environment, this crash probably didn't stop the overall process.

ericdude4 avatar Mar 24 '20 12:03 ericdude4

This is an error, but I don't think it's the one that caused your initial crash. Can you run again without the -v flag?

heinezen avatar Mar 24 '20 15:03 heinezen

Compilation worked! I ran into another issue now. It may be that my version of AoE is just messed. I feel bad for taking up your time for my probably one-off issue, so I am willing to give up. However, here is the stack trace I am getting: (I am not a python guy)

Traceback (most recent call last):
  File "run.py", line 15, in init run
    main()
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/__main__.py", line 129, in main
    return args.entrypoint(args, cli.error)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/game/main.py", line 78, in main
    return run_game(args, root)
  File "openage/game/main_cpp.pyx", line 47, in openage.game.main_cpp.run_game
    result = run_game_cpp(args_cpp)
  File "", line 0, in __pyx_pw_7openage_4game_8main_cpp_1run_game(_object*, _object*, _object*)+0xd71
  File "", line 0, in openage::run_game(openage::main_arguments const&)+0x184
  File "", line 0, in openage::Engine::Engine(openage::util::Path const&, int, bool, char const*)+0xff6
  File "", line 0, in openage::gui::GUI::GUI(SDL_Window*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, openage::gui::EngineQMLInfo*)+0x60d
  File "", line 0, in openage::shader::Program::link()+0x39
  File "", line 0, in openage::shader::Program::check(unsigned int)+0x18d
  File "", line 0, in openage::error::Error::Error(openage::log::message const&, bool, bool)+0x111
  File "libopenage/shader/program.cpp", line 97, in void openage::shader::Program::check(GLenum)
    auto errormsg = MSG(err);
openage.cppinterface.exctranslate.CPPException: Program validation failed
Validation Failed: Current draw framebuffer is invalid.

ericdude4 avatar Mar 24 '20 15:03 ericdude4

Looks like you've reached #1207 which is still not solved unfortunately :/ I fixed your other bug with the version output though in #1224.

For the multiprocessing issue: Don't know why it locks up on Windows and macOS, but still works on Linux. This might be an upstream issue.

heinezen avatar Mar 24 '20 15:03 heinezen

if #1222 isn't fixed as you are saying, I can create an issue with my stack trace included. Would love to get this game running on my machine :) Major opportunities for some AoE games w/ friends during this COVID isolation.

ericdude4 avatar Mar 24 '20 16:03 ericdude4

That would be great! :) Macs are always hard to test on because I don't own one .

(Also the issue I wanted to refer to was #1207. Me and numbers don't go well today.)

heinezen avatar Mar 24 '20 16:03 heinezen

Closing in favour of https://github.com/SFTtech/openage/issues/1091 for the concurrency related issue, and https://github.com/SFTtech/openage/issues/1207 for the Mac execution error

ericdude4 avatar Mar 24 '20 16:03 ericdude4

@ericdude4 By any chance, were you using Python 3.8.2 to compile and run openage?

simonsan avatar Apr 18 '20 00:04 simonsan

@ericdude4 By any chance, were you using Python 3.8.2 to compile and run openage?

@simonsan I was using Python 3.8.0

ericdude4 avatar Apr 18 '20 02:04 ericdude4

I could think of that this issue is more related to Python 3.8.0 due to something like on macOS, the spawn start method is now used by default in multiprocessing

https://www.python.org/downloads/release/python-382/

@ericdude4 Could you try using Python 3.7.x and see if it doesn't work as well just to be sure?

simonsan avatar Apr 19 '20 23:04 simonsan

Weird.. If this is indeed the case, then I don't think it's a bug in our code. Just changing the Python interpreter's start method shouldn't change the behavior. https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods Let's see if we can figure this out. @ericdude4 we count on your help :D

TheJJ avatar Apr 20 '20 10:04 TheJJ

ericdude4 to the rescue! Getting the following error on make on current master branch:

➜  openage git:(master) ✗ make
[  0%] openage.codegen: generating c++ code
[  0%] Automatic MOC for target libopenage
[  1%] Building CXX object libopenage/CMakeFiles/libopenage.dir/versions/versions.cpp.o
/Users/ericfroese/Downloads/openage/libopenage/versions/versions.cpp:3:10: fatal error: 'gnu/libc-version.h' file not found
#include <gnu/libc-version.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated.
make[3]: *** [libopenage/CMakeFiles/libopenage.dir/versions/versions.cpp.o] Error 1
make[2]: *** [libopenage/CMakeFiles/libopenage.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [build] Error 2

Getting this error with python 3.7 and 3.8 Following these instructions as usual: https://github.com/SFTtech/openage/blob/master/doc/build_instructions/os_x_10.14_mojave.md

ericdude4 avatar Apr 20 '20 13:04 ericdude4

@ericdude4 Can you build the commit before the last one, we're currently working on fixing that. :-)

simonsan avatar Apr 20 '20 14:04 simonsan

@simonsan It's still producing the same multithreading error with Python 3.7.0, however, when I run the ./configure script, I am getting this info at the end of the output (even though python --version outputs 3.7.0)

openage 0.4.0.125

   version string | v0.4.0-125-g6467e268
         compiler | Clang 8.0.1
       build type | Release
         cxxflags |  -stdlib=libc++  -Wall -Wextra -pedantic -Wno-gnu-statement-expression
 build type flags | -DNDEBUG -O3
        build dir | /Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none
   install prefix | /usr/local
py install prefix | /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages

(notice py install prefix | /Library/Frameworks/Python.framework/Versions/3.8/...) Is there somewhere I need to clear a cache or something? Unfortunately not a Python guy

ericdude4 avatar Apr 20 '20 14:04 ericdude4

You could try doing make mrproperer to be sure everything is clean and then run configure again. Let's see if py install prefix changes then

simonsan avatar Apr 20 '20 14:04 simonsan

Yep! It worked: py install prefix | /Users/ericfroese/.pyenv/versions/3.7.0/lib/python3.7/site-packages

Also, the conversion seems to have worked properly now after switching to python 3.7 🥳

INFO [py] asset conversion complete; asset version: 7

ericdude4 avatar Apr 20 '20 15:04 ericdude4

Multiprocessing was replaced with a different implementation a while ago, so this bug shouldn't conme up anymore.

heinezen avatar Sep 08 '23 21:09 heinezen