`memory_limit_bytes` AttributeError

Open JackTemaki opened this issue 3 years ago • 9 comments

This seems to be related to 76c3c1c5132e4a4792435d9d24bd5c03f738d1b1

The error is as follows:

  File "/work/asr4/rossenbach/sisyphus_work_folders/tts_asr_2021_work/i6_core/tools/git/CloneGitRepositoryJob.EwnvQKqmhown/output/repository/returnn/__main__.py", line 347, in init
    line: init_backend_engine()
    locals:
      init_backend_engine = <global> <function init_backend_engine at 0x1491bbc2d430>
  File "/work/asr4/rossenbach/sisyphus_work_folders/tts_asr_2021_work/i6_core/tools/git/CloneGitRepositoryJob.EwnvQKqmhown/output/repository/returnn/__main__.py", line 316, in init_backend_engine
    line: tf_util.print_available_devices(tf_session_opts=tf_session_opts, file=log.v2)
    locals:
      tf_util = <local> <module 'returnn.tf.util.basic' from '/work/asr4/rossenbach/sisyphus_work_folders/tts_asr_2021_work/i6_core/tools/git/CloneGitRepositoryJob.EwnvQKqmhown/output/repository/returnn/tf/util/basic.py'>
      tf_util.print_available_devices = <local> <function print_available_devices at 0x149158569c10>
      tf_session_opts = <local> {}
      file = <not found>
      log = <global> <returnn.log.Log object at 0x1491bbcd54c0>
      log.v2 = <global> <returnn.log.Stream object at 0x1491585e2f40>
  File "/work/asr4/rossenbach/sisyphus_work_folders/tts_asr_2021_work/i6_core/tools/git/CloneGitRepositoryJob.EwnvQKqmhown/output/repository/returnn/tf/util/basic.py", line 1180, in print_available_devices
    line: print(
            "Hostname %r, GPU %i, GPU-dev-name %r, GPU-memory %s" % (
              util.get_hostname(), dev_id, dev_name, util.human_bytes_size(dev.memory_limit_bytes)), file=file)
    locals:
      print = <builtin> <built-in function print>
      util = <local> <module 'returnn.util.basic' from '/work/asr4/rossenbach/sisyphus_work_folders/tts_asr_2021_work/i6_core/tools/git/CloneGitRepositoryJob.EwnvQKqmhown/output/repository/returnn/util/basic.py'>
      util.get_hostname = <local> <function get_hostname at 0x1491bbca68b0>
      dev_id = <local> 3
      dev_name = <local> 'GeForce GTX 1080 Ti', len = 19
      util.human_bytes_size = <local> <function human_bytes_size at 0x1491bbc9f1f0>
      dev = <local> name: "/device:GPU:0"
                    device_type: "GPU"
                    memory_limit: 10764901440
                    locality {
                      bus_id: 2
                      numa_node: 1
                      links {
                      }
                    }
                    incarnation: 13304672140063075166
                    physical_device_desc: "device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:83:00.0, compute capability: 6.1"

      dev.memory_limit_bytes = <local> !AttributeError: memory_limit_bytes
      file = <local> <returnn.log.Stream object at 0x1491585e2f40>
AttributeError: memory_limit_byte

JackTemaki avatar Nov 15 '22 13:11 JackTemaki

Can you post the full log? The TF version is especially relevant here, but also the TF flags, etc.

albertz avatar Nov 15 '22 13:11 albertz

So 4bfcc86c07095faacd6d5650347825c523a29a0f fixed it?

albertz avatar Nov 15 '22 23:11 albertz

I hit the `AttributeError: memory_limit_byte` when I tried to run `./rnn.py demos/demo-tf-attention.config` on a checkout that was a few days old, but after a `git pull` to 4bfcc86c it now works. Was there a fix?

braddockcg avatar Nov 16 '22 00:11 braddockcg

> So 4bfcc86 fixed it?

Yes, solved for me with 76c3c1c5132e4a4792435d9d24bd5c03f738d1b1

JackTemaki avatar Nov 16 '22 09:11 JackTemaki

> Can you post the full log? The TF version is especially relevant here, but also the TF flags, etc.

albertz avatar Nov 16 '22 09:11 albertz

Also, we are using the same attribute in other places. Why doesn't it break there?

albertz avatar Nov 16 '22 09:11 albertz

I do not have the log anymore. The TF version was 2.3, and I am not sure which flags you are talking about. I do not have anything TF-specific in the env for that setup.

JackTemaki avatar Nov 16 '22 09:11 JackTemaki

But I'm using the same TF version. So then your fix looks wrong? Can you check this?

albertz avatar Nov 16 '22 09:11 albertz

I saw in the documentation that `LogicalDeviceConfiguration` has this `memory_limit` property. Is that what you have here? But why? It would really be good to understand this better. Also, as said, we are using this attribute elsewhere as well, so it's not a good idea to have this fix in just one place.

albertz avatar Nov 16 '22 09:11 albertz
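
To make the "fix it in one place" point concrete, here is a minimal sketch of a shared accessor that every call site could use instead of reading `dev.memory_limit_bytes` directly. The helper name `get_device_memory_limit` is hypothetical and not part of RETURNN. The only grounded fact is from the traceback above: the dumped `dev` object has a `memory_limit` field and no `memory_limit_bytes`. The assumption (worth verifying against the TF version in use) is that some device objects expose `memory_limit_bytes` while others, like the one dumped here, expose `memory_limit`. The stand-in objects below replace real TF devices so the snippet runs without TensorFlow.

```python
from types import SimpleNamespace


def get_device_memory_limit(dev):
    """Return the device memory limit in bytes, or None if it is not available.

    Hypothetical helper (not part of RETURNN). It tries both attribute
    names that show up in this thread, in order.
    """
    for attr in ("memory_limit_bytes", "memory_limit"):
        value = getattr(dev, attr, None)
        if value is not None:
            return int(value)
    return None


# Stand-ins for device objects; real code would pass TF device descriptions.
# The value is the memory_limit shown in the traceback above.
proto_like = SimpleNamespace(name="/device:GPU:0", memory_limit=10764901440)
session_like = SimpleNamespace(name="/device:GPU:0", memory_limit_bytes=10764901440)
unknown = SimpleNamespace(name="/device:CPU:0")

for dev in (proto_like, session_like, unknown):
    print(dev.name, get_device_memory_limit(dev))
```

This only hides the difference between the two attribute names. It does not explain why the object in `print_available_devices` differs from the ones used elsewhere, which is the open question in this thread.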