node hash includes computer uuid. why?

Open ltalirz opened this issue 4 years ago • 9 comments

While playing around with mock codes, I ran into a couple of issues related to node hashing:

The hash of any Node currently includes self.computer.uuid (if the node has an associated computer).

https://github.com/aiidateam/aiida-core/blob/358b9167cfc64ad6bcff0b13837c3295aded0d2c/aiida/orm/nodes/node.py#L1140

I have two questions here:

  1. Hashes are meant to look at the content of nodes. UUIDs should never be part of the hash, correct? I guess we need to introduce a hash for computers.
  2. Why was the computer added to the hash in the first place? I imagine it was introduced for Codes (?), where a hash of the computer should be part of the hash of the code. CalcJob nodes also have a Node.computer, but for them this information does not need to be included in the hash, since it is already covered by the hash of the code in the inputs. Can we move the computer hash to the Node subclasses for which it is intended?
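
To make the two questions concrete, here is a minimal sketch of what I have in mind. It does not use aiida-core at all: ToyComputer, ToyNode, ToyCode, ToyCalcJob, toy_hash and content_hash are hypothetical names, the SHA-256-over-JSON hash is only a stand-in for the real hashing routine, and which fields make up the "content" of a computer is my assumption.

```python
import hashlib
import json
import uuid


def toy_hash(obj):
    """Stand-in content hash: SHA-256 over canonical JSON (not AiiDA's hashing)."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


class ToyComputer:
    def __init__(self, hostname, scheduler_type, transport_type):
        self.uuid = str(uuid.uuid4())  # identity only, never hashed
        self.hostname = hostname
        self.scheduler_type = scheduler_type
        self.transport_type = transport_type

    def content_hash(self):
        # Assumption: these three fields define the "content" of a computer.
        return toy_hash({
            "hostname": self.hostname,
            "scheduler_type": self.scheduler_type,
            "transport_type": self.transport_type,
        })


class ToyNode:
    def __init__(self, attributes, computer=None):
        self.attributes = attributes
        self.computer = computer

    def objects_to_hash(self):
        # The base class hashes content only; the computer is left out here.
        return [{"attributes": self.attributes}]

    def get_hash(self):
        return toy_hash(self.objects_to_hash())


class ToyCode(ToyNode):
    def objects_to_hash(self):
        # Only the code opts in to the computer, via its content hash.
        objects = super().objects_to_hash()
        if self.computer is not None:
            objects.append({"computer": self.computer.content_hash()})
        return objects


class ToyCalcJob(ToyNode):
    def __init__(self, attributes, inputs, computer=None):
        super().__init__(attributes, computer)
        self.inputs = inputs

    def objects_to_hash(self):
        # The computer enters only indirectly, through the hash of the input code.
        objects = super().objects_to_hash()
        objects.append({"inputs": {k: n.get_hash() for k, n in sorted(self.inputs.items())}})
        return objects


# Two computers with identical configuration but different UUIDs, plus a different one.
comp_a = ToyComputer("localhost", "direct", "local")
comp_b = ToyComputer("localhost", "direct", "local")
comp_c = ToyComputer("cluster", "slurm", "ssh")

calc_a = ToyCalcJob({"withmpi": False}, {"code": ToyCode({"path": "/bin/x"}, comp_a)}, comp_a)
calc_b = ToyCalcJob({"withmpi": False}, {"code": ToyCode({"path": "/bin/x"}, comp_b)}, comp_b)
calc_c = ToyCalcJob({"withmpi": False}, {"code": ToyCode({"path": "/bin/x"}, comp_c)}, comp_c)
```

In this sketch calc_a and calc_b get the same hash even though their computers have different UUIDs, while calc_c differs because its code points to a differently configured computer.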

The readability of the output provided by Node._get_objects_to_hash currently varies a lot:

['1.1.1',
 {'version': {'core': '1.1.1', 'plugin': '1.1.1'},
  'withmpi': False,
  'resources': {'num_machines': 1, 'tot_num_mpiprocs': 1, 'num_mpiprocs_per_machine': 1},
  'append_text': '',
  'parser_name': 'zeopp.network',
  'prepend_text': '',
  'scheduler_stderr': '_scheduler-stderr.txt',
  'scheduler_stdout': '_scheduler-stdout.txt',
  'mpirun_extra_params': [],
  'environment_variables': {},
  'import_sys_environment': True,
  'custom_scheduler_commands': ''
 },
 '57473b64-d400-4cb3-b99b-4735c0cd5a36',
 {'code': '4cbbb85106e3352e04b4aaf1a4896e962cd1b17f857728a1ba801f6aabc1d394',
  'parameters': '5cdad2c1da524984832893d7436e695f3326f9d51db679aef4c845227a3b620c',
  'structure': '364e7867f49e1bf6279b4c94832c965051a1cf8022ff4860df9e158a45165667'
 }
]

While some items are nicely documented using dictionaries, others (the version + the computer uuid) are just dumped in "bare", making it difficult to understand where they came from.

Could we agree to move to a practice where all items in this list are wrapped in dictionaries that provide some context?
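
As a rough illustration of the wrapping I mean (label_objects and the key names 'version', 'attributes', 'computer_uuid' and 'inputs' are hypothetical, not existing aiida-core names), each item becomes a single-key dictionary that says where it came from:

```python
def label_objects(version, attributes, computer_uuid, input_hashes):
    """Wrap each item in a single-key dict naming its origin (hypothetical layout)."""
    return [
        {"version": version},
        {"attributes": attributes},
        {"computer_uuid": computer_uuid},
        {"inputs": input_hashes},
    ]


# Values taken from the output shown in this issue (attributes shortened).
labelled = label_objects(
    version="1.1.1",
    attributes={"withmpi": False, "parser_name": "zeopp.network"},
    computer_uuid="57473b64-d400-4cb3-b99b-4735c0cd5a36",
    input_hashes={
        "code": "4cbbb85106e3352e04b4aaf1a4896e962cd1b17f857728a1ba801f6aabc1d394",
        "parameters": "5cdad2c1da524984832893d7436e695f3326f9d51db679aef4c845227a3b620c",
        "structure": "364e7867f49e1bf6279b4c94832c965051a1cf8022ff4860df9e158a45165667",
    },
)

for item in labelled:
    print(item)
```

Note that changing the layout of the list like this would change every resulting hash, so existing hashes would presumably need to be recomputed.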

mentioning @greschd and @sphuber for comment

ltalirz, Mar 29 '20 01:03