Ready for testing 🧪 Multi-policy training support

Open Ivan-267 opened this issue 1 year ago • 1 comment

Apr 01 '24 17:04 Ivan-267

I've modified the ONNX inference to handle different action spaces, and to handle models exported with RLlib differently from SB3-exported models, using the following process:

1) Check the output size of the ONNX model.
2) IF all actions are continuous:
      IF the output size of the ONNX model matches the total continuous action size set in Godot: read all action values.
      ELSE: skip the action logstd values.
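
For illustration, step 2 could be sketched roughly as below. This is not the plugin's actual code: `read_continuous_actions` is a hypothetical helper, and it assumes Godot 4 GDScript, a flat `Array` of floats as model output, and that any logstd values come after the action values.

```gdscript
# Hypothetical helper, not the plugin's actual implementation.
# Assumption: model_output is a flat Array of floats, and when logstd values
# are present they follow the action values.
func read_continuous_actions(model_output: Array, continuous_size: int) -> Array:
	if model_output.size() == continuous_size:
		# Output size matches the continuous action size: read all action values.
		return model_output.duplicate()
	# Otherwise keep only the leading action values and skip the logstd values.
	# Godot 4: the end index of Array.slice() is exclusive.
	return model_output.slice(0, continuous_size)
```

With `continuous_size = 2`, an output of 2 values would be returned unchanged, while an output of 4 values would be cut down to its first 2.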

More testing is still needed, as I might have overlooked something or introduced a regression. Otherwise, this should finally bring full support for multi-discrete, continuous, and hybrid actions for models trained with RLlib, while keeping continuous-only action support for SB3 for now.

Edit: I briefly tested this in the JumperHard env with only continuous actions, only discrete actions, and a mixed case as follows:

func get_action_space():
	return {
		"jump": {"size": 2, "action_type": "discrete"},
		"move": {"size": 1, "action_type": "continuous"},
		"turn": {"size": 3, "action_type": "discrete"}
	}
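
As a quick sanity check on an action space like the one above, the sizes can be summed per action type. The helper below is a hypothetical sketch (Godot 4 GDScript assumed), not part of the plugin; it only relies on the `size` and `action_type` keys shown in the snippet.

```gdscript
# Hypothetical helper: sums the "size" entries of an action space per action type.
# Assumes every entry has "size" and "action_type" keys, with action_type being
# either "discrete" or "continuous".
func total_size_by_type(action_space: Dictionary) -> Dictionary:
	var totals := {"discrete": 0, "continuous": 0}
	for action_name in action_space:
		var action: Dictionary = action_space[action_name]
		totals[action["action_type"]] += action["size"]
	return totals
```

For the mixed case above, `total_size_by_type(get_action_space())` gives a discrete total of 5 (2 for jump plus 3 for turn) and a continuous total of 1 (move).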

Apr 05 '24 18:04 Ivan-267