godot icon indicating copy to clipboard operation
godot copied to clipboard

[Draft] Add support for nullable types in GDScript

Open joao-pedro-braz opened this issue 1 year ago • 61 comments

Nullable Types in GDScript

This PR aims to add a feature-set to GDScript tailored towards the safe usage of potentially-null expressions. It was initially based off https://github.com/godotengine/godot-proposals/issues/162 and through several discussions over RocketChat it grew to be what it is currently. Further feedback is wholeheartedly appreciated!

TODOs

  • [x] Make the analyzer ternary-aware
  • [x] Catch methods returning the nullable result of other methods
  • [X] ~Convert nullable errors to UNSAFE_* warnings~ (Won't consider for now)
  • [X] Rewrite the way nullables are handled in the VM
  • [X] Fix a off-by-two error when setting a OPCODE_JUMP_IF_NULL for nullable assignments
  • [X] Add the PROPERTY_USAGE_NULLABLE to the PropertyUsageFlags
  • [X] ~Merge https://github.com/godotengine/godot-cpp/pull/1110`~
  • [X] ~Re-enable "Build godot-cpp test extension" before merging~

What it is

This PR is:

  • The addition of the nullable qualifier, an optional type qualifier that indicates whether a given type is nullable (which can be interpreted as a union type of the type itself + null, like: String?, which would be a union type of String | Null). Usages of the nullable qualifier are:

    • During a class member declaration:
      @tool
      extends EditorScript
      
      var class_member: String? = null
      
      func _run():
          class_member = "123"
      
    • During a variable declaration:
      @tool
      extends EditorScript
      
      func _run():
          var nullable_variable: String? = "123"
      
    • During a function parameter declaration:
      func a_fun_function(nullable_parameter: String?):
          pass
      
    • During a function return declaration:
      func another_function() -> String?:
          return null
      
    • As the inner type of a Typed Array:
      func _run():
          var arr: Array[int?] = [1, null]
      
  • The inclusion of nullity safety guards on several expressions to ensure nullable types are handled in a safe manner. These safety guards were added to the following expressions:

    • Binary operators:
      func _run():
          var nullable_str: String? = null
          print(nullable_str + "123") # Unsafe, will cause a parse error (triggered by the analyzer)
          var nullable_int: int? = null
          print(nullable_int - 123) # Unsafe, will cause a parse error (triggered by the analyzer)
      
    • Subscripts:
      class A:
          var cool_value := "cool!"
      func _run():
          var nullable: A? = null
          print(nullable.cool_value) # Usafe, will cause a parse error (triggered by the analyzer)
      
    • Calls:
      class A:
          func cool_function():
              pass
      func _run():
          var nullable: A? = null
          print(nullable.cool_function()) # Usafe, will cause a parse error (triggered by the analyzer)
      
    • Assignments:
      func _run():
          var nullable_str: String? = null
          var non_nullable: String = nullable_str # Usafe, will cause a parse error (triggered by the analyzer)
      
    • Unary Operators:
      func _run():
          var nullable: int? = null
          print(-nullable) # Usafe, will cause a parse error (triggered by the analyzer)
      
    • For-in ranges:
      func _run():
          var nullable: int? = null
          for i in nullable: # Usafe, will cause a parse error (triggered by the analyzer)
              pass
      
    • Returns:
      func a_non_null_function() -> String:
          var nullable: String? = null
          return nullable # Usafe, will cause a parse error (triggered by the analyzer)
      func another() -> String:
          return nullable() # Also unsafe and will cause a parse error (triggered by the analyzer)
      func nullable() -> String?:
          return null
      
  • The implicit narrowing of nullable identifiers, which ensures that nullable identifiers can be safely handled (without safety errors) if properly checked beforehand. This currently applies to the following expressions:

    • If/Elif/Else statements:
      var foo: String? = null
      if foo:
          print(foo.begins_with("")) # Safe
      elif foo != null:
          print(foo.begins_with("")) # Safe
      elif foo == "123":
          print(foo.begins_with("")) # Safe
      elif is_same(foo, "123"):
          print(foo.begins_with("")) # Safe
      elif is_instance_of(foo, TYPE_STRING):
          print(foo.begins_with("")) # Safe
      else:
          print(foo.begins_with("")) # Not Safe
      
    • While statements:
      var foo: int? = null
      while foo:
          foo -= 1 # Safe
      
    • Match statements:
      var foo: String? = null
      match foo:
         "1", "12", "123":
             print(foo.begins_with("1")) # Safe
         null:
             print(foo.begins_with("1")) # Not Safe
         var bar:
             print(bar.begins_with("1")) # Not Safe
      
    • Ternaries:
      var foo: int? = null
      (foo if foo else 1) + 1 # Safe
      (foo if foo else null) + 1 # Not Safe
      
  • The addition of the nullable access operator (?. and ?[]), which ensures subscript accesses are safe even on null values by only accessing the right-hand attribute if the left-hand base is not null, otherwise by returning null:

    • Attribute access:
      func _run():
          var nullable_str: String? = null
          print(nullable_str?.begins_with("")) # This is safe, "begins_with" will only be called/accessed if "nullable_str" is not null 
      
    • Index access:
      func _run():
          var nullable_str: String? = null
          print(nullable_str?[0]) # This is safe, "nullable_str" will only be indexed if it's not null
      
  • The inclusion of the nullable coalescing operator (??), which allows a nullable expression to be coalesced into another expression, essentially allowing nullable expressions to have default values. Both operands must have the same type (but not necessarily the same nullity) and Resource Types (Objects) are also supported:

    class A:
        var cool_property := "123"
    func _run():
        var nullable_str: String? = null
        print((nullable_str ?? "123").begins_with("123")) # This is safe, calling "begins_with" is safe because the expression "(nullable_str ?? "123")" will always return a non-nullable value
        var nullable_instance: A? = null
        print(nullable_instance ?? A.new()) # Also safe
    

What it is not

This PR is not about making the core null-aware or even updating the API docs to reflect instances where null might be a valid argument/return. This PR is also not about making Reference Types (Like Objects) no longer implicitly null by default. Their behavior remains unchanged.

Next steps

Next steps for improvements included:

  • The addition of a Explicit Nullable Narrowing Syntax (ENNS for short), which would explicitly allow for a nullable identifier to be narrowed into a non-nullable one. Proposed syntax so far includes:
    • The if bar ?= foo syntax:
      var foo: int? = null
      if bar ?= foo:
          print(bar) # Bar will only be printed if foo is not null and it will be non-nullable (int)
      
    • The for bar in foo:
      var foo: int? = null
      for bar in foo:
          print(bar) # Bar will only be printed if foo is not null and it will be non-nullable (int)
      
  • ~Converting the nullable related errors to UNSAFE_* warnings as suggested by @dalexeev. (Will do next)~
  • Even though over 50 test cases were added, there is room for more, specifically those ones covering edge cases.

Closes godotengine/godot-proposals#162 Closes godotengine/godot-proposals#1321

joao-pedro-braz avatar May 08 '23 13:05 joao-pedro-braz

Codespell keeps complaining about: image but "Remaning" doesn't seem to even be present in the code base at all. Suggestions?

joao-pedro-braz avatar May 08 '23 13:05 joao-pedro-braz

Codespell keeps complaining about: image but "Remaning" doesn't seem to even be present in the code base at all. Suggestions?

Will be fixed by #76842 in a few minutes (rebasing this PR now on master should also fix it, but if you wait for that PR it's cleaner).

akien-mga avatar May 08 '23 13:05 akien-mga

I set the PR to draft, as discussed in the chat!

adamscott avatar May 08 '23 13:05 adamscott

This is amazing.

Could you clarify what's the result of a nullable access operator when the variable is null? Null, I guess, but just in case.

RandomShaper avatar May 08 '23 14:05 RandomShaper

This is amazing.

Could you clarify what's the result of a nullable access operator when the variable is null? Null, I guess, but just in case.

@RandomShaper Yep, null! Which allows it to be combined with the nullable coalesce operator for a basically infallible attribute access:

class B:
    var cool_property := "inner"
class A:
    var b: B?
func _run():
    var a: A? = null
    # Try to get "cool_property" or return a default value otherwise
    print(a?.b?.cool_property ?? "default")

Actually running it: image

joao-pedro-braz avatar May 08 '23 14:05 joao-pedro-braz

Absolutely amazing PR description like I said in the chat! 🔥🔥🔥

I haven't been able to follow the dev chat fully lately, so this might have already been discussed there. I do think it bears having that discussion in this more public way though, so here we go! :)

As I've said in the chat, I personally generally dislike the implicit narrowing that happens as a side-effect of using control flow statements like if and while. I find that to be a little too subtle for a language that is meant to be straightforward and user-friendly :) I'd much rather have explicit narrowing, as in the handling of union types in functional programming languages (which could help set the stage for union type handling in GDScript later down the line). So something like:

var x : String? = null
case x:
  null: # Return / do error handling / perform default behavior?
  String: # use x as if it was a non-nullable string

One-liner variants of this could also work, e.g. case x: # assume x is not null or casen x: # assume x is null. This approach has the advantage of keeping nullability statements and control flow statements clearly separated.

The if foo syntax feels particularly unsafe due to many non-nullable variants having implicit boolean conversions:

	var x : String? = ""
	
	if x: # Cannot distinguish between x == "" or x == null
		print("not empty")
	else:
		print("empty")
	
	var y : int? = 10
	while y: # Cannot distinguish from y == null or y == 0
		y = y - 1
		# might do something here that results in y == null
		print(y)

And then you have to have expression analysis be aware that the expression, depending on context, may contain implicit narrowing syntax. For example, what happens with the example below? Somewhere in the middle of it there's a lone y or y != null that needs to be parsed and understood as implicit narrowing syntax in the middle of what is otherwise a normal boolean expression.

	var y : int? = 10
	while y > 0 and y*x < x + 4 and y != null and other_weird_function(y):
		do_stuff_with_y_that_must_be_non_null(y)

Doesn't that result in code that's harder to maintain compared to having a statement that explicitly handles nullability (and nothing else like boolean/booleanable expressions)?

I'm worried that newer folks coming into Godot/programming for the first time are gonna get either bit or just very confused by these subtle side-effects of program control flow statements.

Maybe it's just me though! And if the consensus is that there are no issues with user friendliness and code-maintainability, then I'm happy :)

anvilfolk avatar May 08 '23 15:05 anvilfolk

Definitely a valid concern @anvilfolk!

I think there are two points I feel strongly of on the implicit narrowing subject:

  1. Other languages do it that way: Of course this by itself isn't a good enough reason for anything and I acknowledge that. Having said that... it goes to show how that isn't necessarily "confusion material" for newcomers. Languages like Typescript make use of implicit narrowing pretty much in the same way as this PR does: image

  2. At what point does it stop being our fault and become the user's? Kinda weird to say that, but so far the only examples on why the implicit narrowing might be confusing involve either functions with side effects or asynchronous functions with side effects. Both of which are prone to problems by itself (side effects = bad if not purposeful). Shouldn't we then rather focus on letting the user know their function have side-effects (By adding a UNSAFE_* warning as another PR, for instance) and therefore it might behave unexpectedly?

Just my two cents. Also safe to add explicit and implicit are complementary, we could have both if we agree on a particular syntax.

joao-pedro-braz avatar May 08 '23 16:05 joao-pedro-braz

@anvilfolk Forgot to address the if x: # Cannot distinguish between x == "" or x == null but I think that's a no brainer, if the user cares for all non-null values of x they could just do if x != null:.

joao-pedro-braz avatar May 08 '23 17:05 joao-pedro-braz

You need to sync GDExtension with these changes I believe: https://github.com/godotengine/godot/blob/cf8ad12b56df4ae7bba4c73070dd035693a880e4/core/extension/gdextension_interface.h#L137

AThousandShips avatar May 08 '23 18:05 AThousandShips

GDEXTENSION_VARIANT_OP_IN,

@AThousandShips I knew I was missing something, even asked on #gdextension, thank you!

joao-pedro-braz avatar May 08 '23 19:05 joao-pedro-braz

@AThousandShips Weird thing is, I think I did that already:

image

https://github.com/godotengine/godot/pull/76843/files#diff-29b903ebcf0b26a6b2fa3e617462e06b69b6d5201e1dda76095076949da8a0f8R139

joao-pedro-braz avatar May 08 '23 19:05 joao-pedro-braz

Oh sorry missed that, now that you mention it I did see it, then this must be due to lack of sync in godot-cpp, got myself confused looking at the main thread and forgot

AThousandShips avatar May 08 '23 19:05 AThousandShips

Where the PR would truly shine is with dictionaries. dict_1?.dict_2?.dict_3?.value

Unfortunately, dict_1.dict_2 triggers an error if the key dict_2 doesn't exist in dict_1. This is counter intuitive if we can't use the ?. operator on dictionaries. The operator should check if a key exist, otherwise return null without triggering any error.

adamscott avatar May 08 '23 19:05 adamscott

Oh sorry missed that, now that you mention it I did see it, then this must be due to lack of sync in godot-cpp, got myself confused looking at the main thread and forgot

So In that case I would need to have a PR adding the coalesce operator to the godot-cpp merged before merging this one?

joao-pedro-braz avatar May 08 '23 19:05 joao-pedro-braz

That's my confusion, it does seem a very strange situation as neither would work without the other, I might be missing something elsewhere, that's why I got confused as I assumed it must be possible to make changes that doesn't immediately work with godot-cpp

AThousandShips avatar May 08 '23 19:05 AThousandShips

Indeed... @akien-mga is this something you've encountered before?

joao-pedro-braz avatar May 08 '23 19:05 joao-pedro-braz

Looking at how the godot-cpp test is executed I think I really need to get the new operator merged upstream on godot-cpp before this PR can be merged:

      # Dump GDExtension interface and API
      - name: Dump GDExtension interface and API for godot-cpp build
        if: ${{ matrix.godot-cpp-test }}
        run: |
          ${{ matrix.bin }} --headless --dump-gdextension-interface --dump-extension-api
          cp -f gdextension_interface.h godot-cpp/gdextension/
          cp -f extension_api.json godot-cpp/gdextension/

joao-pedro-braz avatar May 08 '23 20:05 joao-pedro-braz

The PR description doesn't make it clear to me whether the behavior of Object types got changed (which would be a breaking compat change, but maybe one worth doing). Are Objects still implicitly nullable by default, or do you need to add Object? to their type hints as well?

RedMser avatar May 09 '23 07:05 RedMser

Looking at how the godot-cpp test is executed I think I really need to get the new operator merged upstream on godot-cpp before this PR can be merged:

      # Dump GDExtension interface and API
      - name: Dump GDExtension interface and API for godot-cpp build
        if: ${{ matrix.godot-cpp-test }}
        run: |
          ${{ matrix.bin }} --headless --dump-gdextension-interface --dump-extension-api
          cp -f gdextension_interface.h godot-cpp/gdextension/
          cp -f extension_api.json godot-cpp/gdextension/

Yes there's a circular dependency between the two. You'd have to first get a PR merged in godot-cpp with the extension interface and API from this godot PR, and then rebase this one.

For now since this is a draft, and likely to undergo significant discussion, review and changes, I would suggest commenting out the godot-cpp test so you can get CI passing. We should just remember to fix it before merging.

akien-mga avatar May 09 '23 08:05 akien-mga

Would it be possible, for confirmation, to redirect the godot-cpp test to point to a fork where the necessary updates are made? To be able to investigate and verify what needs to be updated there?

AThousandShips avatar May 09 '23 08:05 AThousandShips

The PR description doesn't make it clear to me whether the behavior of Object types got changed (which would be a breaking compat change, but maybe one worth doing). Are Objects still implicitly nullable by default, or do you need to add Object? to their type hints as well?

@RedMser Objects are still implicitly nullable by default. This was a topic of debate over RocketChat and the conclusion was essentially the same as yours, changing this behavior will break compatibility and that's undesirable. So using Object? is functionally the same as a plain Object, but it'll at least get you static analysis.

In the future we could introduce as "strict non-nullable qualifier Object!", which could be used to make Reference types (Like Objects) behave like builtins and primitives, but that's another discussion.

joao-pedro-braz avatar May 09 '23 08:05 joao-pedro-braz

A topic @dsnopek brought to my attention over #gdextension on RocketChat is that it might be necessary to add the new Coalesce Operator as a method of Variant itself. Could someone from Core chime in to confirm that?

joao-pedro-braz avatar May 09 '23 08:05 joao-pedro-braz

Looking at how the godot-cpp test is executed I think I really need to get the new operator merged upstream on godot-cpp before this PR can be merged:

      # Dump GDExtension interface and API
      - name: Dump GDExtension interface and API for godot-cpp build
        if: ${{ matrix.godot-cpp-test }}
        run: |
          ${{ matrix.bin }} --headless --dump-gdextension-interface --dump-extension-api
          cp -f gdextension_interface.h godot-cpp/gdextension/
          cp -f extension_api.json godot-cpp/gdextension/

Yes there's a circular dependency between the two. You'd have to first get a PR merged in godot-cpp with the extension interface and API from this godot PR, and then rebase this one.

For now since this is a draft, and likely to undergo significant discussion, review and changes, I would suggest commenting out the godot-cpp test so you can get CI passing. We should just remember to fix it before merging.

@akien-mga Alright, I'll comment it out and update the PR's description to ensure we remember to fix it!

joao-pedro-braz avatar May 09 '23 08:05 joao-pedro-braz

Where the PR would truly shine is with dictionaries. dict_1?.dict_2?.dict_3?.value

Unfortunately, dict_1.dict_2 triggers an error if the key dict_2 doesn't exist in dict_1. This is counter intuitive if we can't use the ?. operator on dictionaries. The operator should check if a key exist, otherwise return null without triggering any error.

That shouldn't be hard to implement and it seems (at least for me) the expected behavior. What do the others think?

joao-pedro-braz avatar May 09 '23 09:05 joao-pedro-braz

GHA is now passing!

joao-pedro-braz avatar May 09 '23 10:05 joao-pedro-braz

I am fully down to retract a lot of my concerns if this is just how nullable types work in other languages and everybody with experience with them seems fine with it, hehehe :)

I'd love some thoughts on the while y > 0 and y*x < x + 4 and y != null and other_weird_function(y): situation! Would implicit nullability stuff only work when while y != null, or does it already work with more complex expressions? What would happen with var x = y > 0 and y*x < x + 4 and y != null and other_weird_function(y), followed by if x: #something?

At this point it's mostly curiosity since not a ton of people seem to have issues with it :)

anvilfolk avatar May 09 '23 11:05 anvilfolk

In this case while y > 0 and y*x < x + 4 and y != null and other_weird_function(y): triggers a safety error because a binary operation with nullable operands is unsafe (i.e., y > 0)

joao-pedro-braz avatar May 09 '23 12:05 joao-pedro-braz

@anvilfolk For reference: image

joao-pedro-braz avatar May 09 '23 12:05 joao-pedro-braz

In this case while y > 0 and y*x < x + 4 and y != null and other_weird_function(y): triggers a safety error because a binary operation with nullable operands is unsafe (i.e., y > 0)

That makes a ton of sense! It looks like you can't really mix the booleanable expressions since any use of y will generate an error because you're not checking for nullability first. All my worries about this particular issue have flown awayyyyyy ✈️

Thank you for taking time to explain!

anvilfolk avatar May 09 '23 14:05 anvilfolk

Experimented with replacing the current nullable related errors for UNSAFE_* warnings and the results were worrying. Code like this shouldn't be allowed to compile IMO:

func test():
   for element in nullable_array(): # We might end up iterating over null
       print(element)

func nullable_array() -> Array?:
   return null

But by making them UNSAFE_* warnings the user can end up ignoring them, which will error at runtime.

joao-pedro-braz avatar May 09 '23 14:05 joao-pedro-braz