graphql-ruby
graphql-ruby copied to clipboard
String literals containing `\u000A` should fail to parse
Describe the bug
At present, the GraphQL Ruby language parser does not handle inline string literals according to spec, because it permits \u0004 (newlines) within single-quoted strings. This is a significant disagreement when aligning with other GraphQL tooling that parses source documents according to spec.
From the spec:
StringCharacter ::
- SourceCharacter but not
"or\orLineTerminator\uEscapedUnicode\EscapedCharacterLineTerminator
- New Line (U+000A)
- Carriage Return (U+000D)New Line (U+000A)
- Carriage Return (U+000D)New Line (U+000A)
Unfortunately this is also a breaking change as sources that used to parse will now fail. Steps to resolution:
- Parameterize parser accordingly.
- Hook breaking behavior to a schema-level setting for spec-compliant behaviors. Should also include:
- https://github.com/rmosolgo/graphql-ruby/issues/4403
- https://github.com/rmosolgo/graphql-ruby/issues/4351
GraphQL schema
require "graphql"
class ExampleSchema < GraphQL::Schema
class Mutation < GraphQL::Schema::Object
field :send_email, String, null: false do
argument :message, String, required: true
end
def send_email(message:)
message
end
end
mutation Mutation
end
GraphQL query
This input uses a block string wrapped in """, and works as expected...
puts ExampleSchema.execute(
query: %|
mutation {
sendEmail(message: """
Hello,
World!
Yours,
GraphQL.
""")
}
|
).to_h.dig("data", "sendEmail")
However, this input uses a basic string wrapped in ", and should therefore fail to parse because it contains \u0004 characters...
puts ExampleSchema.execute(
query: %|
mutation {
sendEmail(message: "
Hello,
World!
Yours,
GraphQL.
")
}
|
).to_h.dig("data", "sendEmail")
Expected behavior
We should get a consistent parse error based on the GraphQL specification. It's bad for our ecosystem when GraphQL Ruby allows non-spec inputs that are not compatible with other spec implementations of GraphQL.
Actual behavior
GraphQL Ruby treats """ and " string delimiters interchangeably.
I've implemented the fix in #4834, but I still need to add some way to retain the current behavior.
👋 I'm exploring options for maintaining compatibility here and I want to make sure I pick an approach that works for you.
It's not as easy as passing a flag because parsing strings is implemented with Regexps (in Ragel or in Ruby):
https://github.com/rmosolgo/graphql-ruby/blob/33bc196023e471b602d97d7f82aa8f16b6159d40/graphql-c_parser/ext/graphql_c_parser_ext/lexer.rl#L47
https://github.com/rmosolgo/graphql-ruby/blob/33bc196023e471b602d97d7f82aa8f16b6159d40/lib/graphql/language/lexer.rb#L257-L259
Those Regexps are combined with other regexps (at build time with Ragel or at load time in Ruby) which get combined into lexer/parser routines. They're not easy to swap out at runtime:
https://github.com/rmosolgo/graphql-ruby/blob/33bc196023e471b602d97d7f82aa8f16b6159d40/graphql-c_parser/ext/graphql_c_parser_ext/lexer.rl#L276-L286
https://github.com/rmosolgo/graphql-ruby/blob/33bc196023e471b602d97d7f82aa8f16b6159d40/lib/graphql/language/lexer.rb#L63-L68
(It would be possible with the Ruby lexer, if there was a LEGACY_QUOTED_STRING regexp -- but not in C.)
So, I'm considering recommending a pre-processing step. (I haven't quite found a regexp which will work, but I'm getting close ...) Do you think that would work in your case, something like query_str = replace_single_quoted_newlines(query_str)?
(I got pretty deep into RegExps and ended up landing on a stringscanner based approach instead, in 7a356ec02)
Thanks @rmosolgo. This was a bigger problem for us when we had Rust running in front of Ruby and Ruby didn’t match the parser spec, so would end up with different parsing results. We’ve been moving in the direction of doing more with just Rust though. Regardless, glad to see this getting fixed in 3.0 to close the gap on standards.