sgqlc
Operation codegen does not find variables with DataHub graphql code
🐜 Bug Report
I get the following errors related to GraphQL variables:
(datahub) georvic@georvic-IdeaPad-3-15ITL6:~/repos/datahub-metadata$ sgqlc-codegen operation my_schema ./operations/search.py /home/georvic/repos/datahub/datahub-web-react/src/graphql/search.graphql
no variable named 'includeAssertions' at /home/georvic/repos/datahub/datahub-web-react/src/graphql/search.graphql:831:64
and
(datahub) georvic@georvic-IdeaPad-3-15ITL6:~/repos/datahub-metadata$ sgqlc-codegen operation my_schema ./operations/lineage.py /home/georvic/repos/datahub/datahub-web-react/src/graphql/lineage.graphql
no variable named 'separateSiblings' at /home/georvic/repos/datahub/datahub-web-react/src/graphql/lineage.graphql:246:93
Here are the links to the files in question:
- https://github.com/datahub-project/datahub/blob/42260fc5d806d57f80c6d9e47c7653b589b3ede9/datahub-web-react/src/graphql/lineage.graphql#L246
- https://github.com/datahub-project/datahub/blob/42260fc5d806d57f80c6d9e47c7653b589b3ede9/datahub-web-react/src/graphql/search.graphql#L837
Expected Behavior
The Python scripts with operations should be generated without these errors.
Current Behavior
The two errors above appear and the two output Python scripts remain empty.
Possible Solution
I'll try to debug this further when I have time.
Edit:
The issue went away after moving the queries to the top, before the fragments.
Steps to Reproduce
- pip install sgqlc==16.2 graphql-core==3.2.3 # with Python 3.7.16
- git clone https://github.com/datahub-project/datahub.git
- sgqlc-codegen operation my_schema ./search.py ./datahub/datahub-web-react/src/graphql/search.graphql
- sgqlc-codegen operation my_schema ./lineage.py ./datahub/datahub-web-react/src/graphql/lineage.graphql
Context (Environment)
I'm trying to auto-generate a Python client to access DataHub's GraphQL endpoints. In the past, it's been difficult to do this without implementing my own complicated Python packages. I believe that sgqlc has the potential to simplify many of the issues I've faced with DataHub in this regard.
I gave more context on this in the official DataHub Slack account: https://datahubspace.slack.com/archives/C02QMLWJG12/p1687255354060359
So far, using sgqlc seems the most promising approach!
These two errors, and another one that was fixed by this PR, are the only problems I've encountered so far. All the other operation scripts have been generated successfully. I could rewrite the GraphQL code used by DataHub in my own package, but it would be nice if the library itself could handle it (because it's valid code in any case).
In the case of lineage.graphql, this is the lineageFields fragment where the error occurs:
- https://github.com/datahub-project/datahub/blob/42260fc5d806d57f80c6d9e47c7653b589b3ede9/datahub-web-react/src/graphql/lineage.graphql#L233C10-L233C23
Another fragment, fullLineageResults, uses lineageFields:
- https://github.com/datahub-project/datahub/blob/42260fc5d806d57f80c6d9e47c7653b589b3ede9/datahub-web-react/src/graphql/lineage.graphql#L293
and the query called getEntityLineage calls the fullLineageResults fragment:
- https://github.com/datahub-project/datahub/blob/42260fc5d806d57f80c6d9e47c7653b589b3ede9/datahub-web-react/src/graphql/lineage.graphql#L337
Added a traceback here:
act/src/graphql/lineage.graphql
File "/home/georvic/anaconda3/envs/datahub/bin/sgqlc-codegen", line 8, in <module>
sys.exit(main())
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/__init__.py", line 136, in main
args.func(args)
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/operation.py", line 1018, in handle_command
gen.write()
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/operation.py", line 881, in write
self.write_operations()
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/operation.py", line 906, in write_operations
self.write_operation(source)
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/operation.py", line 916, in write_operation
for kind, name, code in visit(gql, visitor):
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/graphql/language/visitor.py", line 261, in visit
result = visit_fn(node, key, parent, path, ancestors)
File "/home/georvic/anaconda3/envs/datahub/lib/python3.7/site-packages/sgqlc/codegen/operation.py", line 348, in leave_variable
traceback.print_stack()
no variable named 'separateSiblings' at /home/georvic/repos/datahub/datahub-web-react/src/graphql/lineage.graphql:246:93
Oh, I noticed that by moving the queries to the top the issue went away.
Is this expected behavior in sgqlc?