schema_salad
Line numbers
I made changes to python_codegen.py and python_codegen_support.py, and introduced a test file, test_line_numbers.py, that integrates with the test suite.
I identified several blockers within the current code preventing line numbers from being associated with keys during the saving process.
During the loading process, the CWL document is read in and stored as a CommentedMap, which has associated line numbers. However, in the _document_load method in python_codegen_support.py, the CommentedMap was replaced with a plain dictionary:
doc = {
    k: v
    for k, v in doc.items()
    if k not in ("$namespaces", "$schemas", "$base")
}
I replaced this code with
if "$namespaces" in doc:
    doc.pop("$namespaces")
if "$schemas" in doc:
    doc.pop("$schemas")
if "$base" in doc:
    doc.pop("$base")
to keep doc in CommentedMap form.
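As a rough illustration of why this matters (not the code in this PR): a dict comprehension always builds a plain dict, so anything attached to the original mapping object is dropped, while removing keys in place keeps the object and its line info. The sketch below uses a hypothetical LineTrackedDict class as a stand-in for CommentedMap so that it runs without ruamel.yaml; the function names and the lc layout are assumptions made for the example.

```python
class LineTrackedDict(dict):
    """Hypothetical stand-in for CommentedMap: a dict that carries line info."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lc = {}  # key -> (line, col); illustrative, not ruamel.yaml's real layout


RESERVED = ("$namespaces", "$schemas", "$base")


def strip_with_comprehension(doc):
    # Rebuilding creates a brand-new plain dict, so doc.lc is lost.
    return {k: v for k, v in doc.items() if k not in RESERVED}


def strip_in_place(doc):
    # Removing keys from the same object keeps its type and its line info.
    for key in RESERVED:
        doc.pop(key, None)
    return doc


doc = LineTrackedDict({"$base": "file:///tmp/", "class": "CommandLineTool"})
doc.lc["class"] = (1, 0)

rebuilt = strip_with_comprehension(doc)  # plain dict, no .lc attribute
kept = strip_in_place(doc)               # still a LineTrackedDict with .lc
```

Passing a default to pop, as in doc.pop(key, None), is equivalent to the three membership checks and avoids a KeyError when a key is absent.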
Additionally, I noticed that in the fromDoc method, doc was being set to None or overridden with something else, so I saved the originally passed-in doc as self._doc, following the naming conventions.
I wanted to use the lc info from the original YAML, so I modified the save method of each class to take in line_numbers, a CommentedMap. If line_numbers is not None, it replaces the self._doc field. This preserves the original CommentedMap and propagates it down to nested objects.
python_codegen_support.py
I added several methods.
I added a method that extracts the maximum line number (plus 1) from a CommentedMap. It repeatedly descends into the child with the highest line number until it reaches the end. The result is used to insert line and column info for new fields in the returned doc.
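A minimal sketch of the idea, assuming a hypothetical LineMap class in place of CommentedMap so that it runs without ruamel.yaml. Unlike the method described above, which follows only the child with the highest line number, this version scans every child; for a well-formed document both should arrive at the same line.

```python
class LineMap(dict):
    """Hypothetical stand-in for a CommentedMap: records the line of each key."""

    def __init__(self, items, key_lines):
        super().__init__(items)
        self.key_lines = key_lines  # key -> 0-based line number of that key


def max_line_plus_one(node):
    """Return one past the highest line used anywhere under node."""
    highest = -1
    if isinstance(node, LineMap):
        for key, value in node.items():
            highest = max(highest, node.key_lines[key], max_line_plus_one(value) - 1)
    elif isinstance(node, list):
        for item in node:
            highest = max(highest, max_line_plus_one(item) - 1)
    # Scalars carry no line info in this sketch, so they contribute nothing.
    return highest + 1


doc = LineMap(
    {
        "class": "CommandLineTool",
        "inputs": LineMap({"message": "string"}, {"message": 2}),
        "outputs": [],
    },
    {"class": 0, "inputs": 1, "outputs": 3},
)
next_free_line = max_line_plus_one(doc)  # first line after the original content
```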
I added a method that adds the kv lc info to the returned doc. This is the real meat of the change. It takes a CommentedMap to insert into, the old CommentedMap, a dictionary of line numbers, a dictionary mapping each line number to the maximum column used on that line, and a max_len variable. First, the method checks whether the key is in the line numbers; if so, it inserts the old lc info directly into the new CommentedMap. If the key isn't in the line numbers, it checks whether the value is, and inserts the pair on that line with an adjusted column number (based on the length of the key and the maximum column for that line). It then checks whether the value is in the old_doc, and inserts with that lc information. Finally, if neither the key nor the value is in the line numbers, it inserts the pair at max_len and increases max_len by 1. It has appropriate logic for DSL expansion:
elif isinstance(val, str):  # Logic for DSL expansion with "?"
    if val + "?" in line_numbers:
        line = line_numbers[val + "?"]["line"] + shift
        if line in inserted_line_info:
            line = max_line
        col = line_numbers[val + "?"]["col"]
        new_doc.lc.add_kv_line_col(key, [line, col, line, col + len(key) + 2])
        inserted_line_info[line] = col + len(key) + 2
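The decision order described above can be sketched roughly as follows. This is a simplified illustration, not the method in this PR: choose_position is a hypothetical name, plain dicts stand in for the CommentedMap arguments, the column adjustment formula is an assumption, and the old_doc lookup, the shift handling, and the DSL "?" case are left out.

```python
def choose_position(key, val, line_numbers, cols, max_len):
    """Return ((line, col), new_max_len) for one key/value pair.

    line_numbers: name -> {"line": int, "col": int} from the original document
    cols: line -> highest column already used on that line (updated in place)
    max_len: first line past the end of the original document
    """
    if key in line_numbers:
        # Case 1: the key existed in the original document, so reuse its position.
        info = line_numbers[key]
        return (info["line"], info["col"]), max_len
    if isinstance(val, str) and val in line_numbers:
        # Case 2: only the value existed; put the pair on that line, past the
        # last column already used there (assumed formula).
        line = line_numbers[val]["line"]
        col = cols.get(line, 0) + len(key) + 2
        cols[line] = col
        return (line, col), max_len
    # Case 3: neither is known, so append on the first free line.
    return (max_len, 0), max_len + 1


line_numbers = {"class": {"line": 0, "col": 0}, "File": {"line": 3, "col": 10}}
cols = {3: 14}

known_key = choose_position("class", "CommandLineTool", line_numbers, cols, 5)
known_val = choose_position("type", "File", line_numbers, cols, 5)
brand_new = choose_position("id", "tool", line_numbers, cols, 5)
```

Returning the updated max_len alongside the position lets the caller thread it through successive insertions, in the same way the description says max_len is increased after each appended pair.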
I added a method that pulls out the lc info for all kv pairs in a CommentedMap. For example, given a CommentedMap like ordereddict([("key", "value")]) with lc info {"key": [1, 0, 1, 6]}, it would return {"key": {"line": 1, "col": 0}, "value": {"line": 1, "col": 6}}.
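A minimal sketch of that extraction, using plain dicts rather than a real CommentedMap. It assumes the lc info for each key is a four-element list of [key line, key column, value line, value column], matching the example above; extract_line_numbers is a hypothetical name, and only scalar values are recorded.

```python
def extract_line_numbers(lc_data, doc):
    """Map each key and each scalar value to its line and column.

    lc_data: key -> [key_line, key_col, value_line, value_col]
    doc: the mapping those positions belong to
    """
    result = {}
    for key, (key_line, key_col, val_line, val_col) in lc_data.items():
        result[key] = {"line": key_line, "col": key_col}
        value = doc.get(key)
        # Nested maps and lists are not hashable, so only scalars become entries.
        if isinstance(value, (str, int, float)):
            result[value] = {"line": val_line, "col": val_col}
    return result


doc = {"key": "value"}
lc_data = {"key": [1, 0, 1, 6]}
positions = extract_line_numbers(lc_data, doc)
```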
I also modified the save method. It now returns a CommentedSeq/CommentedMap instead of a list/dict, takes in a doc field, and, if the k/v pair is in the doc, adds the lc info to the returned value.
I added a method, iterate_through_doc, that takes a list of keys and walks through the global doc to the appropriate place. It has no type check, since the walk goes from CommentedMap to CommentedSeq and back before eventually ending up at a CommentedMap (or None).
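The traversal can be sketched like this, with plain dicts and lists standing in for CommentedMap and CommentedSeq. The function name walk and the choice to return None on any broken path are assumptions for the illustration, not the behaviour of iterate_through_doc itself.

```python
def walk(doc, keys):
    """Follow keys through nested mappings and sequences; None if the path breaks."""
    current = doc
    for key in keys:
        if isinstance(current, dict):
            if key not in current:
                return None
            current = current[key]
        elif isinstance(current, list):
            # Sequence steps need a valid non-negative integer index.
            if not isinstance(key, int) or not 0 <= key < len(current):
                return None
            current = current[key]
        else:
            # Reached a scalar with keys still left to follow.
            return None
    return current


doc = {"steps": [{"run": {"class": "CommandLineTool"}}]}

found = walk(doc, ["steps", 0, "run"])
missing_index = walk(doc, ["steps", 3])
past_scalar = walk(doc, ["steps", 0, "run", "class", "extra"])
```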
python_codegen.py
I modified several things in python_codegen.py.
First, I modified the fromDoc method to save the original doc on the class as the self._doc attribute.
I modified the save method. I changed the type of the return value r from dict to CommentedMap. I added code to override self._doc, calculate max_len and line_numbers, and create an empty dictionary to store col info. I also update max_len after inserting each class attribute into r by calling add_kv, which also adds the lc value to r.
To prevent issues such as the outputs key appearing before the inputs key and overexpanding, which would make the line numbers inconsistent, I first iterate through all keys in the line number doc and add their line numbers, and then go through all attributes as normal.
if isinstance(key, str):
    if hasattr(self, key):
        if getattr(self, key) is not None:
            # add lc info
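The two-pass ordering described above can be illustrated with a short sketch. The function name ordered_keys and the plain lists are hypothetical; in the PR the first pass iterates over the keys of the line number doc and the second over the class attributes.

```python
def ordered_keys(original_keys, attribute_names):
    """Keys in original-document order first, then any remaining attributes."""
    seen = set()
    order = []
    for key in list(original_keys) + list(attribute_names):
        # Skip non-string keys, keys that are not attributes, and repeats.
        if isinstance(key, str) and key in attribute_names and key not in seen:
            seen.add(key)
            order.append(key)
    return order


# The original document lists outputs before inputs.
original_keys = ["class", "outputs", "inputs"]
attribute_names = ["id", "class", "inputs", "outputs"]

save_order = ordered_keys(original_keys, attribute_names)
```

Processing keys in the order they appeared in the source document means each key receives its original line before any attribute absent from the source is appended.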
Additionally, array expansion and DSL expansion sometimes shift content down. To make sure everything ends up on the right line, I added a shift counter that records how many lines to shift a value down.
test_line_numbers
I added 3 tests.
- One test checks the outputs field being before the inputs field.
- One test checks secondary files DSL expansion.
- One test checks type DSL expansion.
Thank you @acoleman2000 for this! Can you run make cleanup?
To re-create "metaschema.py" do
schema-salad-tool --codegen=python schema_salad/metaschema/metaschema.yml > schema_salad/metaschema.py
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (138e249) 83.68% compared to head (be53207) 83.63%.
:exclamation: Current head be53207 differs from pull request most recent head 3afd4b0. Consider uploading reports for the commit 3afd4b0 to get more accurate results
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #647      +/-   ##
==========================================
- Coverage   83.68%   83.63%   -0.06%
==========================================
  Files          22       22
  Lines        4580     4497      -83
  Branches     1239     1242       +3
==========================================
- Hits         3833     3761      -72
+ Misses        483      470      -13
- Partials      264      266       +2