
Update version of `oav` being consumed by `Model Validation` to `3.2.11`

Open scbedd opened this issue 2 years ago • 2 comments

Wherever this configuration is located, we need to update it to the version contained in this PR.

scbedd avatar Jun 27 '23 18:06 scbedd

Merged this PR prior to merging this check update. We want additionalProperties errors to pop.

scbedd avatar Jul 06 '23 20:07 scbedd

How to regression test oav

Context and setup

The current version of oav being used was 3.2.2. Before we can merge that PR, we need to ensure that the newer version of oav doesn't introduce new breaks. Please note this is all on my WSL install.

cd <oav-repo-root>
cd regression/azure-rest-api-specs
git pull
npm install -g oav@3.2.2

followed by running:

mkdir -p regression_specs_against_322
for i in $(find ./ -iname "*.json" -type f -exec bash -c '[[ ! "$0" =~ .*examples.* && "$0" =~ .*specification.* ]]' {} \; -print); \
   do newloc=${i////.};oav validate-spec $i > "./regression_specs_against_322/${newloc:2}.out" ; \
   done
mkdir -p regression_examples_against_322
for i in $(find ./ -iname "*.json" -type f -exec bash -c '[[ ! "$0" =~ .*examples.* && "$0" =~ .*specification.* ]]' {} \; -print); \
   do newloc=${i////.};oav validate-example $i > "./regression_examples_against_322/${newloc:2}.out" ; \
   done
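
The ${i////.} expansion rewrites every / in the spec path as a ., and ${newloc:2} trims the two leading characters left over from ./, so each spec gets a flat, unique output filename. A quick illustration using one of the storage specs referenced later in this comment:

i=./specification/storage/resource-manager/Microsoft.Storage/stable/2016-01-01/storage.json
newloc=${i////.}        # ..specification.storage.resource-manager.Microsoft.Storage.stable.2016-01-01.storage.json
echo "${newloc:2}.out"  # specification.storage.resource-manager.Microsoft.Storage.stable.2016-01-01.storage.json.out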

After letting that churn through, we need to install 3.2.11, run the same commands, and then compare the outputs.

npm install -g oav@3.2.11
mkdir -p regression_specs_against_3211
for i in $(find ./ -iname "*.json" -type f -exec bash -c '[[ ! "$0" =~ .*examples.* && "$0" =~ .*specification.* ]]' {} \; -print); \
   do newloc=${i////.};oav validate-spec $i > "./regression_specs_against_3211/${newloc:2}.out" ; \
   done
mkdir -p regression_examples_against_3211
for i in $(find ./ -iname "*.json" -type f -exec bash -c '[[ ! "$0" =~ .*examples.* && "$0" =~ .*specification.* ]]' {} \; -print); \
   do newloc=${i////.};oav validate-example $i > "./regression_examples_against_3211/${newloc:2}.out" ; \
   done
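
As a coarse first pass at the "compare the outputs" step, you can diff the two output trees directly; any file that differs points at a spec whose validation results changed between 3.2.2 and 3.2.11. A minimal sketch, assuming the per-file output is deterministic apart from genuine validation differences:

diff -rq regression_specs_against_322 regression_specs_against_3211
diff -rq regression_examples_against_322 regression_examples_against_3211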

Processing of output

I copy-pasted the output of the oav invocations from the CLI sessions that ran them and placed it in four files:

  • example_regression_output_322.txt
  • example_regression_output_3211.txt
  • spec_regression_output_322.yml
  • spec_regression_output_3211.yml

Other than that, we now have four folders generated alongside where we ran the commands:

  • regression_examples_against_322
  • regression_specs_against_322
  • regression_examples_against_3211
  • regression_specs_against_3211

The files present within each folder represent each individually parsed spec.

  • [x] Find out where output is printed by oav for both examples and specs
    • Looks like we need to examine the CLI text output for the spec run, and then look at the per-file output for the examples run. Not certain why it's different, but it is!

Processing examples output

For examples, we examine the per-file output within the regression_examples_against_* folders. For spec checks, we need to examine the actual CLI output.

import os
import sys
import re

cwd = os.getcwd()
full_path = os.path.join(cwd, sys.argv[1])
all_files = os.listdir(full_path)
outputs = []
codelines = {}
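# matches per-file output lines like: code: 'XMS_EXAMPLE_NOTFOUND_ERROR',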
line_regex = re.compile(r"^\s*code\:\s*\'([A-Za-z_-]+)\'\,\s*$", flags=re.MULTILINE)
destination_file = f"{sys.argv[1]}.csv"

# used to count instances of a key entry
def record_entry(new_value, dict):
   if new_value in dict.keys():
      dict[new_value] += 1
   else:
      dict[new_value] = 1

# processes an output file and provides the summary for correlation in a CSV file later.
# location present for debugging purposes :P 
def process_output_file(content, location) -> str:
   matches = [m.strip() for m in line_regex.findall(content)]
   result_dict = {}
   result_list = []

   for match in matches:
      record_entry(match, result_dict)

   for key in result_dict:
      result_list.append(f"{key}: {result_dict[key]}")

   return ",".join(result_list)

with open(destination_file, 'w', encoding='utf-8') as output:
   for file_output in all_files:
      location = os.path.join(full_path, file_output)
      result = '' # default result; stays empty unless errors are found below

      with open(location, 'r') as f:
         content = f.read()
         if "Validation completes without errors." in content:
            result = '' # no error detected for this file
         else:
            # An error file looks like this:

            # Error reported:
            # {
            #   code: 'XMS_EXAMPLE_NOTFOUND_ERROR',
            #   message: 'x-ms-example not found in StorageAccounts_CheckNameAvailability.',
            #   schemaUrl: './specification/storage/resource-manager/Microsoft.Storage/stable/2016-01-01/storage.json',
            #   exampleUrl: undefined,
            #   schemaPosition: { line: 22, column: 15 },
            #   schemaJsonPath: undefined,
            #   examplePosition: undefined,
            #   exampleJsonPath: undefined,
            #   severity: 0,
            #   source: 'global',
            #   operationId: 'StorageAccounts_CheckNameAvailability',
            #   level: '\x1B[31merror\x1B[39m'
            # }
            # {
            #   code: 'XMS_EXAMPLE_NOTFOUND_ERROR',
            #   message: 'x-ms-example not found in StorageAccounts_Create.',
            #   schemaUrl: './specification/storage/resource-manager/Microsoft.Storage/stable/2016-01-01/storage.json',
            #   exampleUrl: undefined,
            #   schemaPosition: { line: 56, column: 14 },
            #   schemaJsonPath: undefined,
            #   examplePosition: undefined,
            #   exampleJsonPath: undefined,
            #   severity: 0,
            #   source: 'global',
            #   operationId: 'StorageAccounts_Create',
            #   level: '\x1B[31merror\x1B[39m'
            # }
            # so we effectively need just to count the number of code: '<ERRORCODE>' to summarize the issues in this output
            # https://regex101.com/r/6eO4k8/1
            result = process_output_file(content, location)
      output.write(f"{file_output},\"{result}\"{os.linesep}")

Generate the summary with:

cd <oav root>/regression/azure-rest-api-specs
python path/to/summarizer.py regression_examples_against_3211
python path/to/summarizer.py regression_examples_against_322
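
Each row of the resulting CSV pairs one output file with a tally of the error codes found in it. For illustration only, a row for the storage spec shown in the comment block above might look like:

specification.storage.resource-manager.Microsoft.Storage.stable.2016-01-01.storage.json.out,"XMS_EXAMPLE_NOTFOUND_ERROR: 2"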

Here is an excellent summary of the types of example issues that no longer pop

Processing specifications output

To get the specification output, you need to grab the CLI output from the original command invocation. Unfortunately, oav is not set up so that a simple > redirect from validate-spec (like we do for validate-example) will work. For that one, we need to copy-paste the actual CLI output and process THAT as a file.
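
As an aside, if validate-spec happens to be writing these results to stderr, redirecting both streams might let you skip the copy-paste step; that is an untested assumption, and the copy-paste approach described above is what was actually used here:

oav validate-spec $i 2>&1 | tee -a spec_regression_output_322.yml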

This is what that looks like in a Python script. (Examine spec_regression_output_322.yml for the output copy-pasted from the CLI.)

import os
import sys
import re
from typing import List

cwd = os.getcwd()
full_path = os.path.join(cwd, sys.argv[1])

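# a new per-spec section in the copy-pasted CLI log begins with a "Semantically validating <path>" line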
def is_issue_start(line: str):
    return line.startswith("Semantically validating ")

def record_entry(new_value, dict):
    if new_value in dict.keys():
        dict[new_value] += 1
    else:
        dict[new_value] = 1

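# pulls the spec path out of the "Semantically validating" header line of one error chunk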
def get_name_from_error_output(content) -> str:
    line_regex = re.compile(r"Semantically validating ([\.\/\-\_a-zA-Z0-9]+)\s*", flags=re.MULTILINE)
    matches = [m.strip() for m in line_regex.findall(content)]

    return matches[0].replace("./", "")

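# tallies the bare "code: SOME_CODE" lines within one chunk and joins them as "CODE: count" pairs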
def process_error_output(content) -> str:
    line_regex = re.compile(r"^\s*code\:\s*([A-Za-z_-]+)\s*$", flags=re.MULTILINE)
    matches = [m.strip() for m in line_regex.findall(content)]
    result_dict = {}
    result_list = []

    for match in matches:
        record_entry(match, result_dict)

    for key in result_dict:
        result_list.append(f"{key}: {result_dict[key]}")

    return ",".join(result_list)

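# splits the copy-pasted CLI log into one chunk per "Semantically validating" section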
def get_error_chunks(location: str) -> List[str]:
    with open(location, 'r') as f:
        content = f.readlines()

    all_issues = []
    current_issue = []
    in_issue = False

    if content:
        line = content.pop(0)

        while line:
            if in_issue:
                if is_issue_start(line):
                    all_issues.append("".join(current_issue))
                    current_issue = [ line ]
                else:
                    current_issue.append(line)
            else:
                if is_issue_start(line):
                    in_issue = True
                    current_issue.append(line)

            if content:
                line = content.pop(0)
            else:
                line = None

        if current_issue:
            all_issues.append("".join(current_issue))

    return all_issues


errors = get_error_chunks(full_path)

with open('output.csv', 'w', encoding='utf-8') as f:
    for error in errors:
        error_detail = process_error_output(error)
        name = get_name_from_error_output(error)
        f.write(f"{name},\"{error_detail.strip()}\"")

Generate the summary with:

cd <oav root>/regression/azure-rest-api-specs
python /path/to/specification_output_summarizer.py spec_regression_output_3211.yml
# grab output
python /path/to/specification_output_summarizer.py spec_regression_output_322.yml
# grab output
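
Since the summarizer always writes to output.csv, "grab output" above just means saving that file off before the next run overwrites it. A hedged sketch of one way to do that and get a quick line-level diff between the two versions (the spec_summary_* names are mine, not part of the original workflow):

python /path/to/specification_output_summarizer.py spec_regression_output_3211.yml
mv output.csv spec_summary_3211.csv
python /path/to/specification_output_summarizer.py spec_regression_output_322.yml
mv output.csv spec_summary_322.csv
# rows that differ between the two oav versions
diff <(sort spec_summary_322.csv) <(sort spec_summary_3211.csv)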

Viewable Excel Diff

Visible here

scbedd avatar Jul 12 '23 21:07 scbedd

Rolled to Prod

scbedd avatar Jul 14 '23 18:07 scbedd