Implement SQL AST comparison metric

Open devin-ai-integration[bot] opened this issue 9 months ago • 0 comments

Pull Request Description

Summary

This pull request adds SQLASTSimilarity, a new metric for the continuous-eval repository. It scores how similar two SQL queries are by comparing their Abstract Syntax Trees (ASTs), which are parsed and diffed with the sqlglot library.

Changes

  • Added the SQLASTSimilarity class to the code_deterministic_metrics.py file.
  • Imported the diff and parse_one functions from the sqlglot library.
  • Imported the Keep class from the sqlglot.diff module.
  • Implemented the __call__ method in the SQLASTSimilarity class, which parses the SQL queries into ASTs and calculates similarity scores.
  • Implemented the _calculate_similarity method in the SQLASTSimilarity class, which scores two ASTs in three steps: it uses the diff function to get the differences between the trees, counts the total changes, and counts the total number of nodes in both trees. The similarity score is 1 - (total_changes / total_nodes).

Testing

  • Created a new test file, test_code_deterministic_metrics.py, with unit tests for the SQLASTSimilarity class.
  • Added test methods to validate the functionality of the SQLASTSimilarity class, including tests for exact match, different queries, similar queries, and invalid queries.
  • Ran the tests with pytest; all tests passed.

Link to Devin run

https://preview.devin.ai/devin/696032ba45654233968d6a04f2bc5df3

Request for Review

Please review the changes and provide feedback. If everything looks good, kindly approve the pull request for merging.

Thank you!