langflow
langflow copied to clipboard
⚡️ Speed up function `find_names_in_code` by 7,942% in `src/backend/base/langflow/utils/validate.py`
📄 find_names_in_code() in src/backend/base/langflow/utils/validate.py
📈 Performance improved by 7,942% (79.42x faster)
⏱️ Runtime went down from 22.2 milliseconds to 277 microseconds
Explanation and details
Certainly! One way to optimize this function is by using a set for the names instead of a list, improving the average time complexity of checking if a name is present in the code. Additionally, since searching for substrings in a string is already quite efficient, the primary optimization focuses on reducing redundancy and overhead.
Here's the optimized version of the program.
Explanation.
- Conversion to Set: Convert
namesto a set (names_set) to ensure that each name is unique and also potentially speed up membership testing, even though we're generally iterating over the elements. - Early Return Optimization: By using a set
found_namesdirectly instead of a set comprehension, we can avoid the overhead of creating the set in one go and can exit early if needed.
This version should be more efficient in typical scenarios, though the actual run-time improvements will depend on the specific use case and data.
Correctness verification
The new optimized code was tested for correctness. The results are listed below.
🔘 (none found) − ⚙️ Existing Unit Tests
✅ 34 Passed − 🌀 Generated Regression Tests
(click to show generated tests)
# imports
import pytest # used for our unit tests
from src.backend.base.langflow.utils.validate import find_names_in_code
# unit tests
def test_single_name_present():
# Test when a single name is present in the code
codeflash_output = find_names_in_code("def foo(): pass", ["foo"])
codeflash_output = find_names_in_code("print(bar)", ["bar"])
# Outputs were verified to be equal to the original implementation
def test_multiple_names_present():
# Test when multiple names are present in the code
codeflash_output = find_names_in_code("def foo(): pass\nbar = 1", ["foo", "bar"])
codeflash_output = find_names_in_code("a = 1\nb = 2\nc = 3", ["a", "b", "c"])
# Outputs were verified to be equal to the original implementation
def test_no_names_present():
# Test when no names are present in the code
codeflash_output = find_names_in_code("def foo(): pass", ["bar"])
codeflash_output = find_names_in_code("print('hello')", ["foo", "bar"])
# Outputs were verified to be equal to the original implementation
def test_empty_code_string():
# Test when the code string is empty
codeflash_output = find_names_in_code("", ["foo", "bar"])
codeflash_output = find_names_in_code("", [])
# Outputs were verified to be equal to the original implementation
def test_empty_names_list():
# Test when the names list is empty
codeflash_output = find_names_in_code("def foo(): pass", [])
codeflash_output = find_names_in_code("print('hello')", [])
# Outputs were verified to be equal to the original implementation
def test_names_as_substrings():
# Test when names are substrings of the code
codeflash_output = find_names_in_code("foobar", ["foo"])
codeflash_output = find_names_in_code("foobarbaz", ["bar"])
# Outputs were verified to be equal to the original implementation
def test_case_sensitivity():
# Test case sensitivity
codeflash_output = find_names_in_code("def Foo(): pass", ["foo"])
codeflash_output = find_names_in_code("print(Bar)", ["bar"])
# Outputs were verified to be equal to the original implementation
def test_names_with_special_characters():
# Test names with special characters
codeflash_output = find_names_in_code("def foo_bar(): pass", ["foo_bar"])
codeflash_output = find_names_in_code("print(foo-bar)", ["foo-bar"])
# Outputs were verified to be equal to the original implementation
def test_large_code_string():
# Test with a large code string
large_code = "def foo(): pass\n" * 10000
codeflash_output = find_names_in_code(large_code, ["foo"])
large_code = "a = 1\n" * 10000
codeflash_output = find_names_in_code(large_code, ["a", "b", "c"])
# Outputs were verified to be equal to the original implementation
def test_large_names_list():
# Test with a large names list
codeflash_output = find_names_in_code("def foo(): pass", ["foo"] * 10000)
codeflash_output = find_names_in_code("print(bar)", ["bar"] * 10000)
# Outputs were verified to be equal to the original implementation
def test_both_large_code_and_names_list():
# Test with both large code and names list
large_code = "def foo(): pass\n" * 10000
codeflash_output = find_names_in_code(large_code, ["foo"] * 10000)
large_code = "a = 1\n" * 10000
codeflash_output = find_names_in_code(large_code, ["a", "b", "c"] * 10000)
# Outputs were verified to be equal to the original implementation
def test_names_with_spaces():
# Test names with spaces
codeflash_output = find_names_in_code("def foo bar(): pass", ["foo bar"])
codeflash_output = find_names_in_code("print(foo bar)", ["foo bar"])
# Outputs were verified to be equal to the original implementation
def test_names_with_numbers():
# Test names with numbers
codeflash_output = find_names_in_code("def foo123(): pass", ["foo123"])
codeflash_output = find_names_in_code("print(foo123)", ["foo123"])
# Outputs were verified to be equal to the original implementation
def test_names_with_mixed_characters():
# Test names with mixed characters
codeflash_output = find_names_in_code("def foo_bar123(): pass", ["foo_bar123"])
codeflash_output = find_names_in_code("print(foo-bar_123)", ["foo-bar_123"])
# Outputs were verified to be equal to the original implementation
def test_nested_names():
# Test nested names
codeflash_output = find_names_in_code("def foo():\n def bar(): pass", ["foo", "bar"])
codeflash_output = find_names_in_code("class Foo:\n def bar(self): pass", ["Foo", "bar"])
# Outputs were verified to be equal to the original implementation
def test_overlapping_names():
# Test overlapping names
codeflash_output = find_names_in_code("foobar", ["foo", "bar"])
codeflash_output = find_names_in_code("foobarbaz", ["foo", "bar", "baz"])
# Outputs were verified to be equal to the original implementation
def test_names_appearing_multiple_times():
# Test names appearing multiple times
codeflash_output = find_names_in_code("foo = 1\nfoo = 2\nfoo = 3", ["foo"])
codeflash_output = find_names_in_code("bar()\nbar()\nbar()", ["bar"])
# Outputs were verified to be equal to the original implementation
🔘 (none found) − ⏪ Replay Tests
Pull Request Validation Report
This comment is automatically generated by Conventional PR
Whitelist Report
| Whitelist | Active | Result |
|---|---|---|
| Pull request is submitted by a bot and should be ignored | ✅ | ✅ |
| Pull request is a draft and should be ignored | ✅ | ❌ |
| Pull request is made by a whitelisted user and should be ignored | ❌ | ❌ |
| Pull request is submitted by administrators and should be ignored | ❌ | ❌ |
Result
Pull request matches with one (or more) enabled whitelist criteria. Pull request validation is skipped.
Last Modified at 02 Aug 24 20:54 UTC
This pull request is automatically being deployed by Amplify Hosting (learn more).
Access this pull request here: https://pr-3179.dmtpw4p5recq1.amplifyapp.com
This PR has been automatically closed because the original PR #3216 by EvgenyK1 was closed.