⚡️ Speed up function `extract_input_variables` by 162% in `src/backend/base/langflow/utils/payload.py`

codeflash-ai[bot] opened this issue 1 year ago • 2 comments

📄 extract_input_variables() in src/backend/base/langflow/utils/payload.py

📈 Performance improved by 162% (1.62x faster)

⏱️ Runtime went down from 582 microseconds to 222 microseconds
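
The two runtimes and the 162% figure agree if the percentage is read as (old - new) / new. That formula is an assumption here, since the report does not say how the number is computed, and `speedup_percent` is a hypothetical helper name. A minimal sketch of the arithmetic:

```python
def speedup_percent(old_runtime: float, new_runtime: float) -> float:
    """Relative speedup, expressed as a percentage of the new runtime.

    Assumed formula: (old - new) / new * 100. The report does not state it.
    """
    return (old_runtime - new_runtime) / new_runtime * 100


old_us, new_us = 582, 222  # microseconds, as reported above

print(round(speedup_percent(old_us, new_us)))  # 162
print(round(old_us / new_us, 2))               # 2.62
```

Under this reading, "1.62x faster" means the old runtime exceeds the new one by 1.62 times the new runtime, so the old runtime is about 2.62 times the new one.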

Explanation and details

To speed up extract_input_variables, the rewrite removes unnecessary operations and makes the code that extracts variables more efficient.

Changes made.

  1. Replaced contextlib.suppress(Exception) with narrower exception handling (try-except KeyError), which slightly reduces overhead. Exceptions other than KeyError are no longer suppressed.
  2. Precompiled the regular expression (prompt_extract = re.compile(r"\{(.*?)\}")) so the pattern is built once and reused on every loop iteration.
  3. Assigned template and template_type to local variables to avoid repeated nested dictionary lookups.

These optimizations help the program run faster, especially when handling a large list of nodes.
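
The three changes can be sketched as follows. This is an illustrative reconstruction, not the code from the pull request: the function body is inferred from the change list above and from the behavior the generated tests describe (prompt templates read `template`, few_shot templates read `prefix` then `suffix`, other types get an empty list, and nodes without `input_variables` are left alone).

```python
import re

# Change 2: compile the pattern once instead of passing the raw string
# to re.findall on every iteration.
prompt_extract = re.compile(r"\{(.*?)\}")


def extract_input_variables(nodes):
    """Store the {placeholders} found in each template under input_variables."""
    for node in nodes:
        # Change 1: catch only KeyError rather than suppressing every Exception.
        try:
            # Change 3: resolve the nested lookup once and keep it in a local.
            template = node["data"]["node"]["template"]
            if "input_variables" not in template:
                continue
            template_type = template["_type"]
            if template_type == "prompt":
                variables = prompt_extract.findall(template["template"]["value"])
            elif template_type == "few_shot":
                variables = prompt_extract.findall(
                    template["prefix"]["value"] + template["suffix"]["value"]
                )
            else:
                variables = []
            template["input_variables"]["value"] = variables
        except KeyError:
            # A node missing an expected key is left unchanged.
            continue
    return nodes


example = [{"data": {"node": {"template": {
    "_type": "prompt",
    "template": {"value": "Hello {name}, your age is {age}"},
    "input_variables": {},
}}}}]
result = extract_input_variables(example)
print(result[0]["data"]["node"]["template"]["input_variables"]["value"])
# ['name', 'age']
```

The function mutates the node dictionaries in place and returns the same list. Because only KeyError is caught, a malformed node that raises a different exception (for example, a template of None) would propagate the error in this sketch.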

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 12 Passed − 🌀 Generated Regression Tests

# imports
import contextlib
import re

import pytest  # used for our unit tests
from src.backend.base.langflow.utils.payload import extract_input_variables

# unit tests

def test_single_node_prompt_type_single_variable():
    nodes = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {"value": ["name"]}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_single_node_prompt_type_multiple_variables():
    nodes = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}, your age is {age}"}, "input_variables": {}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}, your age is {age}"}, "input_variables": {"value": ["name", "age"]}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_single_node_few_shot_type_single_variable():
    nodes = [{"data": {"node": {"template": {"_type": "few_shot", "prefix": {"value": "Dear {title}"}, "suffix": {"value": "Regards {name}"}, "input_variables": {}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "few_shot", "prefix": {"value": "Dear {title}"}, "suffix": {"value": "Regards {name}"}, "input_variables": {"value": ["title", "name"]}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_empty_nodes_list():
    nodes = []
    expected = []
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_node_without_input_variables_key():
    nodes = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_node_without_template_key():
    nodes = [{"data": {"node": {}}}]
    expected = [{"data": {"node": {}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_node_with_different_template_type():
    nodes = [{"data": {"node": {"template": {"_type": "unknown", "template": {"value": "Hello {name}"}, "input_variables": {}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "unknown", "template": {"value": "Hello {name}"}, "input_variables": {"value": []}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_node_with_empty_template_values():
    nodes = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": ""}, "input_variables": {}}}}}]
    expected = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": ""}, "input_variables": {"value": []}}}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_invalid_node_structure():
    nodes = [{"data": {"node": {"template": "invalid_structure"}}}]
    expected = [{"data": {"node": {"template": "invalid_structure"}}}]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_large_number_of_nodes():
    nodes = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {}}}}}] * 1000
    expected = [{"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {"value": ["name"]}}}}}] * 1000
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_mixed_template_types_in_nodes():
    nodes = [
        {"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {}}}}},
        {"data": {"node": {"template": {"_type": "few_shot", "prefix": {"value": "Dear {title}"}, "suffix": {"value": "Regards {name}"}, "input_variables": {}}}}}
    ]
    expected = [
        {"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {"value": ["name"]}}}}},
        {"data": {"node": {"template": {"_type": "few_shot", "prefix": {"value": "Dear {title}"}, "suffix": {"value": "Regards {name}"}, "input_variables": {"value": ["title", "name"]}}}}}
    ]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

def test_mixed_valid_and_invalid_nodes():
    nodes = [
        {"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {}}}}},
        {"data": {"node": {"template": "invalid_structure"}}}
    ]
    expected = [
        {"data": {"node": {"template": {"_type": "prompt", "template": {"value": "Hello {name}"}, "input_variables": {"value": ["name"]}}}}},
        {"data": {"node": {"template": "invalid_structure"}}}
    ]
    codeflash_output = extract_input_variables(nodes)
    # Outputs were verified to be equal to the original implementation

🔘 (none found) − ⏪ Replay Tests

codeflash-ai[bot], Aug 02 '24 12:08

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-3165.dmtpw4p5recq1.amplifyapp.com

Pull Request Validation Report

This comment is automatically generated by Conventional PR

Whitelist Report

Whitelist Active Result
Pull request is submitted by a bot and should be ignored
Pull request is a draft and should be ignored
Pull request is made by a whitelisted user and should be ignored
Pull request is submitted by administrators and should be ignored

Result

Pull request matches one or more enabled whitelist criteria. Pull request validation is skipped.

Last Modified at 02 Aug 24 12:58 UTC

github-actions[bot], Aug 02 '24 12:08

This PR has been automatically closed because the original PR #3216 by EvgenyK1 was closed.

codeflash-ai[bot], Aug 06 '24 14:08