refactor: (codeflash) ⚡️ Speed up function `build_output_logs` by 79%
📄 build_output_logs in src/backend/base/langflow/schema/schema.py
✨ Performance Summary:
- Speed Increase: 📈 79% (0.79x faster)
- Runtime Reduction: ⏱️ From 5.84 milliseconds down to 3.26 milliseconds (best of 28 runs)
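The two figures above are consistent with each other if the percentage is taken relative to the optimized runtime. A quick check, assuming that convention (the helper name `speedup` is illustrative and not part of codeflash):

```python
def speedup(original_ms: float, optimized_ms: float) -> float:
    """Return the relative speedup as a fraction of the optimized runtime."""
    return (original_ms - optimized_ms) / optimized_ms


# Runtimes reported in the summary above.
ratio = speedup(5.84, 3.26)
print(f"{ratio:.2f}x faster ({ratio:.0%})")  # prints "0.79x faster (79%)"
```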
📝 Explanation and details
Here is the optimized version of your Python program.
Changes and improvements:
- Consolidated some conditions within the `get_message` and `get_type` functions for brevity and faster execution by reducing multiple checks.
- Minor conditional optimizations within the `DataFrame` class.
- Avoided unnecessary dictionary unpacking and update operations in `build_output_logs`.
- Some refactoring for consistency and readability.
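To illustrate the kind of change the dictionary-unpacking item describes, here is a minimal sketch. It is not the actual langflow code: `describe`, `build_logs_with_unpacking`, and `build_logs_direct` are hypothetical names. Both functions return the same mapping, but the second assigns each entry directly instead of building a temporary dict and merging it on every iteration.

```python
def describe(payload):
    """Hypothetical stand-in for the per-output message/type lookup."""
    kind = "text" if isinstance(payload, str) else "unknown"
    return {"message": payload, "type": kind}


def build_logs_with_unpacking(names, results):
    logs = {}
    for name in names:
        # Builds a temporary one-entry dict and copies `logs` on every pass.
        logs = {**logs, **{name: describe(results.get(name))}}
    return logs


def build_logs_direct(names, results):
    logs = {}
    for name in names:
        # Single in-place assignment; no intermediate dicts are created.
        logs[name] = describe(results.get(name))
    return logs
```

The unpacking variant copies the accumulated dict once per output, so its cost grows quadratically with the number of outputs, while the direct variant stays linear.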
✅ Correctness verification
The new optimized code was tested for correctness. The results are listed below:
| Test | Status | Details |
|---|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found | |
| 🌀 Generated Regression Tests | ✅ 13 Passed | See below |
| ⏪ Replay Tests | 🔘 None Found | |
| 🔎 Concolic Coverage Tests | 🔘 None Found | |
| 📊 Coverage | 84.6% | |
🌀 Generated Regression Tests Details
from collections.abc import Generator
from enum import Enum
from typing import cast
import pandas as pd
# imports
import pytest # used for our unit tests
from langflow.schema.data import Data
from langflow.schema.dataframe import DataFrame
from langflow.schema.message import Message
from langflow.schema.schema import build_output_logs
from langflow.schema.serialize import recursive_serialize_or_str
from pandas import DataFrame as pandas_DataFrame
from pydantic import BaseModel
from typing_extensions import TypedDict
# unit tests
# Mock classes to simulate inputs
class MockVertex:
    def __init__(self, outputs):
        self.outputs = outputs


class MockComponentInstance:
    def __init__(self, status=None, _results=None, _artifacts=None):
        self.status = status
        self._results = _results or {}
        self._artifacts = _artifacts or {}
# Basic Functionality
def test_single_output_string_payload():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    result = [MockComponentInstance(_results={"output1": "simple string"})]
    expected_output = {"output1": {"message": "simple string", "type": "text"}}
    codeflash_output = build_output_logs(vertex, result)

def test_multiple_outputs_simple_data_types():
    vertex = MockVertex(outputs=[{"name": "output1"}, {"name": "output2"}, {"name": "output3"}])
    result = [MockComponentInstance(_results={"output1": "string", "output2": {"key": "value"}, "output3": ["item1", "item2"]})]
    expected_output = {
        "output1": {"message": "string", "type": "text"},
        "output2": {"message": {"key": "value"}, "type": "object"},
        "output3": {"message": ["item1", "item2"], "type": "array"}
    }
    codeflash_output = build_output_logs(vertex, result)
# Handling Complex Data Types
def test_payload_dataframe():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    df = DataFrame([{"col1": "val1", "col2": "val2"}])
    result = [MockComponentInstance(_results={"output1": df})]
    expected_output = {"output1": {"message": [{"col1": "val1", "col2": "val2"}], "type": "array"}}
    codeflash_output = build_output_logs(vertex, result)

def test_payload_message_object():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    msg = Message(text="This is a message")
    result = [MockComponentInstance(_results={"output1": msg})]
    expected_output = {"output1": {"message": "This is a message", "type": "message"}}
    codeflash_output = build_output_logs(vertex, result)
# Handling Streams
def test_stream_with_url():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    result = [MockComponentInstance(_results={"output1": {"stream_url": "http://example.com"}})]
    expected_output = {"output1": {"message": {"location": "http://example.com"}, "type": "stream"}}
    codeflash_output = build_output_logs(vertex, result)
# Handling Edge Cases
def test_empty_payload():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    result = [MockComponentInstance(_results={"output1": None})]
    expected_output = {"output1": {"message": None, "type": "unknown"}}
    codeflash_output = build_output_logs(vertex, result)

def test_unknown_types():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    result = [MockComponentInstance(_results={"output1": 12345})]
    expected_output = {"output1": {"message": 12345, "type": "unknown"}}
    codeflash_output = build_output_logs(vertex, result)
# Error Handling
def test_invalid_dataframe_initialization():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    invalid_data = [Data(data={"key": "value"}), {"key": "value"}]  # Mixed types
    with pytest.raises(ValueError, match="List items must be either all Data objects or all dictionaries"):
        DataFrame(invalid_data)
# Large Scale Test Cases
def test_large_dataframe():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    large_df = DataFrame([{"col1": f"val{i}", "col2": f"val{i}"} for i in range(1000)])
    result = [MockComponentInstance(_results={"output1": large_df})]
    expected_output = {"output1": {"message": large_df.to_dict(orient="records"), "type": "array"}}
    codeflash_output = build_output_logs(vertex, result)

def test_large_list_of_outputs():
    outputs = [{"name": f"output{i}"} for i in range(100)]
    vertex = MockVertex(outputs=outputs)
    result = [MockComponentInstance(_results={f"output{i}": f"value{i}" for i in range(100)})]
    expected_output = {f"output{i}": {"message": f"value{i}", "type": "text"} for i in range(100)}
    codeflash_output = build_output_logs(vertex, result)
# Special Cases
def test_status_handling():
    vertex = MockVertex(outputs=[{"name": "output1"}])
    result = [MockComponentInstance(status=None, _results={"output1": "result"}, _artifacts={})]
    expected_output = {"output1": {"message": "result", "type": "text"}}
    codeflash_output = build_output_logs(vertex, result)

def test_mixed_data_types_in_outputs():
    vertex = MockVertex(outputs=[{"name": "output1"}, {"name": "output2"}, {"name": "output3"}])
    result = [MockComponentInstance(_results={"output1": "text", "output2": {"key": "value"}, "output3": DataFrame([{"col1": "val1"}])})]
    expected_output = {
        "output1": {"message": "text", "type": "text"},
        "output2": {"message": {"key": "value"}, "type": "object"},
        "output3": {"message": [{"col1": "val1"}], "type": "array"}
    }
    codeflash_output = build_output_logs(vertex, result)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
CodSpeed Performance Report
Merging #5324 will degrade performances by 26.19%
Comparing codeflash-ai:codeflash/optimize-build_output_logs-2024-12-11T11.39.45 (d1fc2cd) with main (9c23759)
Summary
❌ 2 regressions
✅ 13 untouched benchmarks
:warning: Please fix the performance issues or acknowledge them on CodSpeed.
Benchmarks breakdown
| | Benchmark | main | codeflash-ai:codeflash/optimize-build_output_logs-2024-12-11T11.39.45 | Change |
|---|---|---|---|---|
| ❌ | test_successful_run_with_input_type_any | 256.2 ms | 340.1 ms | -24.68% |
| ❌ | test_successful_run_with_output_type_any | 238.5 ms | 323.1 ms | -26.19% |
Hi! I'm autofix.ci, a bot that automatically fixes trivial issues such as code formatting in pull requests.
I would like to apply some automated changes to this pull request, but it looks like I don't have the necessary permissions to do so. To get this pull request into a mergeable state, please do one of the following two things:
- Allow edits by maintainers for your pull request, and then re-trigger CI (for example by pushing a new commit).
- Manually fix the issues identified for your pull request (see the GitHub Actions output for details on what I would like to change).
@cbornet can we merge this approved PR?
@misrasaurabh1 There is an issue with the starter projects. Can you check https://github.com/langflow-ai/langflow/pull/5324#issuecomment-2581342703 and resolve it?