⚡️ Speed up function `replace_none_and_null_with_empty_str` by 33% in PR #10567 (`feat-agentic-ux`)
⚡️ This pull request contains optimizations for PR #10567
If you approve this dependent PR, these changes will be merged into the original PR branch feat-agentic-ux.
This PR will be automatically closed if the original PR is merged.
📄 33% (0.33x) speedup for replace_none_and_null_with_empty_str in src/backend/base/langflow/agentic/mcp/support.py
⏱️ Runtime : 2.69 milliseconds → 2.02 milliseconds (best of 149 runs)
📝 Explanation and details
The optimized code achieves a 33% speedup through several key micro-optimizations that reduce overhead in the critical dictionary processing loop:
Primary Optimizations:
- Eliminated repeated set creation: The original code created `set(required_fields) if required_fields else set()` on every function call (line with 974.4ns per hit). The optimized version creates it once as `None` when not needed, avoiding unnecessary empty set allocation.
- Precomputed string lookup set: Moved `str_nulls = {"null", "nan", "infinity", "-infinity"}` outside the inner function to avoid repeated set creation during string comparisons.
- Optimized type checking flow: Restructured `convert_value()` with `elif` chains instead of independent `if` statements, enabling early returns and reducing redundant type checks on the same value.
- Removed exception handling overhead: Eliminated the try/except block around `math.isnan()` since it's now only called on confirmed float types, removing exception handling cost.
- Conditional required fields processing: Added `if required_fields_set:` check to skip set operations entirely when no required fields are specified, avoiding the costly `required_fields_set - new_dict.keys()` operation (which took 672.3ns per hit in the original).
- Local variable optimization: Used `result_append = result = []` to cache the append method for faster lookups in the tight loop.
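The optimizations listed above can be sketched roughly as follows. This is an illustration reconstructed from the description and the generated tests, not the code in `support.py`: the names `_convert_value`, `_STR_NULLS`, `_NOT_AVAILABLE`, and `replace_sketch` are assumptions made for this example.

```python
import math

# Assumed constant names; the real module may organise these differently.
_NOT_AVAILABLE = "Not available"
_STR_NULLS = frozenset({"null", "nan", "infinity", "-infinity"})  # built once


def _convert_value(value):
    # An elif chain means each value is tested against one type branch only.
    if value is None:
        return _NOT_AVAILABLE
    elif isinstance(value, str):
        if value.strip().lower() in _STR_NULLS:
            return _NOT_AVAILABLE
    elif isinstance(value, float):
        # math.isnan() only sees confirmed floats, so no try/except is needed.
        if math.isnan(value):
            return _NOT_AVAILABLE
    elif getattr(value, "isnat", False):
        return _NOT_AVAILABLE
    return value


def replace_sketch(data, required_fields=None):
    # None (rather than an empty set) when there is nothing to enforce.
    required = set(required_fields) if required_fields else None
    result = []
    append = result.append  # local alias for the bound method
    for item in data:
        if not isinstance(item, dict):
            append(item)  # non-dict elements pass through unchanged
            continue
        new_dict = {key: _convert_value(val) for key, val in item.items()}
        if required:  # skip the set difference entirely when not needed
            for field in required - new_dict.keys():
                new_dict[field] = _NOT_AVAILABLE
        append(new_dict)
    return result
```

Float infinities fall through the `float` branch untouched, which matches the generated tests that expect only the string forms of infinity to be replaced.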
Performance Impact: The line profiler shows the main bottleneck is the dictionary comprehension (60.8% → 63.8% of total time), but the optimizations reduced overall function time from 25.5ms to 23.1ms. The improvements are most pronounced in the set operations and conditional logic, which collectively saved ~2.4ms.
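The reported figures can be cross-checked with a little arithmetic. The sketch below assumes the speedup percentage is computed as (original - optimized) / optimized; that formula is consistent with the 33% headline but is not stated in the report, and the helper name `speedup` is made up for this example.

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup, assuming (original - optimized) / optimized."""
    return (original - optimized) / optimized


# Benchmark runtimes from the report, in milliseconds (best of 149 runs).
bench = speedup(2.69, 2.02)

# Line-profiler totals from the report, in milliseconds. They are larger than
# the benchmark runtimes, presumably because profiling itself adds overhead.
profiled_saved = 25.5 - 23.1
profiled = speedup(25.5, 23.1)

print(f"benchmark speedup: {bench:.0%}")
print(f"profiler time saved: {profiled_saved:.1f} ms ({profiled:.0%})")
```

Under that assumption the benchmark numbers reproduce the headline figure, while the profiler totals show a smaller relative gain over a larger absolute time.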
Workload Benefits: These optimizations are particularly effective for:
- Large datasets with many dictionaries (as shown in the 1000-item test cases)
- Cases where `required_fields` is None/empty (avoids unnecessary set operations)
- Mixed data types requiring frequent type checking in `convert_value()`
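To see whether skipping the set difference matters on a given machine, a small `timeit` comparison along these lines could be used. It is a simplified model of the per-dictionary step only, with hypothetical helper names (`fill_always`, `fill_guarded`); absolute timings will vary, so no expected numbers are given.

```python
import timeit

NOT_AVAILABLE = "Not available"


def fill_always(row, required):
    # `required` is always a set here, possibly empty; the difference is
    # computed for every row regardless.
    new_row = dict(row)
    for field in required - new_row.keys():
        new_row[field] = NOT_AVAILABLE
    return new_row


def fill_guarded(row, required):
    # `required` is a non-empty set or None; the difference is skipped
    # when there is nothing to enforce.
    new_row = dict(row)
    if required:
        for field in required - new_row.keys():
            new_row[field] = NOT_AVAILABLE
    return new_row


row = {"a": 1, "b": 2, "c": 3}

if __name__ == "__main__":
    always = timeit.timeit(lambda: fill_always(row, set()), number=200_000)
    guarded = timeit.timeit(lambda: fill_guarded(row, None), number=200_000)
    print(f"fill_always:  {always:.3f}s for 200,000 calls")
    print(f"fill_guarded: {guarded:.3f}s for 200,000 calls")
```

Both helpers return the same dictionaries; only the work done when no required fields are given differs.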
The optimizations maintain identical behavior while reducing computational overhead in data processing pipelines.
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 51 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import math

# imports
import pytest
from langflow.agentic.mcp.support import replace_none_and_null_with_empty_str

# unit tests

# -------------------------------
# Basic Test Cases
# -------------------------------

def test_none_and_null_string_replacement():
    # Test that None and "null" (case-insensitive) are replaced
    data = [
        {"a": None, "b": "null", "c": "NuLl", "d": 1, "e": "something"},
    ]
    expected = [
        {"a": "Not available", "b": "Not available", "c": "Not available", "d": 1, "e": "something"},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_nan_float_and_nan_string():
    # Test that float('nan') and string "NaN" are replaced
    data = [
        {"x": float('nan'), "y": "NaN", "z": 0.0},
    ]
    expected = [
        {"x": "Not available", "y": "Not available", "z": 0.0},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_infinity_and_minus_infinity_strings():
    # Test string "Infinity" and "-Infinity" are replaced
    data = [
        {"a": "Infinity", "b": " -Infinity ", "c": "infinity", "d": "INFINITY", "e": 5},
    ]
    expected = [
        {"a": "Not available", "b": "Not available", "c": "Not available", "d": "Not available", "e": 5},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_no_replacement_needed():
    # Test dicts with no values to replace
    data = [
        {"a": 1, "b": "hello", "c": 3.14},
    ]
    expected = [
        {"a": 1, "b": "hello", "c": 3.14},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_required_fields_added():
    # Test that missing required fields are added with "Not available"
    data = [
        {"a": 1},
        {"b": 2},
    ]
    required_fields = ["a", "b", "c"]
    expected = [
        {"a": 1, "b": "Not available", "c": "Not available"},
        {"a": "Not available", "b": 2, "c": "Not available"},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)

# -------------------------------
# Edge Test Cases
# -------------------------------

def test_empty_list():
    # Test empty input list
    data = []
    expected = []
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_empty_dict_in_list():
    # Test list with empty dict and required fields
    data = [{}]
    required_fields = ["x", "y"]
    expected = [{"x": "Not available", "y": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)

def test_non_dict_in_list():
    # Test that non-dict elements are passed through unchanged
    data = [{"a": None}, 123, "hello", None]
    expected = [{"a": "Not available"}, 123, "hello", None]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_all_fields_none():
    # All fields are None
    data = [{"a": None, "b": None}]
    expected = [{"a": "Not available", "b": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_whitespace_and_case_variants():
    # Test whitespace and case variants of "null", "nan", "infinity"
    data = [
        {"a": " null ", "b": " NAN ", "c": " Infinity", "d": " -infinity ", "e": " nUlL "},
    ]
    expected = [
        {"a": "Not available", "b": "Not available", "c": "Not available", "d": "Not available", "e": "Not available"},
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_isnat_attribute():
    # Test object with isnat attribute set to True
    class Dummy:
        isnat = True
    data = [{"x": Dummy()}]
    expected = [{"x": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_isnat_attribute_false():
    # Test object with isnat attribute set to False
    class Dummy:
        isnat = False
    data = [{"x": Dummy()}]
    expected = [{"x": data[0]["x"]}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_float_inf_and_minus_inf_not_replaced():
    # Test float('inf') and float('-inf') are NOT replaced
    data = [{"a": float('inf'), "b": float('-inf')}]
    expected = [{"a": float('inf'), "b": float('-inf')}]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output

def test_required_fields_empty():
    # Test with required_fields as empty list (should not add anything)
    data = [{"a": 1}]
    expected = [{"a": 1}]
    codeflash_output = replace_none_and_null_with_empty_str(data, [])

def test_required_fields_none():
    # Test with required_fields as None (should not add anything)
    data = [{"a": 1}]
    expected = [{"a": 1}]
    codeflash_output = replace_none_and_null_with_empty_str(data, None)

def test_dict_with_extra_fields():
    # Test dict with fields not in required_fields
    data = [{"a": 1, "b": 2, "c": 3}]
    required_fields = ["a"]
    expected = [{"a": 1, "b": 2, "c": 3}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)

# -------------------------------
# Large Scale Test Cases
# -------------------------------

def test_large_list_of_dicts():
    # Test with 1000 dicts, each with a mix of values
    data = [
        {"a": None if i % 2 == 0 else i, "b": "null" if i % 3 == 0 else str(i), "c": float('nan') if i % 5 == 0 else i}
        for i in range(1000)
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output
    # Check that all replacements were made correctly
    for i, d in enumerate(result):
        if i % 2 == 0:
            pass
        else:
            pass
        if i % 3 == 0:
            pass
        else:
            pass
        if i % 5 == 0:
            pass
        else:
            pass

def test_large_required_fields():
    # Test with many required fields
    required_fields = [f"field_{i}" for i in range(50)]
    data = [{"field_0": None, "field_1": "null"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)
    result = codeflash_output
    for i in range(2, 50):
        pass

def test_large_dict_with_various_values():
    # Test a single dict with 500 keys, alternating between None, "null", float('nan'), and valid
    data = [{
        f"key_{i}": (
            None if i % 4 == 0 else
            "null" if i % 4 == 1 else
            float('nan') if i % 4 == 2 else
            i
        )
        for i in range(500)
    }]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output
    for i in range(500):
        v = result[0][f"key_{i}"]
        if i % 4 in (0, 1, 2):
            pass
        else:
            pass

def test_large_list_with_non_dicts():
    # Test a large list mixing dicts and non-dicts
    data = [{"a": None} if i % 2 == 0 else i for i in range(1000)]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output
    for i in range(1000):
        if i % 2 == 0:
            pass
        else:
            pass

# -------------------------------
# Mutation-sensitive edge case
# -------------------------------

def test_required_fields_added_even_if_dict_empty():
    # Test that required_fields are added even if the dict is empty
    data = [{}]
    required_fields = ["foo", "bar"]
    expected = [{"foo": "Not available", "bar": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)

def test_required_fields_not_overwrite_existing():
    # Test that required_fields do not overwrite existing keys
    data = [{"foo": "hello"}]
    required_fields = ["foo", "bar"]
    expected = [{"foo": "hello", "bar": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields)

def test_string_values_not_replaced_unless_exact():
    # Test that "nullify", "nana", "infinite", "-infinite" are NOT replaced
    data = [{"a": "nullify", "b": "nana", "c": "infinite", "d": "-infinite"}]
    expected = [{"a": "nullify", "b": "nana", "c": "infinite", "d": "-infinite"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import math

# imports
import pytest
from langflow.agentic.mcp.support import replace_none_and_null_with_empty_str

# unit tests

# ------------------- BASIC TEST CASES -------------------

def test_empty_list_returns_empty_list():
    # Should handle empty input gracefully
    codeflash_output = replace_none_and_null_with_empty_str([])

def test_single_dict_with_no_special_values():
    # Should leave normal values unchanged
    data = [{"a": 1, "b": "foo"}]
    expected = [{"a": 1, "b": "foo"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_replace_none_value():
    # Should replace None with "Not available"
    data = [{"a": None, "b": 2}]
    expected = [{"a": "Not available", "b": 2}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_replace_null_string_case_insensitive():
    # Should replace "null" (case-insensitive) with "Not available"
    data = [{"a": "null"}, {"a": "NULL"}, {"a": " Null "}]
    expected = [{"a": "Not available"}, {"a": "Not available"}, {"a": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_replace_nan_string_and_float():
    # Should replace "nan" string and float('nan') with "Not available"
    data = [{"a": "NaN"}, {"a": float('nan')}]
    expected = [{"a": "Not available"}, {"a": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output

def test_replace_infinity_strings():
    # Should replace "Infinity" and "-Infinity" strings with "Not available"
    data = [{"a": "Infinity"}, {"a": "-Infinity"}, {"a": " infinity "}]
    expected = [{"a": "Not available"}, {"a": "Not available"}, {"a": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_required_fields_adds_missing():
    # Should add missing required fields with "Not available"
    data = [{"a": 1}]
    expected = [{"a": 1, "b": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=["a", "b"])

def test_required_fields_with_existing_fields():
    # Should not overwrite existing fields
    data = [{"a": 1, "b": 2}]
    expected = [{"a": 1, "b": 2}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=["a", "b"])

def test_multiple_dicts_mixed_values():
    # Should process each dict independently
    data = [
        {"a": None, "b": "bar"},
        {"a": "null", "b": "baz"},
        {"a": 5, "b": float('nan')}
    ]
    expected = [
        {"a": "Not available", "b": "bar"},
        {"a": "Not available", "b": "baz"},
        {"a": 5, "b": "Not available"}
    ]
    codeflash_output = replace_none_and_null_with_empty_str(data)

# ------------------- EDGE TEST CASES -------------------

def test_non_dict_elements_are_untouched():
    # Should leave non-dict elements unchanged
    data = [{"a": None}, "notadict", 123, None]
    expected = [{"a": "Not available"}, "notadict", 123, None]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_empty_string_and_whitespace():
    # Should not alter empty or whitespace-only strings (unless they match special values)
    data = [{"a": "", "b": " ", "c": "null "}]
    expected = [{"a": "", "b": " ", "c": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_boolean_and_zero_values():
    # Should not alter boolean or zero values
    data = [{"a": False, "b": 0, "c": True}]
    expected = [{"a": False, "b": 0, "c": True}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_float_infinity():
    # Should NOT replace float('inf') or float('-inf') (only string versions)
    data = [{"a": float('inf'), "b": float('-inf')}]
    expected = [{"a": float('inf'), "b": float('-inf')}]
    codeflash_output = replace_none_and_null_with_empty_str(data)
    result = codeflash_output

def test_dict_with_custom_object_isnat():
    # Should replace if object has .isnat==True
    class Dummy:
        isnat = True
    data = [{"x": Dummy()}]
    expected = [{"x": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_custom_object_isnat_false():
    # Should not replace if object has .isnat==False
    class Dummy:
        isnat = False
    data = [{"x": Dummy()}]
    expected = [{"x": data[0]["x"]}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_required_fields_empty_list():
    # Should not add any fields if required_fields is empty
    data = [{"a": 1}]
    expected = [{"a": 1}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=[])

def test_required_fields_none():
    # Should not add any fields if required_fields is None
    data = [{"a": 1}]
    expected = [{"a": 1}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=None)

def test_dict_with_nested_dicts():
    # Should not recurse into nested dicts
    data = [{"a": {"b": None}}]
    expected = [{"a": {"b": None}}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_list_value():
    # Should not alter list values
    data = [{"a": [None, "null", 1]}]
    expected = [{"a": [None, "null", 1]}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_tuple_value():
    # Should not alter tuple values
    data = [{"a": (None, "null", 1)}]
    expected = [{"a": (None, "null", 1)}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_dict_with_set_value():
    # Should not alter set values
    data = [{"a": {None, "null", 1}}]
    expected = [{"a": {None, "null", 1}}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_required_fields_with_all_special_values():
    # Should add missing fields and replace special values
    data = [{"a": None}]
    expected = [{"a": "Not available", "b": "Not available", "c": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=["a", "b", "c"])

def test_required_fields_with_overlap():
    # Should not overwrite present fields, but add missing
    data = [{"a": 1, "b": None}]
    expected = [{"a": 1, "b": "Not available", "c": "Not available"}]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=["a", "b", "c"])

def test_dict_with_unusual_types():
    # Should not alter values of unusual types (e.g., bytes)
    data = [{"a": b"bytes", "b": bytearray(b"foo")}]
    expected = [{"a": b"bytes", "b": bytearray(b"foo")}]
    codeflash_output = replace_none_and_null_with_empty_str(data)

# ------------------- LARGE SCALE TEST CASES -------------------

def test_large_list_of_dicts_all_none():
    # Should handle large lists where all values are None
    data = [{"a": None, "b": None} for _ in range(500)]
    expected = [{"a": "Not available", "b": "Not available"} for _ in range(500)]
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_large_list_of_dicts_mixed_values():
    # Should handle large lists with mixed normal and special values
    data = []
    expected = []
    for i in range(500):
        if i % 3 == 0:
            d = {"a": None, "b": "null", "c": i}
            e = {"a": "Not available", "b": "Not available", "c": i}
        elif i % 3 == 1:
            d = {"a": "foo", "b": float('nan'), "c": "bar"}
            e = {"a": "foo", "b": "Not available", "c": "bar"}
        else:
            d = {"a": i, "b": "baz", "c": None}
            e = {"a": i, "b": "baz", "c": "Not available"}
        data.append(d)
        expected.append(e)
    codeflash_output = replace_none_and_null_with_empty_str(data)

def test_large_list_with_required_fields():
    # Should fill in missing required fields for large input
    data = [{"a": 1} for _ in range(500)]
    expected = [{"a": 1, "b": "Not available", "c": "Not available"} for _ in range(500)]
    codeflash_output = replace_none_and_null_with_empty_str(data, required_fields=["a", "b", "c"])

def test_large_list_with_non_dict_elements():
    # Should leave non-dict elements unchanged in large input
    data = [{"a": None}] * 250 + [123] * 250
    expected = [{"a": "Not available"}] * 250 + [123] * 250
    codeflash_output = replace_none_and_null_with_empty_str(data)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
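Comparing outputs from two implementations needs some care when NaN values can survive in the result (for example inside nested lists, which the tests above expect to be left alone), because `float('nan') != float('nan')`. The helper below is a hypothetical sketch of a NaN-tolerant comparison; it is not how Codeflash performs its check, and the name `outputs_match` is invented for this example.

```python
import math


def outputs_match(a, b):
    """Recursively compare two results, treating NaN as equal to NaN."""
    if isinstance(a, float) and isinstance(b, float):
        return (math.isnan(a) and math.isnan(b)) or a == b
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(outputs_match(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return (
            type(a) is type(b)
            and len(a) == len(b)
            and all(outputs_match(x, y) for x, y in zip(a, b))
        )
    return a == b


original_out = [{"a": [float("nan"), 1]}, 5]
optimized_out = [{"a": [float("nan"), 1]}, 5]

# Plain equality fails because the two NaN objects are distinct and unequal.
plain_equal = original_out == optimized_out
tolerant_equal = outputs_match(original_out, optimized_out)
```

Here `plain_equal` is `False` while `tolerant_equal` is `True`, which is why a structural comparison of this kind can be preferable to `==` for NaN-bearing data.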
To edit these changes, run `git checkout codeflash/optimize-pr10567-2025-11-14T19.53.40` and push.