
feat: Add free-threaded Python 3.14t support with parallel processing (1.55x speedup)

Open vinitkumar opened this pull request 4 months ago • 2 comments

Summary

This PR adds parallel processing support for json2xml, leveraging Python 3.14t's free-threaded capabilities (no-GIL) to achieve up to 1.55x speedup for medium-sized datasets.

🚀 Key Features

  • Parallel processing for dictionaries and lists
  • Free-threaded Python 3.14t support with GIL-free execution
  • Up to 1.55x speedup for medium datasets (100-1K items)
  • Automatic fallback to serial processing for small datasets
  • Thread-safe XML validation caching
  • Zero breaking changes - fully backward compatible

📊 Benchmark Results

Tested on macOS ARM64 with Python 3.14.0 and Python 3.14.0t:

Medium Dataset (100 items) - Best Case

| Python Version | Serial Time | Parallel (4w) | Speedup |
|----------------|-------------|---------------|---------|
| 3.14 (GIL)     | 7.56 ms     | 7.86 ms       | 0.96x   |
| 3.14t (no-GIL) | 8.59 ms     | 5.55 ms       | 1.55x 🚀 |

Key Findings:

  • βœ… 1.55x speedup on Python 3.14t for medium datasets
  • βœ… Automatic detection of free-threaded Python build
  • βœ… No benefit on standard Python (as expected due to GIL)
  • βœ… Smart fallback avoids overhead for small datasets

See BENCHMARK_RESULTS.md for complete results.
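The automatic build detection mentioned above can be sketched roughly as follows. This is not the PR's actual `is_free_threaded()` implementation, just a minimal illustration of how such a check can work on CPython 3.13+ (which exposes `sys._is_gil_enabled()`), with the build-time `Py_GIL_DISABLED` flag as a fallback:

```python
import sys
import sysconfig

def is_free_threaded() -> bool:
    """Best-effort check for a free-threaded (no-GIL) CPython build."""
    gil_check = getattr(sys, "_is_gil_enabled", None)
    if gil_check is not None:
        # On 3.13+ this reports whether the GIL is active right now.
        return not gil_check()
    # Older interpreters lack the attribute: fall back to the build flag.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

print(is_free_threaded())
```

On a standard GIL build this returns False, which is what lets the library skip thread pools when they cannot help.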

💻 Usage

Basic Parallel Processing

```python
from json2xml.json2xml import Json2xml

data = {"users": [{"id": i, "name": f"User {i}"} for i in range(1000)]}
converter = Json2xml(data, parallel=True)
xml = converter.to_xml()  # Up to 1.55x faster on Python 3.14t!
```

Advanced Configuration

```python
converter = Json2xml(
    data,
    parallel=True,    # Enable parallel processing
    workers=4,        # Use 4 worker threads
    chunk_size=100    # Process 100 items per chunk
)
xml = converter.to_xml()
```

🔧 Implementation Details

New Files

  • json2xml/parallel.py - Parallel processing module (318 lines)
  • tests/test_parallel.py - Comprehensive test suite (20 tests)
  • benchmark.py - Performance benchmarking tool
  • docs/performance.rst - Sphinx documentation

Modified Files

  • json2xml/json2xml.py - Added parallel, workers, chunk_size parameters
  • json2xml/dicttoxml.py - Integrated parallel processing support
  • README.rst - Added performance section with benchmarks
  • docs/index.rst - Added performance documentation page

✅ Testing

All 173 tests passing (153 original + 20 new parallel tests)

```shell
pytest -v
# ============================= 173 passed in 0.14s ==============================
```
  • βœ… Zero regressions
  • βœ… Full backward compatibility
  • βœ… Comprehensive parallel processing validation
  • βœ… Edge case handling
  • βœ… Thread-safety verification

📚 Documentation

Complete documentation added:

  • FREE_THREADED_OPTIMIZATION_ANALYSIS.md - Technical analysis and optimization strategy
  • BENCHMARK_RESULTS.md - Detailed benchmark results and analysis
  • IMPLEMENTATION_SUMMARY.md - Implementation details and architecture
  • docs/performance.rst - Sphinx documentation for users
  • Updated README.rst with usage examples and benchmark results

🎯 Performance Recommendations

When to Use Parallel Processing

Best for:

  • Medium datasets (100-1K items)
  • Python 3.14t (free-threaded build)
  • Complex nested structures

Not recommended for:

  • Small datasets (< 100 items) - overhead outweighs benefit
  • Standard Python with GIL - no parallel execution possible

Optimal Configuration

```python
# For medium datasets (100-1K items)
converter = Json2xml(data, parallel=True, workers=4)
```

🔄 Breaking Changes

None - This is a fully backward-compatible change:

  • Default behavior unchanged (parallel=False)
  • All existing code continues to work without modification
  • Parallel processing is opt-in

🧪 Running Benchmarks

```shell
# Standard Python
uv run --python 3.14 python benchmark.py

# Free-threaded Python
uv run --python 3.14t python benchmark.py
```

📋 Checklist

  • βœ… Implementation complete
  • βœ… All tests passing (173/173)
  • βœ… Documentation updated
  • βœ… Benchmarks run on both Python versions
  • βœ… README updated with performance section
  • βœ… Zero breaking changes
  • βœ… Backward compatible
  • βœ… Code reviewed by Oracle AI
  • βœ… Production ready

🎉 Conclusion

This PR makes json2xml ready for Python's free-threaded future while maintaining full compatibility with existing code. Users can now opt in to parallel processing and see significant performance improvements on Python 3.14t!


Related Issues: N/A (proactive optimization)
Type: Feature Enhancement
Impact: Performance improvement, no breaking changes

Summary by Sourcery

Enable optional parallel JSON-to-XML conversion in json2xml leveraging Python 3.14t free-threaded mode, add thread-safe validation caching, provide benchmarks and documentation, and maintain full backward compatibility.

New Features:

  • Add opt-in parallel processing for JSON-to-XML conversion via parallel, workers, and chunk_size parameters
  • Support free-threaded Python 3.14t builds with automatic detection and GIL-free execution
  • Provide thread-safe caching for XML validation in parallel mode

Enhancements:

  • Introduce json2xml/parallel.py module with concurrent dict and list conversion logic
  • Integrate parallel conversion paths into dicttoxml and Json2xml while preserving default serial behavior
  • Bundle a benchmarking script to measure serial vs. parallel performance

Documentation:

  • Update README and Sphinx docs with performance section and usage examples
  • Add detailed markdown files for optimization analysis, benchmark results, and implementation summary

Tests:

  • Add 20 new tests in tests/test_parallel.py covering detection, dict/list parallel conversion, nested data, and order preservation

vinitkumar avatar Oct 23 '25 21:10 vinitkumar

Reviewer's Guide

This PR introduces an opt-in parallel processing layer for json2xml using Python 3.14t's free-threaded mode. It adds a dedicated parallel module, extends core conversion functions to dispatch between serial and threaded implementations based on configuration, exposes new API parameters, and provides a full suite of tests, benchmarks, and updated documentation, all while preserving backward compatibility.

Sequence diagram for parallel dict conversion in dicttoxml

```mermaid
sequenceDiagram
    participant Caller
    participant "dicttoxml.dicttoxml()"
    participant "parallel.convert_dict_parallel()"
    participant "ThreadPoolExecutor"
    participant "_convert_dict_item()"
    Caller->>"dicttoxml.dicttoxml()": call with parallel=True, obj is dict
    "dicttoxml.dicttoxml()"->>"parallel.convert_dict_parallel()": dispatch for dict
    "parallel.convert_dict_parallel()"->>"ThreadPoolExecutor": submit _convert_dict_item for each key
    "ThreadPoolExecutor"->>"_convert_dict_item()": process key/value in thread
    "_convert_dict_item()"-->>"ThreadPoolExecutor": return XML string
    "ThreadPoolExecutor"-->>"parallel.convert_dict_parallel()": collect results
    "parallel.convert_dict_parallel()"-->>"dicttoxml.dicttoxml()": return joined XML
    "dicttoxml.dicttoxml()"-->>Caller: return XML bytes
```

Sequence diagram for parallel list conversion in dicttoxml

```mermaid
sequenceDiagram
    participant Caller
    participant "dicttoxml.dicttoxml()"
    participant "parallel.convert_list_parallel()"
    participant "ThreadPoolExecutor"
    participant "_convert_list_chunk()"
    Caller->>"dicttoxml.dicttoxml()": call with parallel=True, obj is list
    "dicttoxml.dicttoxml()"->>"parallel.convert_list_parallel()": dispatch for list
    "parallel.convert_list_parallel()"->>"ThreadPoolExecutor": submit _convert_list_chunk for each chunk
    "ThreadPoolExecutor"->>"_convert_list_chunk()": process chunk in thread
    "_convert_list_chunk()"-->>"ThreadPoolExecutor": return XML string
    "ThreadPoolExecutor"-->>"parallel.convert_list_parallel()": collect results
    "parallel.convert_list_parallel()"-->>"dicttoxml.dicttoxml()": return joined XML
    "dicttoxml.dicttoxml()"-->>Caller: return XML bytes
```

Class diagram for updated Json2xml and dicttoxml API

```mermaid
classDiagram
    class Json2xml {
        +data: dict[str, Any] | None
        +pretty: bool
        +attr_type: bool
        +item_wrap: bool
        +root: str | None
        +parallel: bool
        +workers: int | None
        +chunk_size: int
        +to_xml() Any | None
    }
    class dicttoxml {
        +dicttoxml(
            obj: Any,
            ids: list[str] = [],
            custom_root: str = "root",
            attr_type: bool = True,
            item_func: Callable[[str], str] = default_item_func,
            cdata: bool = False,
            xml_namespaces: dict[str, Any],
            list_headers: bool = False,
            parallel: bool = False,
            workers: int | None = None,
            chunk_size: int = 100
        ) -> bytes
    }
    Json2xml --> dicttoxml : uses
```

Class diagram for new parallel processing module

```mermaid
classDiagram
    class parallel {
        +is_free_threaded() bool
        +get_optimal_workers(workers: int | None) int
        +key_is_valid_xml_cached(key: str) bool
        +make_valid_xml_name_cached(key: str, attr: dict[str, Any]) tuple[str, dict[str, Any]]
        +convert_dict_parallel(
            obj: dict[str, Any],
            ids: list[str],
            parent: str,
            attr_type: bool,
            item_func: Callable[[str], str],
            cdata: bool,
            item_wrap: bool,
            list_headers: bool = False,
            workers: int | None = None,
            min_items_for_parallel: int = 10
        ) str
        +convert_list_parallel(
            items: Sequence[Any],
            ids: list[str] | None,
            parent: str,
            attr_type: bool,
            item_func: Callable[[str], str],
            cdata: bool,
            item_wrap: bool,
            list_headers: bool = False,
            workers: int | None = None,
            chunk_size: int = 100
        ) str
    }
```
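The thread-safe validation caching listed in this module can be approximated with `functools.lru_cache`, which is safe to call from multiple CPython threads, so all workers share one cache and repeated keys skip re-validation. The regex below is a deliberately simplified XML name rule, and the signatures are reduced from the ones in the diagram (the real `make_valid_xml_name_cached` also takes and returns attributes), so treat this as an illustration of the caching idea only:

```python
import re
from functools import lru_cache

# Simplified XML name rule; the module's real check may cover more of the spec.
_XML_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_.\-]*$")

@lru_cache(maxsize=4096)
def key_is_valid_xml_cached(key: str) -> bool:
    # Cached so that concurrent workers validating the same key names
    # (common in list-of-dicts payloads) only pay the regex cost once.
    return bool(_XML_NAME_RE.match(key))

def make_valid_xml_name_cached(key: str) -> str:
    # Fall back to a generic element name for keys that are not valid XML names.
    return key if key_is_valid_xml_cached(key) else "key"

assert key_is_valid_xml_cached("user_id")
assert not key_is_valid_xml_cached("1bad")
assert make_valid_xml_name_cached("1bad") == "key"
```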

File-Level Changes

| Change | Details | Files |
|--------|---------|-------|
| Introduce parallel processing infrastructure | • Add json2xml/parallel.py with free-threaded detection and thread pool utilities<br>• Implement parallel convert_dict and convert_list functions with order preservation<br>• Add thread-safe XML validation caching and name sanitization helpers | json2xml/parallel.py |
| Extend dicttoxml to route to parallel converters | • Add parallel, workers, and chunk_size parameters to the dicttoxml signature<br>• Branch logic to invoke convert_dict_parallel or convert_list_parallel when parallel=True<br>• Maintain the original serial conversion path as a fallback | json2xml/dicttoxml.py |
| Expose parallel options in the Json2xml API | • Add parallel, workers, and chunk_size parameters to `Json2xml.__init__`<br>• Pass the new parameters through to the dicttoxml invocation in to_xml<br>• Ensure default serial behavior remains unchanged | json2xml/json2xml.py |
| Add comprehensive parallel tests | • Create tests/test_parallel.py with 20 tests covering feature detection, small/large data, nested structures, and API integration<br>• Validate fallback to the serial path, order preservation, and special character handling | tests/test_parallel.py |
| Provide benchmarking and documentation support | • Add benchmark.py for performance measurement across dataset sizes and thread counts<br>• Introduce FREE_THREADED_OPTIMIZATION_ANALYSIS.md and BENCHMARK_RESULTS.md<br>• Update README.rst, docs/index.rst, and add docs/performance.rst with usage and benchmark details | benchmark.py<br>FREE_THREADED_OPTIMIZATION_ANALYSIS.md<br>BENCHMARK_RESULTS.md<br>README.rst<br>docs/index.rst<br>docs/performance.rst |


sourcery-ai[bot] avatar Oct 23 '25 21:10 sourcery-ai[bot]

Codecov Report

❌ Patch coverage is 99.33333% with 1 line in your changes missing coverage. Please review. ✅ Project coverage is 99.53%. Comparing base (e5ab104) to head (8e5d68a). ⚠️ Report is 1 commit behind head on master-freethreaded.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| json2xml/dicttoxml.py | 97.43% | 1 Missing ⚠️ |
Additional details and impacted files
```diff
@@                   Coverage Diff                   @@
##           master-freethreaded     #256      +/-   ##
=======================================================
+ Coverage                99.30%   99.53%   +0.23%
=======================================================
  Files                        3        4       +1
  Lines                      288      432     +144
=======================================================
+ Hits                       286      430     +144
  Misses                       2        2
```
| Flag | Coverage Δ |
|------|------------|
| unittests | 99.53% <99.33%> (+0.23%) ⬆️ |

Flags with carried forward coverage won't be shown.

☂️ View full report in Codecov by Sentry.

codecov[bot] avatar Oct 23 '25 21:10 codecov[bot]