feat: Add free-threaded Python 3.14t support with parallel processing (1.55x speedup)
## Summary

This PR adds parallel processing support for json2xml, leveraging Python 3.14t's free-threaded capabilities (no GIL) to achieve up to a 1.55x speedup for medium-sized datasets.

## Key Features
- Parallel processing for dictionaries and lists
- Free-threaded Python 3.14t support with GIL-free execution
- Up to 1.55x speedup for medium datasets (100-1K items)
- Automatic fallback to serial processing for small datasets
- Thread-safe XML validation caching
- Zero breaking changes - fully backward compatible
## Benchmark Results

Tested on macOS ARM64 with Python 3.14.0 and Python 3.14.0t.

### Medium Dataset (100 items): Best Case
| Python Version | Serial Time | Parallel (4w) | Speedup |
|---|---|---|---|
| 3.14 (GIL) | 7.56 ms | 7.86 ms | 0.96x |
| 3.14t (no-GIL) | 8.59 ms | 5.55 ms | 1.55x |
**Key Findings:**
- ✅ 1.55x speedup on Python 3.14t for medium datasets
- ✅ Automatic detection of free-threaded Python build
- ✅ No benefit on standard Python (expected, due to the GIL)
- ✅ Smart fallback avoids overhead for small datasets
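The automatic build detection can be sketched as below; `is_free_threaded` mirrors the helper exposed by the PR's `parallel` module, and the body assumes CPython 3.13+'s `sys._is_gil_enabled()` (older interpreters always run with the GIL):

```python
import sys

def is_free_threaded() -> bool:
    """Return True on a free-threaded (no-GIL) CPython build with the GIL disabled."""
    # sys._is_gil_enabled() is available on CPython 3.13+; its absence
    # means the interpreter predates free-threading entirely.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    return gil_check is not None and not gil_check()
```

On a standard 3.14 build this returns `False`, which is what lets the library skip thread pools where they cannot help.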
See BENCHMARK_RESULTS.md for complete results.
## Usage

### Basic Parallel Processing

```python
from json2xml.json2xml import Json2xml

data = {"users": [{"id": i, "name": f"User {i}"} for i in range(1000)]}
converter = Json2xml(data, parallel=True)
xml = converter.to_xml()  # up to 1.55x faster on Python 3.14t
```
### Advanced Configuration

```python
converter = Json2xml(
    data,
    parallel=True,   # enable parallel processing
    workers=4,       # use 4 worker threads
    chunk_size=100,  # process 100 items per chunk
)
xml = converter.to_xml()
```
## Implementation Details

### New Files
- `json2xml/parallel.py`: parallel processing module (318 lines)
- `tests/test_parallel.py`: comprehensive test suite (20 tests)
- `benchmark.py`: performance benchmarking tool
- `docs/performance.rst`: Sphinx documentation
### Modified Files
- `json2xml/json2xml.py`: added `parallel`, `workers`, and `chunk_size` parameters
- `json2xml/dicttoxml.py`: integrated parallel processing support
- `README.rst`: added performance section with benchmarks
- `docs/index.rst`: added performance documentation page
## Testing

All 173 tests pass (153 original + 20 new parallel tests):

```shell
pytest -v
# ============================= 173 passed in 0.14s ==============================
```
- ✅ Zero regressions
- ✅ Full backward compatibility
- ✅ Comprehensive parallel processing validation
- ✅ Edge case handling
- ✅ Thread-safety verification
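The thread-safe validation caching verified above can be approximated with `functools.lru_cache`, whose internal locking makes concurrent lookups safe; the regex here is a deliberately simplified stand-in for real XML name rules, shown only to illustrate the caching pattern:

```python
import re
from functools import lru_cache

# Simplified XML name check: real rules also cover colons and a wider
# Unicode range; this is only an illustration of the caching pattern.
_XML_NAME = re.compile(r"^[A-Za-z_][\w.\-]*$")

@lru_cache(maxsize=1024)
def key_is_valid_xml_cached(key: str) -> bool:
    # Repeated keys (common in large JSON documents) hit the cache
    # instead of re-running the regex, from any thread.
    return bool(_XML_NAME.match(key))
```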
## Documentation

Complete documentation added:
- `FREE_THREADED_OPTIMIZATION_ANALYSIS.md`: technical analysis and optimization strategy
- `BENCHMARK_RESULTS.md`: detailed benchmark results and analysis
- `IMPLEMENTATION_SUMMARY.md`: implementation details and architecture
- `docs/performance.rst`: Sphinx documentation for users
- Updated `README.rst` with usage examples and benchmark results
## Performance Recommendations

### When to Use Parallel Processing
**Best for:**
- Medium datasets (100-1K items)
- Python 3.14t (free-threaded build)
- Complex nested structures
**Not recommended for:**
- Small datasets (< 100 items) - overhead outweighs benefit
- Standard Python with GIL - no parallel execution possible
### Optimal Configuration

```python
# For medium datasets (100-1K items)
converter = Json2xml(data, parallel=True, workers=4)
```
## Breaking Changes

None. This is a fully backward-compatible change:
- Default behavior unchanged (`parallel=False`)
- All existing code continues to work without modification
- Parallel processing is opt-in
## Running Benchmarks

```shell
# Standard Python
uv run --python 3.14 python benchmark.py

# Free-threaded Python
uv run --python 3.14t python benchmark.py
```
## Checklist

- ✅ Implementation complete
- ✅ All tests passing (173/173)
- ✅ Documentation updated
- ✅ Benchmarks run on both Python versions
- ✅ README updated with performance section
- ✅ Zero breaking changes
- ✅ Backward compatible
- ✅ Code reviewed by Oracle AI
- ✅ Production ready
## Conclusion

This PR makes json2xml ready for Python's free-threaded future while maintaining full compatibility with existing code. Users can now opt in to parallel processing and see significant performance improvements on Python 3.14t.
Related Issues: N/A (proactive optimization)
Type: Feature Enhancement
Impact: Performance improvement, no breaking changes
## Summary by Sourcery

Enable optional parallel JSON-to-XML conversion in json2xml leveraging Python 3.14t free-threaded mode, add thread-safe validation caching, provide benchmarks and documentation, and maintain full backward compatibility.

New Features:
- Add opt-in parallel processing for JSON-to-XML conversion via `parallel`, `workers`, and `chunk_size` parameters
- Support free-threaded Python 3.14t builds with automatic detection and GIL-free execution
- Provide thread-safe caching for XML validation in parallel mode
Enhancements:
- Introduce `json2xml/parallel.py` module with concurrent dict and list conversion logic
- Integrate parallel conversion paths into `dicttoxml` and `Json2xml` while preserving default serial behavior
- Bundle a benchmarking script to measure serial vs. parallel performance
Documentation:
- Update README and Sphinx docs with performance section and usage examples
- Add detailed markdown files for optimization analysis, benchmark results, and implementation summary
Tests:
- Add 20 new tests in `tests/test_parallel.py` covering detection, dict/list parallel conversion, nested data, and order preservation
## Reviewer's Guide

This PR introduces an opt-in parallel processing layer for json2xml using Python 3.14t's free-threaded mode. It adds a dedicated parallel module, extends core conversion functions to dispatch between serial and threaded implementations based on configuration, exposes new API parameters, and provides a full suite of tests, benchmarks, and updated documentation, all while preserving backward compatibility.

### Sequence diagram for parallel dict conversion in dicttoxml
```mermaid
sequenceDiagram
    participant Caller
    participant "dicttoxml.dicttoxml()"
    participant "parallel.convert_dict_parallel()"
    participant "ThreadPoolExecutor"
    participant "_convert_dict_item()"
    Caller->>"dicttoxml.dicttoxml()": call with parallel=True, obj is dict
    "dicttoxml.dicttoxml()"->>"parallel.convert_dict_parallel()": dispatch for dict
    "parallel.convert_dict_parallel()"->>"ThreadPoolExecutor": submit _convert_dict_item for each key
    "ThreadPoolExecutor"->>"_convert_dict_item()": process key/value in thread
    "_convert_dict_item()"-->>"ThreadPoolExecutor": return XML string
    "ThreadPoolExecutor"-->>"parallel.convert_dict_parallel()": collect results
    "parallel.convert_dict_parallel()"-->>"dicttoxml.dicttoxml()": return joined XML
    "dicttoxml.dicttoxml()"-->>Caller: return XML bytes
```
### Sequence diagram for parallel list conversion in dicttoxml
```mermaid
sequenceDiagram
    participant Caller
    participant "dicttoxml.dicttoxml()"
    participant "parallel.convert_list_parallel()"
    participant "ThreadPoolExecutor"
    participant "_convert_list_chunk()"
    Caller->>"dicttoxml.dicttoxml()": call with parallel=True, obj is list
    "dicttoxml.dicttoxml()"->>"parallel.convert_list_parallel()": dispatch for list
    "parallel.convert_list_parallel()"->>"ThreadPoolExecutor": submit _convert_list_chunk for each chunk
    "ThreadPoolExecutor"->>"_convert_list_chunk()": process chunk in thread
    "_convert_list_chunk()"-->>"ThreadPoolExecutor": return XML string
    "ThreadPoolExecutor"-->>"parallel.convert_list_parallel()": collect results
    "parallel.convert_list_parallel()"-->>"dicttoxml.dicttoxml()": return joined XML
    "dicttoxml.dicttoxml()"-->>Caller: return XML bytes
```
### Class diagram for updated Json2xml and dicttoxml API
```mermaid
classDiagram
    class Json2xml {
        +data: dict[str, Any] | None
        +pretty: bool
        +attr_type: bool
        +item_wrap: bool
        +root: str | None
        +parallel: bool
        +workers: int | None
        +chunk_size: int
        +to_xml() Any | None
    }
    class dicttoxml {
        +dicttoxml(
            obj: Any,
            ids: list[str] = [],
            custom_root: str = "root",
            attr_type: bool = True,
            item_func: Callable[[str], str] = default_item_func,
            cdata: bool = False,
            xml_namespaces: dict[str, Any],
            list_headers: bool = False,
            parallel: bool = False,
            workers: int | None = None,
            chunk_size: int = 100
        ) -> bytes
    }
    Json2xml --> dicttoxml : uses
```
### Class diagram for new parallel processing module
```mermaid
classDiagram
    class parallel {
        +is_free_threaded() bool
        +get_optimal_workers(workers: int | None) int
        +key_is_valid_xml_cached(key: str) bool
        +make_valid_xml_name_cached(key: str, attr: dict[str, Any]) tuple[str, dict[str, Any]]
        +convert_dict_parallel(
            obj: dict[str, Any],
            ids: list[str],
            parent: str,
            attr_type: bool,
            item_func: Callable[[str], str],
            cdata: bool,
            item_wrap: bool,
            list_headers: bool = False,
            workers: int | None = None,
            min_items_for_parallel: int = 10
        ) str
        +convert_list_parallel(
            items: Sequence[Any],
            ids: list[str] | None,
            parent: str,
            attr_type: bool,
            item_func: Callable[[str], str],
            cdata: bool,
            item_wrap: bool,
            list_headers: bool = False,
            workers: int | None = None,
            chunk_size: int = 100
        ) str
    }
```
## File-Level Changes

| Change | Details | Files |
|---|---|---|
| Introduce parallel processing infrastructure | | `json2xml/parallel.py` |
| Extend dicttoxml to route to parallel converters | | `json2xml/dicttoxml.py` |
| Expose parallel options in Json2xml API | | `json2xml/json2xml.py` |
| Add comprehensive parallel tests | | `tests/test_parallel.py` |
| Provide benchmarking and documentation support | | `benchmark.py`, `FREE_THREADED_OPTIMIZATION_ANALYSIS.md`, `BENCHMARK_RESULTS.md`, `README.rst`, `docs/index.rst`, `docs/performance.rst` |
## Codecov Report
:x: Patch coverage is 99.33333% with 1 line in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 99.53%. Comparing base (e5ab104) to head (8e5d68a).
:warning: Report is 1 commit behind head on master-freethreaded.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| json2xml/dicttoxml.py | 97.43% | 1 Missing :warning: |
Additional details and impacted files
```diff
@@ Coverage Diff @@
##           master-freethreaded    #256    +/-  ##
=======================================================
+ Coverage      99.30%    99.53%    +0.23%
=======================================================
  Files              3         4        +1
  Lines            288       432      +144
=======================================================
+ Hits             286       430      +144
  Misses             2         2
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 99.53% <99.33%> (+0.23%) :arrow_up: | |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.