PyAirbyte
feat(executors): Add Java connector support to PyAirbyte
Summary
This PR implements Java connector support for PyAirbyte by copying and enhancing the implementation from PR #719. The key addition is a dual-parameter API that provides fine-grained control over Java connector execution:
- `use_java`: Controls Java execution mode (`None`/`True`/`False`/`Path` to JRE)
- `use_java_tar`: Specifies connector TAR file location (`None` for auto-detect, `Path` for specific file)
The implementation includes automatic JRE management, connector TAR extraction, and intelligent fallback logic (Docker when available, Java when not). A complete example script demonstrates usage with source-snowflake, including Google Drive TAR download and credential handling patterns.
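For reviewers who want a mental model of the TAR extraction step, here is a minimal sketch. It is not the code in `airbyte/_executors/java.py`: the function name `extract_connector_tar` and its parameters are hypothetical, and the path check is only a basic guard that does not inspect symlinks.

```python
import tarfile
from pathlib import Path


def extract_connector_tar(tar_path: Path, connectors_dir: Path, name: str) -> Path:
    """Extract a connector TAR into connectors_dir/name and return that directory.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    target = (connectors_dir / name).resolve()
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            # Refuse entries whose path would resolve outside the target directory.
            destination = (target / member.name).resolve()
            if not destination.is_relative_to(target):
                raise ValueError(f"Unsafe path in archive: {member.name}")
        archive.extractall(target)
    return target
```

A caller would pass something like `Path.home() / ".airbyte" / "connectors"` as `connectors_dir`, matching the location shown in the diagram.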
Core changes:
- New `JavaExecutor` class with automatic JRE download from Azul API
- Updated `get_source()` API to include both Java parameters
- Enhanced executor factory with dual-parameter fallback logic
- Created `examples/run_source_snowflake_java.py` demonstrating real-world usage
Review & Testing Checklist for Human
- [ ] End-to-end testing: Test the Java connector with real Snowflake credentials to verify the complete data reading flow works correctly
- [ ] JRE platform compatibility: Verify JRE download and execution work on different OS/architecture combinations (the current implementation supports Linux/macOS with x64/aarch64)
- [ ] Dual parameter logic: Test various combinations of `use_java` and `use_java_tar` parameters to ensure the interaction logic works as expected and documented
- [ ] Error handling robustness: Test failure scenarios (missing JRE, corrupted TAR files, network issues) to ensure graceful error handling
- [ ] Google Drive dependency: Verify the source-snowflake TAR file download from Google Drive still works and consider hosting alternatives for production use
Recommended test plan: Run the example script with real Snowflake test credentials, then try variations with different parameter combinations and intentional failure scenarios.
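To make the platform-compatibility item in the checklist easier to reason about, the sketch below shows one way the OS/architecture check could look. It is an illustration only: `jre_platform` is a hypothetical name, and the normalized labels (`"macos"`, `"x64"`, and so on) are assumptions rather than the values the PR actually sends to the Azul API.

```python
from __future__ import annotations

import platform

# Assumed normalization tables; only Linux/macOS with x64/aarch64 are supported.
_OS_NAMES = {"linux": "linux", "darwin": "macos"}
_ARCH_NAMES = {
    "x86_64": "x64",
    "amd64": "x64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


def jre_platform(system: str | None = None, machine: str | None = None) -> tuple[str, str]:
    """Return a normalized (os, arch) pair, or raise if the platform is unsupported.

    Arguments default to the current host; pass explicit values to test other platforms.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return _OS_NAMES[system], _ARCH_NAMES[machine]
    except KeyError:
        raise RuntimeError(
            f"JRE auto-download is not supported on {system}/{machine}"
        ) from None
```

Passing explicit `system` and `machine` values lets a reviewer exercise every supported combination from a single machine.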
Diagram
%%{ init : { "theme" : "default" }}%%
graph TD
A["airbyte/sources/util.py<br/>get_source()"]:::minor-edit --> B["airbyte/_executors/util.py<br/>get_connector_executor()"]:::minor-edit
B --> C["airbyte/_executors/java.py<br/>JavaExecutor"]:::major-edit
D["examples/run_source_snowflake_java.py<br/>Demo script"]:::major-edit --> A
C --> E["JRE Download<br/>Azul API"]:::context
C --> F["TAR Extraction<br/>~/.airbyte/connectors/"]:::context
C --> G["Java Process<br/>Connector execution"]:::context
subgraph Legend
L1[Major Edit]:::major-edit
L2[Minor Edit]:::minor-edit
L3[Context/No Edit]:::context
end
classDef major-edit fill:#90EE90
classDef minor-edit fill:#87CEEB
classDef context fill:#FFFFFF
Notes
- Implementation source: Based on PR #719 with enhanced dual-parameter API as requested
- Testing limitation: Successfully tested Java connector execution and TAR extraction, but full end-to-end data reading requires real Snowflake credentials
- Platform support: Currently supports Linux/macOS with x64/aarch64 architectures for JRE auto-download
- Fallback behavior: When Java is not explicitly requested, falls back to Docker if available, otherwise uses Java execution
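
The fallback behavior described in these notes can be sketched as a small pure function. This is a reading of the PR description, not the code in `get_connector_executor()`: the name `choose_executor` and the `docker_available` flag are hypothetical, and three behaviors are assumptions that reviewers should confirm against the real implementation (a TAR path alone implies Java, `use_java=False` with a TAR path is an error, and `use_java=False` without Docker is an error).

```python
from __future__ import annotations

from pathlib import Path


def choose_executor(
    use_java: bool | Path | None,
    use_java_tar: Path | None,
    docker_available: bool,
) -> str:
    """Return "java" or "docker" for the given parameter combination (illustrative only)."""
    if use_java is False:
        # Assumption: explicitly disabling Java conflicts with supplying a TAR.
        if use_java_tar is not None:
            raise ValueError("use_java_tar was given but use_java=False")
        # Assumption: with Java ruled out, Docker is the only option considered here.
        if not docker_available:
            raise RuntimeError("Java is disabled and Docker is not available")
        return "docker"
    if use_java is not None or use_java_tar is not None:
        # use_java is True or a JRE path, or (assumption) a TAR path implies Java.
        return "java"
    # Java not explicitly requested: prefer Docker, otherwise use Java.
    return "docker" if docker_available else "java"
```

Walking the checklist's parameter combinations through a function like this is a cheap way to document the expected result of each one before testing against real connectors.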
Link to Devin run: https://app.devin.ai/sessions/e9a8bcdfcab246f0857ac38f3755296f
Requested by: @aaronsteers
Summary by CodeRabbit
- New Features
  - Run Java-based connectors with automatic JRE download/management and lifecycle handling.
  - New options to select Java execution or provide a connector tarball for sources and destinations (`use_java`, `use_java_tar`).
  - Simplified access to executors via package-level imports.
- Documentation
  - Added an example demonstrating a Java connector workflow (install, spec, check, discover, read) using a downloaded tarball and the new Java options.