
[python] Use urlpath.URL instead of pathlib.Path to avoid losing the URI scheme in pypaimon

Open XiaoHongbo-Hope opened this issue 1 month ago • 0 comments

Purpose

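Fixes the loss of the URI scheme in blob-as-descriptor mode by replacing pathlib.Path with urlpath.URL throughout the codebase. pathlib.Path treats a URI as an ordinary filesystem path and collapses the "//" after the scheme, so the scheme is no longer recognizable. Using urlpath.URL ensures that URI schemes (e.g., oss://, s3://, file://) are preserved when handling blob descriptors and file paths.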
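For illustration, the scheme loss can be reproduced with the standard library alone. This is a minimal sketch, not code from pypaimon: the bucket and file names are made up, and urllib.parse stands in for urlpath.URL only to show that a URL-aware parser keeps the scheme and bucket separate.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical blob location; the bucket and file names are illustrative only.
uri = "oss://my-bucket/warehouse/db.db/table/blob-0.blob"

# pathlib treats the URI as a plain path and collapses the "//" after the scheme.
as_path = str(PurePosixPath(uri))

# A URL-aware parser keeps scheme and authority (bucket) as separate components.
parts = urlsplit(uri)

print(as_path)                      # oss:/my-bucket/warehouse/db.db/table/blob-0.blob
print(parts.scheme, parts.netloc)   # oss my-bucket
print(parts.geturl() == uri)        # True
```

Once the path has been round-tripped through pathlib, the string no longer starts with "oss://", so any code that dispatches on the scheme cannot recognize it.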

Key Changes

  1. Replaced pathlib.Path with urlpath.URL: Unified path handling to preserve URI schemes across all filesystem operations.

  2. Dependency: Added urlpath==2.0.0 as a required dependency.

Impact

  • FileIO: All methods now accept URL type instead of Path
  • FileSystemCatalog: Returns URL objects for database and table paths
  • DataFileMeta.set_file_path: Accepts URL type
  • SchemaManager: Uses URL for table paths
  • UriReader.get_file_path: Returns URL instead of Path
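As a rough sketch of what scheme-preserving path handling means for the components listed above, the helper below appends child segments (for example a database and table name) to a warehouse URI without touching the scheme or bucket. The function name join_uri and the example locations are hypothetical, and the sketch uses the standard library rather than urlpath; it is not the pypaimon implementation.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def join_uri(base: str, *children: str) -> str:
    """Append path segments to a URI while keeping its scheme and authority.

    Hypothetical helper for illustration; not part of pypaimon or urlpath.
    """
    parts = urlsplit(base)
    # Join only the path component; scheme and netloc are carried over untouched.
    new_path = posixpath.join(parts.path or "/", *children)
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


# Illustrative warehouse locations (bucket names are made up).
table_on_s3 = join_uri("s3://bucket/warehouse", "db.db", "tbl")
table_on_oss = join_uri("oss://bucket/warehouse/", "db.db", "tbl")

print(table_on_s3)   # s3://bucket/warehouse/db.db/tbl
print(table_on_oss)  # oss://bucket/warehouse/db.db/tbl
```

Joining on the parsed path component, rather than on the whole string via pathlib, is what keeps "s3://" and "oss://" intact in the result.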

Tests

  • test_blob_write_read_end_to_end_with_descriptor: Verifies URI scheme preservation in blob descriptors
  • Sample script oss_blob_as_descriptor.py: Demonstrates OSS blob-as-descriptor functionality with scheme preservation

XiaoHongbo-Hope, Nov 14 '25 02:11