feat: Implement ZEP 8 URL syntax support for zarr-python
Continuation of @jhamman's #3369, addressing my review comments. Claude summary of the changes on top of Joe's PR:
- Format Propagation ✅
  - Zarr format segments (zarr2:, zarr3:) now propagate from URLs to array/group creation
  - Modified resolve_url_with_path() to return (Store, str, ZarrFormat | None)
  - Updated all API functions (create, open, open_array, open_group) to use URL format when user doesn't specify
  - 15 new tests for format propagation with comprehensive on-disk verification
- Nested Adapter Chains ✅
  - Implemented recursive URL resolution in ZipAdapter to support nested adapters
  - Added _create_nested_zip_store() method that handles nested ZEP 8 URLs
  - Enables ZEP 8 spec examples like file:outer.zip|zip:inner.zip|zip:data.zarr
  - 8 comprehensive tests covering arrays, groups, edge cases, and complex hierarchies
- Exception Handling Cleanup ✅
  - Removed all broad except Exception: handlers from _zep8.py
  - Simplified control flow by removing redundant try-except blocks
  - Only specific KeyError exceptions caught where appropriate
  - Parser exceptions now propagate naturally for easier debugging
- URL Logic Refactoring
  - Fixed s3+https scheme handling
  - Improved URL validation logic
  - Better separation of concerns between parsing and resolution
- Storage Options Validation
  - Enhanced storage_options handling and validation
  - Better error messages for invalid configurations
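
To illustrate what the nested adapter chain file:outer.zip|zip:inner.zip|zip:data.zarr amounts to, here is a minimal standard-library sketch. It does not use zarr-python or the ZipAdapter from this PR; `make_zip` and `read_nested` are hypothetical helpers, and it assumes each zip: segment names a path inside the archive produced by the preceding segment.

```python
import io
import zipfile


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory zip archive from a {name: payload} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def read_nested(data: bytes, chain: list[str]) -> bytes:
    """Follow a chain of member names through nested zip archives.

    Each entry names a member of the current archive; every entry except
    the last must itself be a zip archive.
    """
    if not chain:
        return data
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # ZipFile.read raises KeyError for a missing member, which is
        # deliberately left to propagate rather than being swallowed.
        inner = zf.read(chain[0])
    return read_nested(inner, chain[1:])


# inner.zip holds a Zarr v3 group under the data.zarr/ prefix.
payload = b'{"zarr_format": 3, "node_type": "group"}'
inner_zip = make_zip({"data.zarr/zarr.json": payload})
outer_zip = make_zip({"inner.zip": inner_zip})

# Roughly mirrors file:outer.zip|zip:inner.zip|zip:data.zarr plus one key lookup.
metadata = read_nested(outer_zip, ["inner.zip", "data.zarr/zarr.json"])
```

The recursion peels off one archive layer per call, which is the same shape as resolving one zip: segment at a time.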
Test Coverage: 167 tests passing (1 skipped), up from ~150 tests
Missing ZEP 8 Features Analysis
Based on the ZEP 8 specification, here are the adapter schemes defined vs. implemented:
✅ Currently Implemented (9 adapters)
- file: - FileSystemAdapter
- memory: - MemoryAdapter
- https:, http: - RemoteAdapter
- s3:, s3+http:, s3+https: - S3Adapter (via RemoteAdapter)
- gs: - GCSAdapter (via RemoteAdapter)
- zip: - ZipAdapter
- log: - LoggingAdapter (custom, not in spec)
- zarr2:, zarr3:, zarr: - Format segments (handled by resolution layer)
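
As a rough illustration of how a resolution layer might separate format segments from adapter segments and fall back to the URL format when the user gives none, here is a self-contained sketch. `split_pipeline`, `effective_format`, and the local `ZarrFormat` alias are hypothetical stand-ins, not the functions in _zep8.py, and the trailing zarr3: segment on the example URL is an assumption.

```python
from typing import Literal, Optional

# Local stand-in for the format type; not imported from zarr-python.
ZarrFormat = Literal[2, 3]

# Format segments from the list above; "zarr" carries no explicit version.
FORMAT_SEGMENTS: dict[str, Optional[ZarrFormat]] = {"zarr2": 2, "zarr3": 3, "zarr": None}


def split_pipeline(url: str) -> tuple[list[tuple[str, str]], Optional[ZarrFormat]]:
    """Split a pipeline URL into (scheme, path) adapter segments and a format hint."""
    adapters: list[tuple[str, str]] = []
    zarr_format: Optional[ZarrFormat] = None
    for segment in url.split("|"):
        scheme, sep, path = segment.partition(":")
        if not sep:
            raise ValueError(f"segment {segment!r} has no scheme")
        if scheme in FORMAT_SEGMENTS:
            # Format segments configure how the store is read; they are not adapters.
            zarr_format = FORMAT_SEGMENTS[scheme]
        else:
            adapters.append((scheme, path))
    return adapters, zarr_format


def effective_format(
    user_format: Optional[ZarrFormat], url_format: Optional[ZarrFormat]
) -> Optional[ZarrFormat]:
    """Prefer an explicit user choice; otherwise use the format named in the URL."""
    return user_format if user_format is not None else url_format


adapters, url_format = split_pipeline("file:outer.zip|zip:inner.zip|zip:data.zarr|zarr3:")
```

Returning the format separately from the adapter list keeps parsing independent of store construction, so a caller can apply `effective_format` only at the point where an array or group is created.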
❌ In Spec but Not Implemented (11 adapters)
Storage/Database Adapters:
1. ocdbt: - OCDBT format (versioned KV store)
2. icechunk: - Icechunk format (versioned Zarr store)

Compression Adapters:
3. gzip: - Transparent gzip decompression
4. zstd: - Transparent zstd decompression

Data Format Adapters:
5. n5: - N5 format support
6. tiff:, jpeg:, png:, bmp:, avif:, webp: - Image format adapters
7. neuroglancer-precomputed: - Neuroglancer format
8. json: - JSON pointer access

Utility Adapters:
9. byte-range:start-end - Byte range extraction
10. ..: - Parent directory traversal (for relative URLs)
Other Missing Features:
- Relative URL pipeline syntax - Spec lines 489-543 (explicitly noted as not supported in zarr-python implementation notes)
- Format auto-detection - Spec lines 420-443 (MAY support, optional feature)
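
For reference, format auto-detection could be as small as checking which metadata keys exist. The sketch below is an assumption about one possible approach, not what the spec or this PR prescribes: `detect_zarr_format` is a hypothetical helper over a plain mapping, and checking v3 before v2 is an arbitrary choice. It relies only on the fact that Zarr v3 stores node metadata in zarr.json while v2 uses .zarray and .zgroup.

```python
from collections.abc import Mapping
from typing import Optional


def detect_zarr_format(store: Mapping[str, bytes], prefix: str = "") -> Optional[int]:
    """Guess the Zarr format of the node at `prefix` from its metadata keys."""
    prefix = prefix.strip("/")
    base = f"{prefix}/" if prefix else ""
    if f"{base}zarr.json" in store:
        return 3  # Zarr v3 keeps array and group metadata in zarr.json
    if f"{base}.zarray" in store or f"{base}.zgroup" in store:
        return 2  # Zarr v2 uses .zarray for arrays and .zgroup for groups
    return None  # no recognizable metadata at this prefix
```

Returning None for an unrecognized node lets the caller decide whether to raise or to fall back to a default format.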
Note: I have moved the URL pipeline proposal over from the ZEP repo to a separate repository:
https://github.com/jbms/url-pipeline