Theo Li

Results: 15 issues by Theo Li

I ran into a problem:

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar('T')

class A(Generic[T], ABC):
    @abstractmethod
    def __getitem__(self, index: int) -> T: ...
```
...

bug
cat: generics
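The issue excerpt above is truncated, so the exact failure is not shown. A minimal, self-contained sketch of how such a generic abstract class is typically subclassed (the `IntSequence` class and its data are made up for illustration):

```python
# Illustrative only: the issue text is cut off, so this only shows the
# expected usage pattern of a Generic ABC with an abstract __getitem__.
from abc import ABC, abstractmethod
from typing import Generic, TypeVar

T = TypeVar('T')

class A(Generic[T], ABC):
    @abstractmethod
    def __getitem__(self, index: int) -> T: ...

class IntSequence(A[int]):
    """Concrete subclass binding T to int."""
    def __init__(self, data: list[int]) -> None:
        self._data = data

    def __getitem__(self, index: int) -> int:
        return self._data[index]

seq = IntSequence([1, 2, 3])
print(seq[0])  # 1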

Currently megfile treats a symlink on the file system as a regular file, and there is no symlink concept on s3. Expected change: on the file system, symlink behavior should match the standard library; on s3, a symlink should be marked with a special key in the object's headers, and its behavior should mimic file-system symlinks.

enhancement
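A rough sketch of the proposed s3 behavior, not megfile's actual implementation; the metadata key name and helper functions are hypothetical:

```python
# Sketch: mark an s3 "symlink" via a special metadata key on the object
# (key name is hypothetical) and resolve it on read.
import boto3

s3 = boto3.client("s3")
SYMLINK_KEY = "symlink-to"  # hypothetical header/metadata key

def s3_symlink(src_path: str, dst_bucket: str, dst_key: str) -> None:
    """Create an empty object whose metadata records the symlink target."""
    s3.put_object(Bucket=dst_bucket, Key=dst_key, Body=b"",
                  Metadata={SYMLINK_KEY: src_path})

def s3_readlink(bucket: str, key: str) -> str | None:
    """Return the symlink target, or None if the object is not a symlink."""
    head = s3.head_object(Bucket=bucket, Key=key)
    return head.get("Metadata", {}).get(SYMLINK_KEY)
```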

After buffered=True became the default in s3_buffered_open, random-access performance dropped significantly; we need to find a way to win it back.

optimization
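The access pattern affected is many small reads at random offsets; with buffering on, each seek may discard the buffer and refetch a large block. A sketch of such a workload (the bucket, object, and object size are made up):

```python
# Random small reads over a large s3 object; path and size are hypothetical.
import random
from megfile import smart_open

path = "s3://some-bucket/large-file.bin"  # hypothetical object
with smart_open(path, "rb") as f:
    for _ in range(100):
        offset = random.randrange(0, 10 * 1024 ** 3)  # assume a ~10 GiB object
        f.seek(offset)
        chunk = f.read(4096)  # small read at a random offset
```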

Control the behavior when an existing file is encountered.

enhancement
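One possible shape for this enhancement, purely illustrative; the enum, parameter name, and copy helper below are hypothetical and not megfile API:

```python
# Sketch of an "on existing file" policy for a copy-like operation.
import enum
import os
import shutil

class ExistPolicy(enum.Enum):
    OVERWRITE = "overwrite"  # replace the destination
    SKIP = "skip"            # silently keep the destination
    RAISE = "raise"          # fail loudly

def copy(src: str, dst: str, on_exist: ExistPolicy = ExistPolicy.RAISE) -> None:
    if os.path.exists(dst):
        if on_exist is ExistPolicy.SKIP:
            return
        if on_exist is ExistPolicy.RAISE:
            raise FileExistsError(dst)
    shutil.copyfile(src, dst)
```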

Protocols
- [x] webdav
- [x] ssh / sftp
- [ ] ftp
- [x] hdfs / webhdfs
- [ ] git

Cloud drives
- [ ] azure
- [ ] ...

enhancement

For example, build one using mtime + size.

enhancement
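A minimal sketch of the mtime + size idea: a cheap fingerprint for deciding whether two files are likely identical without hashing their contents (function names are made up):

```python
# Cheap file fingerprint: modification time plus size.
import os

def quick_fingerprint(path: str) -> tuple[int, int]:
    st = os.stat(path)
    return (int(st.st_mtime), st.st_size)

def probably_same(a: str, b: str) -> bool:
    """True if the two files share mtime and size (a heuristic, not a guarantee)."""
    return quick_fingerprint(a) == quick_fingerprint(b)
```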

**Bug Description** My setup consists of a server in a DMZ that handles authentication with OAuth, and sets a cookie. Once this cookie is found, requests are proxied to the...

wontfix

### What happened + What you expected to happen

`ray.data.Dataset.write_parquet(**arrow_parquet_args)`: `arrow_parquet_args` does not work anymore, since it is not passed to `pq.ParquetWriter` here: https://github.com/ray-project/ray/blob/d9e795c17a6d4fe61fa57f691c9bcc60dcace72e/python/ray/data/datasource/parquet_datasink.py#L75. The same applies to `arrow_parquet_args_fn`.

### Versions...

bug
triage
data
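A minimal reproduction of the kind of call the issue refers to; the dataset and output path are made up, and `compression` stands in for any keyword that should be forwarded to `pyarrow.parquet.ParquetWriter`:

```python
# Extra keyword arguments to write_parquet are documented as **arrow_parquet_args
# and are expected to reach pyarrow.parquet.ParquetWriter; per this issue they
# are dropped in parquet_datasink.py.
import ray

ds = ray.data.range(1000)
ds.write_parquet(
    "local:///tmp/out",   # hypothetical output path
    compression="zstd",   # an arrow_parquet_args kwarg
)
```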

https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html AWS S3's multipart upload allows at most 10,000 parts, and several domestic cloud vendors follow the same convention, so in the worst case megfile can only write files up to about 80GB. Add a new DEFAULT_MIN_BLOCK_SIZE configuration option to separately control the minimum part size used when uploading files; to avoid a major version bump, the previous parameter names are kept as much as possible.
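The arithmetic behind the ceiling, assuming the current default block size is 8 MiB (an assumption here), plus a sketch of how a minimum block size could be derived for a target file size:

```python
# 10,000-part limit times the assumed 8 MiB default block size ~= the "80GB" worst case.
MAX_PARTS = 10_000                      # S3 multipart upload part limit
DEFAULT_BLOCK_SIZE = 8 * 1024 * 1024    # assumed current default: 8 MiB

max_file_size = MAX_PARTS * DEFAULT_BLOCK_SIZE
print(max_file_size / 1024 ** 3)        # ~78 GiB

def min_block_size_for(target_size: int, max_parts: int = MAX_PARTS) -> int:
    """Smallest part size (bytes) that fits target_size into max_parts parts."""
    return -(-target_size // max_parts)  # ceiling division

# e.g. a 1 TiB upload needs parts of at least ~105 MiB
print(min_block_size_for(1024 ** 4) / 1024 ** 2)
```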

Sample code

```python
import os

from s3fs import S3FileSystem

fs = S3FileSystem()
base_path = "s3://moonshot-train-data/test_data/test_s3fs/"

def join_path(*args):
    return os.path.join(base_path, *args)

def test_exists(path):
    print(path, fs.exists(os.path.join(path)))

fs.touch(join_path("a/b/c.txt"))
print("=== before glob ===")
test_exists(join_path("a/b/c.txt"))
```
...