feat: support for matching file paths against Unix shell style patterns (glob).

Open RinChanNOWWW opened this issue 2 years ago • 2 comments

I hope OpenDAL can support a glob operation like the one provided by the glob crate.

Both blocking and non-blocking methods are needed.

RinChanNOWWW, Jan 29 '23 08:01

I'm afraid it is not possible to implement a perfect glob API. By perfect I mean that only the objects matching the glob are transmitted between OpenDAL and the underlying storage service.

Given that, and since OpenDAL offers BlockingObjectLister and ObjectLister, I suggest listing with a filter.

use futures::TryStreamExt;
use glob::Pattern;
use opendal::{services::Fs, Operator};

#[tokio::main]
async fn main() {
    let mut op_builder = Fs::default();
    op_builder.root("/tmp/opendal/");
    op_builder.atomic_write_dir("/tmp/opendal/");
    let op = Operator::create(op_builder).expect("must succeed").finish();

    for i in 0..100 {
        let path = format!("valid/dir/test-{}.txt", i);
        op.object(&path).create().await.unwrap();
        let junk = format!("invalid/dir/junk-{}.txt", i);
        op.object(&junk).create().await.unwrap();
    }

    let gm = Pattern::new("valid/dir/test-*.txt").expect("should be valid");

    let mut lister = op.object("/").scan().await.unwrap();

    // Do not write:
    //     while let Some(obj) = lister.try_next().await.unwrap().filter(...)
    // Option::filter turns the first non-matching entry into None,
    // which ends the stream early.
    while let Some(obj) = lister.try_next().await.unwrap() {
        if gm.matches(obj.path()) {
            println!("{} is valid", obj.path());
        }
    }
}

But, really, it's a little too verbose...

ClSlaid, Feb 13 '23 16:02
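
The early-ending pitfall mentioned in the loop comment above can be reproduced with the standard library alone. This is a minimal sketch that uses a plain iterator of numbers in place of the lister; both function names are made up for illustration and are not part of OpenDAL.

```rust
/// Buggy loop shape: `Option::filter` turns the first non-matching item
/// into `None`, which ends the `while let` loop.
fn collect_with_option_filter(items: &[u32]) -> Vec<u32> {
    let mut it = items.iter().copied();
    let mut out = Vec::new();
    while let Some(n) = it.next().filter(|n| n % 2 == 0) {
        out.push(n);
    }
    out
}

/// Working loop shape: the check lives inside the loop body, so a
/// non-matching item is skipped instead of stopping the loop.
fn collect_with_inner_check(items: &[u32]) -> Vec<u32> {
    let mut it = items.iter().copied();
    let mut out = Vec::new();
    while let Some(n) = it.next() {
        if n % 2 == 0 {
            out.push(n);
        }
    }
    out
}

fn main() {
    let items = [2, 4, 5, 6, 8];
    // Stops at 5, so 6 and 8 are never seen.
    println!("{:?}", collect_with_option_filter(&items));
    // Skips 5 and keeps going.
    println!("{:?}", collect_with_inner_check(&items));
}
```

With the input above, the first function returns only the items before the first odd number, while the second returns every even item.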

Maybe we can provide an API, scan_glob().

Xuanwo, Feb 13 '23 17:02
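
One way to picture such a helper is as a thin wrapper that filters whatever a lister yields. The sketch below is only an illustration under stated assumptions: `GlobLister`, `scan_glob`, and `wildcard_match` are hypothetical names, the input is a plain iterator of path strings rather than an OpenDAL lister, and the matcher is a simplified stand-in for a real glob library (it supports only `*` and `?`, and `*` may match `/`).

```rust
/// Simplified wildcard matcher (an assumption, not the glob crate):
/// `*` matches any run of characters, `?` matches exactly one character.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Pattern index of the last `*` and the text index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical wrapper: yields only the paths that match the pattern.
struct GlobLister<I> {
    inner: I,
    pattern: String,
}

impl<I: Iterator<Item = String>> Iterator for GlobLister<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Skip non-matching entries; return `None` only when `inner` is exhausted.
        loop {
            let path = self.inner.next()?;
            if wildcard_match(&self.pattern, &path) {
                return Some(path);
            }
        }
    }
}

/// Hypothetical entry point mirroring the suggested name.
fn scan_glob<I: Iterator<Item = String>>(inner: I, pattern: &str) -> GlobLister<I> {
    GlobLister { inner, pattern: pattern.to_string() }
}

fn main() {
    let paths = (0..3).flat_map(|i| {
        vec![format!("valid/dir/test-{}.txt", i), format!("invalid/dir/junk-{}.txt", i)]
    });
    for path in scan_glob(paths, "valid/dir/test-*.txt") {
        println!("{} is valid", path);
    }
}
```

Every entry is still listed and checked one by one; the wrapper only hides the filtering loop from the caller.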

It would be interesting to provide an op.glob("media/**/*.jpg") so that users can write:

let mut it = op.glob("media/**/*.jpg").await?;

while let Some(entry) = it.next().await? {
   do_something(&entry);
}

Xuanwo, Apr 12 '23 03:04

> It would be interesting to provide an op.glob("media/**/*.jpg") so that users can write:
>
> let mut it = op.glob("media/**/*.jpg").await?;
>
> while let Some(entry) = it.next().await? {
>    do_something(&entry);
> }

This seems interesting, I will have a look then xD.

xyjixyjixyji, Apr 13 '23 13:04

> This seems interesting, I will have a look then xD.

I cannot think of a better way than wrapping list with a simple filter, just as https://github.com/apache/incubator-opendal/issues/1251#issuecomment-1428267977 said.

xyjixyjixyji, Apr 13 '23 14:04

> This seems interesting, I will have a look then xD.
>
> I cannot think of a better way than wrapping list with a simple filter, just as https://github.com/apache/incubator-opendal/issues/1251#issuecomment-1428267977 said.

You always have to check one by one.😣

suyanhanx, Apr 13 '23 15:04

> You always have to check one by one.😣

Yeah, the sequential scan is unavoidable... This would be a good-to-have feature, but it is not that primitive lol. I'm not sure whether OpenDAL should support such higher-level operations.

xyjixyjixyji, Apr 13 '23 15:04
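
Even if every listed entry has to be checked, the set of entries being scanned might be narrowed by listing only under the literal directory prefix of the pattern. Nobody in this thread proposes this; it is an assumption added for illustration, and `literal_dir_prefix` is a hypothetical helper name. A minimal sketch using only the standard library:

```rust
/// Returns the longest leading part of `pattern` that contains no glob
/// metacharacters (`*`, `?`, `[`), cut back to end at a `/` boundary.
/// Returns an empty string when no such directory prefix exists.
fn literal_dir_prefix(pattern: &str) -> &str {
    // Byte index of the first metacharacter, or the pattern length if none.
    let first_meta = pattern
        .find(|c: char| matches!(c, '*' | '?' | '['))
        .unwrap_or(pattern.len());
    // Cut back to the last `/` so the prefix names a whole directory.
    match pattern[..first_meta].rfind('/') {
        Some(slash) => &pattern[..=slash],
        None => "",
    }
}

fn main() {
    for pattern in ["media/**/*.jpg", "valid/dir/test-*.txt", "*.txt"] {
        println!("{:?} -> list under {:?}", pattern, literal_dir_prefix(pattern));
    }
}
```

A caller could start the scan at the returned prefix instead of "/" and still apply the full pattern to each entry, since the prefix alone does not decide a match.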

> I'm not sure whether OpenDAL should support such higher-level operations.

OpenDAL is open to adding features that align with our vision. And yes, it would be a good-to-have feature, so there is no rush to implement it.

Xuanwo, Apr 13 '23 16:04