
Allow utilizing multiple buckets as backend storage endpoints

Open cayolblake opened this issue 3 years ago • 11 comments

What would you like to be added:

Allow utilizing multiple buckets as backend storage.

#=============

Why is this needed:

  1. Work around the rate limits that many providers apply at the bucket level, which would massively improve I/O performance on juicefs mount points.
  2. Enhance high availability by allowing a single block to be replicated to one or more buckets, which increases uptime and reliability.
  3. Allow using multiple providers, which increases the reliability and credibility of the service for end users.

#=============

cayolblake avatar Apr 23 '21 20:04 cayolblake

According to this S3 document,

your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.

if you create 10 prefixes in an Amazon S3 bucket to parallelize reads, you could scale your read performance to 55,000 read requests per second. Similarly, you can scale write operations by writing to multiple prefixes.

Reading and writing can be scaled by creating multiple prefixes in a bucket.
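As a rough illustration of the arithmetic in the quoted S3 guidance, the sketch below multiplies the per-prefix rates by the number of prefixes and spreads object keys across prefixes by hash. The helper names `aggregate_limits` and `prefixed_key` are hypothetical and are not part of juicefs or any S3 SDK.

```python
import hashlib

# Per-prefix request rates taken from the S3 guidance quoted above.
PUT_PER_PREFIX = 3500
GET_PER_PREFIX = 5500


def aggregate_limits(num_prefixes: int) -> dict:
    """Return the combined request rates across num_prefixes prefixes."""
    return {
        "put": PUT_PER_PREFIX * num_prefixes,
        "get": GET_PER_PREFIX * num_prefixes,
    }


def prefixed_key(key: str, num_prefixes: int) -> str:
    """Hypothetical helper: prepend a stable, hash-derived prefix to a key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_prefixes
    return f"{shard:02d}/{key}"


limits = aggregate_limits(10)
print(limits["get"])  # 55000, matching the figure in the quote
```

Because the prefix is derived from the key itself, the same key always maps to the same prefix, so reads can find what writes stored.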

suzaku avatar Apr 23 '21 22:04 suzaku

Well, assuming that's for AWS S3, it's not the case for the majority of other providers.

The other points in the request (replication and multi-provider support) also still apply.

cayolblake avatar Apr 24 '21 01:04 cayolblake

Sounds good. We could have RAID 1 (replicate across multiple buckets/providers) and RAID 0 (hash to multiple buckets).

The difficulty is defining an interface that's backward compatible and easy to understand, both in the user interface and in the implementation (this is easy).
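To make the two modes concrete, here is a minimal sketch that uses plain dicts as stand-ins for buckets. `StripedStore` and `MirroredStore` are hypothetical names, and the CRC32-modulo placement is an assumption chosen for illustration, not necessarily what juicefs does.

```python
import zlib


class StripedStore:
    """RAID 0-like: each key lives in exactly one bucket, chosen by hash."""

    def __init__(self, buckets):
        self.buckets = list(buckets)

    def _pick(self, key: str) -> dict:
        # Assumed placement rule: CRC32 of the key modulo the bucket count.
        index = zlib.crc32(key.encode("utf-8")) % len(self.buckets)
        return self.buckets[index]

    def put(self, key: str, data: bytes) -> None:
        self._pick(key)[key] = data

    def get(self, key: str) -> bytes:
        return self._pick(key)[key]


class MirroredStore:
    """RAID 1-like: every key is written to all buckets."""

    def __init__(self, buckets):
        self.buckets = list(buckets)

    def put(self, key: str, data: bytes) -> None:
        for bucket in self.buckets:
            bucket[key] = data

    def get(self, key: str) -> bytes:
        # Read from the first bucket that still holds the key.
        for bucket in self.buckets:
            if key in bucket:
                return bucket[key]
        raise KeyError(key)
```

Striping spreads request load across buckets but keeps a single copy of each block, while mirroring multiplies writes in exchange for surviving the loss of a bucket.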

davies avatar Apr 25 '21 03:04 davies

Yes totally agree.

RAID 0/1-like modes would open up endless possibilities and mark a new era of software-defined file systems for juicefs.

It would also let people build setups such as layering a RAID 1 configuration over a RAID 0 one, or vice versa, to solve real-life problems in a smart, easy way.

cayolblake avatar Apr 25 '21 15:04 cayolblake

RAID 0 is implemented in #349

davies avatar Apr 26 '21 16:04 davies

RAID 0 is implemented in #349

Great work with this. One question: In the case of running RAID 0 against local minio instances, is there a way to replace a minio server? Or would we be "stuck" with the DNS/IP addresses provided to the format command?

jkiebzak avatar Oct 26 '21 14:10 jkiebzak

We can use the format command to update the DNS/IP, as long as the number and order of the DNS/IP entries are the same as before.
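A small sketch of why the count and order would matter, assuming objects are placed by a hash of the key modulo the number of endpoints (an assumption for illustration; the actual juicefs placement rule is not stated in this thread). The endpoint names and the helpers `shard_index` and `moved_keys` are hypothetical.

```python
import zlib


def shard_index(key: str, num_endpoints: int) -> int:
    """Assumed placement rule: CRC32 of the key modulo the endpoint count."""
    return zlib.crc32(key.encode("utf-8")) % num_endpoints


def moved_keys(keys, old_endpoints, new_endpoints):
    """Return keys whose data would be looked up on a different server.

    new_endpoints[i] is treated as the replacement address for the server
    that old_endpoints[i] pointed at, which is why order matters.
    """
    moved = []
    for key in keys:
        old_slot = shard_index(key, len(old_endpoints))
        new_slot = shard_index(key, len(new_endpoints))
        if old_slot != new_slot:
            moved.append(key)
    return moved


keys = [f"chunk-{i}" for i in range(100)]
old = ["minio-a:9000", "minio-b:9000", "minio-c:9000"]

# Same number and order, only the addresses change: no key moves.
renamed = ["10.0.0.1:9000", "10.0.0.2:9000", "10.0.0.3:9000"]
print(moved_keys(keys, old, renamed))  # []
```

Under this assumption, adding or dropping an endpoint changes the modulus, and reordering the list points an existing slot at a different server, so in both cases existing objects would be looked up in the wrong place.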

davies avatar Oct 26 '21 14:10 davies

We can use the format command to update the DNS/IP, as long as the number and order of the DNS/IP entries are the same as before.

Does that mean that if we update the original single DNS/IP to a load-balancer DNS/IP that points at the same bucket cluster, it will work?

0akarma avatar Sep 24 '22 03:09 0akarma

Does that mean that if we update the original single DNS/IP to a load-balancer DNS/IP that points at the same bucket cluster, it will work?

Yes.

davies avatar Sep 24 '22 12:09 davies

Is it possible to use multiple endpoints, e.g. different providers, in the RAID 0 setup?

eleaner avatar Nov 16 '22 22:11 eleaner

@davies if there's a way to achieve that it would be great.

akiirax avatar Dec 01 '23 16:12 akiirax