
No option to unset http_s3_region for S3 interface utilization

Open fbalak opened this issue 4 years ago • 2 comments

Some S3-compatible interfaces don't have a region configured and return an error when a region is set (e.g. RGW without a configured region).

There should be a way to omit the http_s3_region parameter in S3 communication when using the http ioengine. Currently, if the user doesn't provide a region, us-east-1 is used by default.
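For reference, this is where the option sits in a job file (host and keys below are placeholder values, not from the original report):

```ini
[global]
ioengine=http
http_mode=s3
http_host=s3.example.com
http_s3_key=SECRETKEY
http_s3_keyid=KEYID
# Defaults to us-east-1 when omitted; there is currently no way
# to leave the region unset entirely.
http_s3_region=us-east-1
```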

fbalak avatar Dec 16 '20 13:12 fbalak

To look into this, I used ceph nano to provide a local Ceph RGW S3-compatible endpoint for testing. Such a setup doesn't have regions.

When I briefly tweaked the examples/http-s3.fio job and tried to run it against my local endpoint, I ended up with an error (as this issue would lead one to expect).

First of all, I tried to check whether the region is really causing the problem as suggested. Using http_verbose=1, I noticed that the error fio fails on is HTTP/1.1 404 Not Found. Checking the fio source code, I realized that the region is used only for the HMAC computation and the Authorization HTTP header. Moreover, when I tried to patch the region code out, fio failed with a 400 error instead, suggesting that the region is actually needed (RGW rejects a request without a region as invalid). That said, it's possible that my attempt was too disruptive and the region could be omitted in a cleaner way.
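To illustrate why the region cannot simply be dropped: in AWS Signature Version 4 (which fio's http engine uses for S3), the region is one of the inputs to the signing-key derivation, so every signed request carries *some* region string. A minimal sketch of the standard derivation (this is generic SigV4, not fio's actual C code):

```python
import hashlib
import hmac

def sigv4_signing_key(secret_key: str, date: str, region: str,
                      service: str = "s3") -> bytes:
    """Derive the AWS Signature Version 4 signing key.

    The region is one link in the HMAC chain, which is why a
    SigV4-signed request always embeds some region value, even when
    the endpoint (e.g. a regionless RGW) does not care about regions.
    """
    def _hmac(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")
```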

When I instead first created the bucket specified in the fio job, the job finished just fine.

Conclusion

I believe that it ~doesn't make sense~ may not be necessary to provide a way to unset the default region, since even a regionless Ceph RGW ignores it and works just fine. Moreover, it's possible to configure regions with RGW as well.

That said, there is an opportunity to extend the documentation to make it clear that a bucket specified in a fio job has to already exist (assuming I'm correct here).

Maybe it would make sense for fio to create the bucket if it doesn't already exist. But I assume that in most cases users would prefer to create the bucket themselves, because that gives them full control over its configuration.

Example and details

Simple fio job:

$ cat http-s3.simple.fio
[global]
name=rgw-s3
ioengine=http
direct=1
filename=/fio/object
https=off
# http_verbose=1
http_mode=s3
http_host=192.168.122.136:8000
http_s3_key=18r1C5dqbwDfaQFjlAk7a7RnJAfaoCcnqqBBR3H8
http_s3_keyid=MIR3K9565ZFK3PIC5ZNG

[write-verify]
rw=write
size=64k
verify=sha256

Note that filename=/fio/object means that fio will write keys with the object prefix into the bucket fio. This bucket needs to exist.
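The mapping from filename to bucket and key prefix can be illustrated with a small helper (a hypothetical function mirroring the convention described above, not code from fio itself):

```python
def s3_bucket_and_prefix(filename: str) -> tuple[str, str]:
    # The first path component is the bucket; the remainder becomes
    # the object key prefix. Hypothetical helper for illustration,
    # not fio's actual implementation.
    parts = filename.strip("/").split("/", 1)
    return parts[0], (parts[1] if len(parts) > 1 else "")
```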

$ fio -f http-s3.simple.fio
write-verify: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=http, iodepth=1
fio-3.21
Starting 1 process
3;fio-3.21;write-verify;0;0;64;1103;275;58;0;0;0.000000;0.000000;1954;6506;3521.426937;1213.620310;1.000000%=1957;5.000000%=1957;10.000000%=2113;20.000000%=2506;30.000000%=2605;40.000000%=3129;50.000000%=3325;60.000000%=3784;70.000000%=3948;80.000000%=4079;90.000000%=5013;95.000000%=6520;99.000000%=6520;99.500000%=6520;99.900000%=6520;99.950000%=6520;99.990000%=6520;0%=0;0%=0;0%=0;1954;6506;3521.899438;1213.632024;0;0;0.000000%;0.000000;0.000000;64;130;32;489;0;0;0.000000;0.000000;23483;40419;30437.507063;3847.894179;1.000000%=23461;5.000000%=23461;10.000000%=26607;20.000000%=28442;30.000000%=28966;40.000000%=29491;50.000000%=29753;60.000000%=29753;70.000000%=31326;80.000000%=33816;90.000000%=33816;95.000000%=40632;99.000000%=40632;99.500000%=40632;99.900000%=40632;99.950000%=40632;99.990000%=40632;0%=0;0%=0;0%=0;23559;40515;30539.464563;3856.068193;128;128;98.461538%;128.000000;0.000000;2.747253%;1.098901%;79;0;26;100.0%;0.0%;0.0%;0.0%;0.0%;0.0%;0.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;3.12%;34.38%;12.50%;0.00%;50.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%

write-verify: (groupid=0, jobs=1): err= 0: pid=525533: Sat May 15 22:27:05 2021
  read: IOPS=275, BW=1103KiB/s (1130kB/s)(64.0KiB/58msec)
    clat (usec): min=1954, max=6506, avg=3521.43, stdev=1213.62
     lat (usec): min=1954, max=6506, avg=3521.90, stdev=1213.63
    clat percentiles (usec):
     |  1.00th=[ 1958],  5.00th=[ 1958], 10.00th=[ 2114], 20.00th=[ 2507],
     | 30.00th=[ 2606], 40.00th=[ 3130], 50.00th=[ 3326], 60.00th=[ 3785],
     | 70.00th=[ 3949], 80.00th=[ 4080], 90.00th=[ 5014], 95.00th=[ 6521],
     | 99.00th=[ 6521], 99.50th=[ 6521], 99.90th=[ 6521], 99.95th=[ 6521],
     | 99.99th=[ 6521]
  write: IOPS=32, BW=131KiB/s (134kB/s)(64.0KiB/489msec); 0 zone resets
    clat (usec): min=23483, max=40419, avg=30437.51, stdev=3847.89
     lat (usec): min=23559, max=40515, avg=30539.46, stdev=3856.07
    clat percentiles (usec):
     |  1.00th=[23462],  5.00th=[23462], 10.00th=[26608], 20.00th=[28443],
     | 30.00th=[28967], 40.00th=[29492], 50.00th=[29754], 60.00th=[29754],
     | 70.00th=[31327], 80.00th=[33817], 90.00th=[33817], 95.00th=[40633],
     | 99.00th=[40633], 99.50th=[40633], 99.90th=[40633], 99.95th=[40633],
     | 99.99th=[40633]
   bw (  KiB/s): min=  128, max=  128, per=98.46%, avg=128.00, stdev= 0.00, samples=1
   iops        : min=   32, max=   32, avg=32.00, stdev= 0.00, samples=1
  lat (msec)   : 2=3.12%, 4=34.38%, 10=12.50%, 50=50.00%
  cpu          : usr=2.75%, sys=1.10%, ctx=79, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=16,16,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=1103KiB/s (1130kB/s), 1103KiB/s-1103KiB/s (1130kB/s-1130kB/s), io=64.0KiB (65.5kB), run=58-58msec
  WRITE: bw=131KiB/s (134kB/s), 131KiB/s-131KiB/s (134kB/s-134kB/s), io=64.0KiB (65.5kB), run=489-489msec

We can check what fio did via this script:

$ cat list_fio.py
#!/usr/bin/env python3
# -*- coding: utf8 -*-

from boto.s3.connection import S3Connection, OrdinaryCallingFormat

ACCESS_KEY = 'MIR3K9565ZFK3PIC5ZNG'
SECRET_KEY = '18r1C5dqbwDfaQFjlAk7a7RnJAfaoCcnqqBBR3H8'
RGW_HOST = '192.168.122.136'
RGW_PORT = 8000

conn = S3Connection(
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    is_secure=False,
    host=RGW_HOST,
    port=RGW_PORT,
    calling_format=OrdinaryCallingFormat(),
    )

fio_bucket = conn.get_bucket('fio')
for key in fio_bucket.list():
    print(key.name)

And we see the list of objects in the fio bucket, as expected:

$ ./list_fio.py 
object_0_4096
object_12288_4096
object_16384_4096
object_20480_4096
object_24576_4096
object_28672_4096
object_32768_4096
object_36864_4096
object_40960_4096
object_4096_4096
object_45056_4096
object_49152_4096
object_53248_4096
object_57344_4096
object_61440_4096
object_8192_4096
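The listing follows a pattern: each object appears to be named `<prefix>_<offset>_<blocksize>`, one per 4096-byte block of the 64 KiB job, and the S3 listing comes back in lexicographic order (hence object_12288_4096 sorting before object_4096_4096). A quick sanity check of that reading (the naming pattern is inferred from the listing above, not taken from fio's source):

```python
def expected_keys(prefix: str, size: int, bs: int) -> list[str]:
    # One object per block, named "<prefix>_<offset>_<blocksize>",
    # sorted lexicographically the way an S3 listing returns keys.
    return sorted(f"{prefix}_{off}_{bs}" for off in range(0, size, bs))
```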

marbu avatar May 15 '21 20:05 marbu