Unable to replicate buckets

Open Aashiqps opened this issue 7 months ago • 4 comments

I installed chorus using Helm and configured the chorus CLI by exposing the gRPC API through a NodePort. After I created a replication rule, nothing is being replicated. The only changes I made in values.yaml are in the sections below. Can anybody help identify the issue?

commonConfig:
  features:
    tagging: true
    acl: true
  storage:
    createRouting: true # create routing rules to route proxy requests to main storage
    createReplication: false # create replication rules to replicate data from main to other storages
    storages:
      one:
        address: <source-address>
        provider: MinIO
        isMain: true
        isSecure: true
      two:
        address: <destination-address>
        provider: MinIO
        isSecure: true


secret: |-
  storage:
    storages:
      one:
        address: <source-address>
        provider: MinIO
        isMain: true
        isSecure: true
        credentials:
          user1:
            accessKeyID: xxxx
            secretAccessKey: xxxx
      two:
        address: <destination-address>
        provider: MinIO
        isMain: false
        isSecure: true
        credentials:
          user1:
            accessKeyID: xxxx
            secretAccessKey: xxxx

Aashiqps avatar May 22 '25 14:05 Aashiqps

The config looks correct.

  • Was the replication created correctly? Is it shown in the list replications API response?
  • Are there any errors in the chorus worker logs?

arttor avatar May 22 '25 17:05 arttor

@arttor Yes, the replication is shown in the list. There is an error in the logs.

root@redis1:~/chorus# ./chorctl repl list
NAME                           PROGRESS                 SIZE        OBJECTS     EVENTS     PAUSED     AGE        HAS_SWITCH
user1:test-chorus:one->two     [          ]   0.0 %     0 B/0 B     0/0         0/0        false      18h59m     false
root@redis1:~/chorus# 
Worker pod logs:

time="2025-05-23T08:41:01Z" level=error msg="Failed to copy: failed to open source object: SerializationError: failed to unmarshal error message\n\tstatus code: 403, request id: 18421B55E35424BA, host id: c55487a536888a4813e678ec2578b042c417d2ca81d32a195017466c8aa697db\ncaused by: UnmarshalError: failed to unmarshal error message\n\t00000000  1f 8b 08 00 00 00 00 00  00 03 4d 91 d1 6e c2 30  |..........M..n.0|\n00000010  0c 45 df f7 15 56 df 69  48 9b b4 41 0a 41 83 31  |.E...V.iH..A.A.1|\n00000020  6d 9a e0 81 b1 0f 08 89  d5 56 d0 66 4b 52 06 7f  |m........V.fKR..|\n00000030  3f 15 c6 c6 93 e5 7b 8f  af 65 59 ce 4e ed 01 8e  |?.....{..eY.N...|\n00000040  e8 43 e3 ba 69 42 d3 71  02 d8 19 67 9b ae 9a 26  |.C..iB.q...g...&|\n00000050  1f db e7 91 48 66 ea 41  2e bd 77 5e c9 85 b3 a8  |....Hf.A..w^....|\n00000060  de 9b aa d3 b1 f7 f8 e4  30 ac 5d 5c e9 68 6a 49  |........0.]\\.hjI|\n00000070  2e 9e 5c 61 08 ba 42 b5  ad 11 3c 7e f5 18 22 84  |..\\a..B...<~..\".|\n00000080  db 00 7c 23 18 7d 30 fd  41 47 b4 60 1d 06 e8 5c  |..|#.}0.AG.`...\\|\n00000090  84 76 48 80 58 e3 1d 7a  76 3d 7c 7a 77 6c 2c da  |.vH.X..zv=|zwl,.|\n000000a0  14 16 35 9a fd a0 79 d8  e3 19 74 67 2f 68 d3 55  |..5...y...tg/h.U|\n000000b0  d0 62 ac 9d 4d 25 b9 ad  96 6f 78 56 cd 0e 31 8d  |.b..M%...oxV..1.|\n000000c0  a7 28 c9 d0 c9 79 6f f6  18 d7 ba 45 15 31 c4 91  |.(...yo....E.1..|\n000000d0  a9 9d ef 83 24 77 ba dc  60 70 bd 37 a8 c8 1d 42  |....$w..`p.7...B|\n000000e0  fe 83 fe 7c b9 b9 1e f6  6a 15 15 2c a3 73 ce 97  |...|....j..,.s..|\n000000f0  39 67 19 9b 3f 0e d4 cd  93 2f ee 52 0d e7 4c 94  |9g..?..../.R..L.|\n00000100  9a e7 85 10 42 33 41 73  2c 4a 81 26 e3 a5 d8 8d  |....B3As,J.&....|\n00000110  59 66 18 2d 6d 66 b4 a0  36 cf 34 9d f0 31 2d 59  |Yf.-mf..6.4..1-Y|\n00000120  51 18 a1 75 31 29 ed 4e  92 df 20 49 ae 6f f8 01  |Q..u1).N.. 
I.o..|\n00000130  b6 1f 3e 08 b5 01 00 00                           |..>.....|\n\ncaused by: XML syntax error on line 2: invalid character entity & (no semicolon)" object=ibee.txt objectType="*s3.Object"
{"level":"warn","app_id":"d0nk8kat9nuc73bl4330","app":"worker","task_type":"migrate:object:copy","task_id":"mgr:co:one:two:test-chorus:ibee.txt","task_queue":"migrate_obj_copy1","task_max_retry":25,"task_retried":13,"error":"migration obj copy: unable to copy with rclone: failed to open source object: SerializationError: failed to unmarshal error message\n\tstatus code: 403, request id: 18421B55E35424BA, host id: c55487a536888a4813e678ec2578b042c417d2ca81d32a195017466c8aa697db\ncaused by: UnmarshalError: failed to unmarshal error message\n\t00000000  1f 8b 08 00 00 00 00 00  00 03 4d 91 d1 6e c2 30  |..........M..n.0|\n00000010  0c 45 df f7 15 56 df 69  48 9b b4 41 0a 41 83 31  |.E...V.iH..A.A.1|\n00000020  6d 9a e0 81 b1 0f 08 89  d5 56 d0 66 4b 52 06 7f  |m........V.fKR..|\n00000030  3f 15 c6 c6 93 e5 7b 8f  af 65 59 ce 4e ed 01 8e  |?.....{..eY.N...|\n00000040  e8 43 e3 ba 69 42 d3 71  02 d8 19 67 9b ae 9a 26  |.C..iB.q...g...&|\n00000050  1f db e7 91 48 66 ea 41  2e bd 77 5e c9 85 b3 a8  |....Hf.A..w^....|\n00000060  de 9b aa d3 b1 f7 f8 e4  30 ac 5d 5c e9 68 6a 49  |........0.]\\.hjI|\n00000070  2e 9e 5c 61 08 ba 42 b5  ad 11 3c 7e f5 18 22 84  |..\\a..B...<~..\".|\n00000080  db 00 7c 23 18 7d 30 fd  41 47 b4 60 1d 06 e8 5c  |..|#.}0.AG.`...\\|\n00000090  84 76 48 80 58 e3 1d 7a  76 3d 7c 7a 77 6c 2c da  |.vH.X..zv=|zwl,.|\n000000a0  14 16 35 9a fd a0 79 d8  e3 19 74 67 2f 68 d3 55  |..5...y...tg/h.U|\n000000b0  d0 62 ac 9d 4d 25 b9 ad  96 6f 78 56 cd 0e 31 8d  |.b..M%...oxV..1.|\n000000c0  a7 28 c9 d0 c9 79 6f f6  18 d7 ba 45 15 31 c4 91  |.(...yo....E.1..|\n000000d0  a9 9d ef 83 24 77 ba dc  60 70 bd 37 a8 c8 1d 42  |....$w..`p.7...B|\n000000e0  fe 83 fe 7c b9 b9 1e f6  6a 15 15 2c a3 73 ce 97  |...|....j..,.s..|\n000000f0  39 67 19 9b 3f 0e d4 cd  93 2f ee 52 0d e7 4c 94  |9g..?..../.R..L.|\n00000100  9a e7 85 10 42 33 41 73  2c 4a 81 26 e3 a5 d8 8d  |....B3As,J.&....|\n00000110  59 66 18 2d 6d 66 b4 a0  36 cf 34 9d f0 31 2d 59  |Yf.-mf..6.4..1-Y|\n00000120 
 51 18 a1 75 31 29 ed 4e  92 df 20 49 ae 6f f8 01  |Q..u1).N.. I.o..|\n00000130  b6 1f 3e 08 b5 01 00 00                           |..>.....|\n\ncaused by: XML syntax error on line 2: invalid character entity & (no semicolon)","caller":"/build/service/worker/server.go:174","time":"2025-05-23T08:41:01Z","message":"process task failed. task will be retried"}
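
A note on the hex dump in these logs: it begins with the bytes 1f 8b 08, which are the gzip magic number followed by the deflate method byte. The 403 response body therefore appears to be gzip-compressed, which may be why the SDK could not parse it as XML. Below is a minimal Python sketch for recovering the readable error text. It assumes the dump is pasted as one unbroken string (rejoin any lines that were wrapped when copying), and the function names are made up for illustration.

```python
import gzip
import re


def hexdump_to_bytes(dump: str) -> bytes:
    """Rebuild raw bytes from the hexdump-style text embedded in the log message."""
    # The log stores newlines and tabs as literal backslash sequences.
    dump = dump.replace("\\n", "\n").replace("\\t", "\t")
    out = bytearray()
    for line in dump.splitlines():
        # Only lines that start with an 8-digit hex offset carry data.
        match = re.match(r"\s*[0-9a-fA-F]{8}\s+(.*)", line)
        if not match:
            continue
        # Everything from the first '|' onwards is the ASCII column; drop it.
        hex_part = match.group(1).split("|", 1)[0]
        out.extend(bytes.fromhex("".join(hex_part.split())))
    return bytes(out)


def decode_error_body(dump: str) -> str:
    """Return the response body as text, gunzipping it when it has the gzip magic."""
    raw = hexdump_to_bytes(dump)
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8", errors="replace")
```

Calling decode_error_body() on the msg value from the log line should print whatever error document the storage actually returned with the 403.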

Aashiqps avatar May 23 '25 10:05 Aashiqps

It looks like the chorus worker receives a 403 error when it tries to get the object from the source storage. Can you get ibee.txt from the test-chorus bucket on storage one with the credentials from the config?

It is also possible that the storage is not accessible from the machine running chorus. You can try to access the storage from inside your k8s cluster.
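
For the accessibility check, a small Python sketch like the one below could be run from a pod in the same cluster (or the same container network) as the worker. It only tests whether a TCP connection to the storage endpoint can be opened; it does not validate credentials or bucket permissions. The function name is hypothetical, and the host and port must be replaced with your own storage address.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example (placeholder host, replace with the address from your config):
# print(can_connect("source-storage.example.internal", 443))
```

If this returns True from the worker's environment but the copy still fails with 403, the problem is more likely in the credentials or the request itself than in the network path.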

arttor avatar May 23 '25 11:05 arttor

I'm experiencing the same replication issue described in this thread.
I'm running chorus in docker-compose with two Ceph systems as the backends. The worker reports "app_id":"d1e0dm8pufgs73e4ma90","app":"worker","version":"development","commit":"not set". I'm unsure whether the images were built or pulled, but the repo is at commit 8283184355be4ff16f08faf87cfa117b17c90440. Worker tasks fail with 403 Forbidden errors during object migration, even though manual rclone operations work perfectly with the same credentials and endpoints.

worker-1  | 2025-06-25T14:09:31Z WRN ../build/service/worker/server.go:174 > process task failed. task will be retried error="migration obj copy: unable to copy with rclone: Forbidden: Forbidden\n\tstatus code: 403, request id: tx00000512ef6c6ffdc8464-00685c032c-10115544-default, host id: " app=worker app_id=d1e02hgj855s73aemh20 task_id=mgr:co:cephsource:cephdest:ceph-s3upload:uploads/var/log/audit/audit.log task_max_retry=25 task_queue=migrate_obj_copy1 task_retried=13 task_type=migrate:object:copy

Metrics Showing 100% Failure Rate

worker_failed_tasks_total{queue="migrate_obj_copy1",task_type="migrate:object:copy"} 32033
worker_processed_tasks_total{queue="migrate_obj_copy1",task_type="migrate:object:copy"} 32033
worker_in_progress_tasks{queue="migrate_obj_copy1",task_type="migrate:object:copy"} 0
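
To track this ratio over time, the counters can be parsed from the metrics text with a short Python sketch. It assumes that worker_processed_tasks_total also counts failed attempts (which the equal values above suggest, but which I have not confirmed) and that each sample line has no trailing timestamp. The helper names are made up for illustration.

```python
import re

# Matches lines such as: metric_name{label="value",...} 123
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)


def parse_metrics(text: str) -> dict:
    """Map (metric name, queue, task_type) to the sample value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match:
            continue
        labels = dict(re.findall(r'(\w+)="([^"]*)"', match.group("labels") or ""))
        key = (match.group("name"), labels.get("queue"), labels.get("task_type"))
        samples[key] = float(match.group("value"))
    return samples


def failure_rate(samples: dict, queue: str, task_type: str):
    """Return failed / processed for one queue and task type, or None if nothing was processed."""
    failed = samples.get(("worker_failed_tasks_total", queue, task_type), 0.0)
    processed = samples.get(("worker_processed_tasks_total", queue, task_type), 0.0)
    return failed / processed if processed else None
```

With the three lines above as input, failure_rate(samples, "migrate_obj_copy1", "migrate:object:copy") gives 1.0, i.e. every processed copy task failed.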

I tried to confirm that the credentials are valid, network connectivity works, the Ceph permissions are correct, and the endpoints are reachable. Manual rclone operations work, e.g. rclone copy source:bucket/file dest:bucket/file

I will pull another version tomorrow and see whether anything changes.

Underknowledge avatar Jun 25 '25 18:06 Underknowledge