grist-core

Grist+MinIO self-hosted: document originating from template: snapshots unavailable

Open raph-topo opened this issue 9 months ago • 19 comments

Describe the current behavior

Hi! Thanks for this incredible tool!

Steps to reproduce

A

  1. Home > Create Empty Document
  2. Change some cells
  3. Exit to home … Open document
  4. Document History > At least 2 snapshots appear ✅
  5. MinIO Object Browser > /_docId_.grist file with multiple versions ✅
  6. MinIO Object Browser > /assets/unversioned/_docId_/meta.json exists ✅

B

  1. Any document > Settings > Template mode > set Template
  2. Home > Open this template document
  3. Share > Save a Copy
  4. Document History > Snapshots: No snapshots appear at all ❌
  5. After some time, Document History > Snapshots shows "Snapshots are unavailable. Only owners have access to snapshots for documents with access rules."
  6. MinIO Object Browser > /_docId_.grist file with multiple versions ✅
  7. MinIO Object Browser > /assets/unversioned/_docId_/meta.json does not exist ❌

C

  1. Import .grist file (any kind: with data, with history, or without, into a Team site or a @Personal site)
  2. Change some cells
  3. Exit to home … Open document
  4. Document History > Snapshots: No snapshots appear at all ❌
  5. Sometimes, after ~ 10s elapsed, error message pop-up: Unexpected error > can't access property "callback", t is null
  6. Other times, after ~ 10s elapsed, error message pop-up: Unexpected error > operation failed to become consistent: versions - 1,not ready,3005,not ready,8111,not ready,16792,not ready
  7. MinIO Object Browser > /_docId_.grist file with multiple versions ✅
  8. MinIO Object Browser > /assets/unversioned/_docId_/meta.json does not exist ❌
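The meta.json checks in scenarios A to C can be sketched as a small helper. This is a minimal sketch, assuming the object layout seen in this report (`<prefix><docId>.grist` for the document and `<prefix>assets/unversioned/<docId>/meta.json` for the snapshot inventory). The function name `docs_missing_meta` and the list of keys passed in are hypothetical, and fetching the keys from MinIO is left out.

```python
def docs_missing_meta(keys, prefix=""):
    """Return doc ids that have a .grist object but no meta.json inventory.

    `keys` is an iterable of object keys as listed in the bucket.
    The key layout is an assumption taken from the paths in this report.
    """
    keys = set(keys)
    missing = []
    for key in sorted(keys):
        if not (key.startswith(prefix) and key.endswith(".grist")):
            continue
        doc_id = key[len(prefix):-len(".grist")]
        if not doc_id or "/" in doc_id:
            continue  # not a top-level document object
        meta_key = f"{prefix}assets/unversioned/{doc_id}/meta.json"
        if meta_key not in keys:
            missing.append(doc_id)
    return missing
```

Called on a listing that contains `docA.grist`, its meta.json, and `docB.grist`, it would report only `docB`.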

I tried the following on an imported document, but none of them got snapshotting started:

  • Settings > Reload Data Engine
  • Share > Duplicate Document / Work on a Copy
  • Share > Download … Home > Add New > Import Document
  • MinIO Object Browser > Create /assets/unversioned/_docId_/meta.json (content []) : makes Document History > Snapshots stay indefinitely empty

Describe the expected behavior

S3-based snapshotting should work on imported documents from the moment of importing.

According to https://community.getgrist.com/t/resynchronizing-snapshots-with-s3-versions/3220 meta.json should be recreated on the fly? Does not happen. (Makes me notice /apiconsole is missing the /api/docs/_docId_/snapshots/* endpoint.)
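For anyone who wants to query the snapshots endpoint directly rather than through /apiconsole, here is a minimal sketch. It assumes the endpoint path mentioned above and the usual Grist API convention of a Bearer API key. The function name, the base URL, and the key are placeholders, and on an instance with GRIST_ORG_IN_PATH the base URL may need to include the org part of the path.

```python
import urllib.request


def build_snapshots_request(base_url, doc_id, api_key):
    """Build (but do not send) a GET request for a document's snapshot list."""
    url = f"{base_url.rstrip('/')}/api/docs/{doc_id}/snapshots"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` would send it. On an affected document the call may well fail after a delay, as the UI does.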

Trying the same on https://docs.getgrist.com/ seems not to have this issue.

Where have you encountered this bug?

Instance information (when self-hosting only)

  • Grist instance:
    • Version: 1.4.0
    • Installation mode: docker-compose.yml
      • redis:alpine
      • minio/minio
      • gristlabs/grist:latest, with some env vars which might be relevant:
      GRIST_ORG_IN_PATH: true
      GRIST_FORCE_LOGIN: false
      GRIST_ANON_PLAYGROUND: false
      GRIST_DOCS_MINIO_PREFIX: ""
      GRIST_DOCS_MINIO_ENDPOINT: minio
      GRIST_DOCS_MINIO_USE_SSL: 0
      GRIST_DOCS_MINIO_PORT: 9000
  • Architecture: single-worker, Grist>MinIO/Redis over Docker default network, not proxied to internet

  • Admin Panel > Self Checks: all ✅

  • Browser name, version and platforms on which you could reproduce the bug: Edge, Firefox

  • Browser console log if relevant:

  • Server log if relevant:

When opening Document History > Snapshots, after ~ 10s elapsed:

error: operation failed to become consistent: versions - 1,not ready,3003,not ready,8105,not ready,16778,not ready
warn: Error during api call to /docs/_docId_/snapshots: operation failed to become consistent: versions - 1,not ready,3003,not ready,8105,not ready,16778,not ready path=/docs/_docId_/snapshots, userId=5, altSessionId=_xxx_, , 

At import & sometimes, when opening Document History > Snapshots, after ~ 10s elapsed:

error: operation failed to become consistent: versions - 2,not ready,3005,not ready,8112,not ready,16790,not ready
error: HostedStorageManager error pushing _docId_ (1): Error: operation failed to become consistent: versions - 2,not ready,3005,not ready,8112,not ready,16790,not ready
    at ChecksummedExternalStorage._retry (/grist/_build/app/server/lib/ExternalStorage.js:305:15)
    at async DocSnapshotInventory._reconstruct (/grist/_build/app/server/lib/DocSnapshots.js:287:27)
    at async DocSnapshotInventory._getSnapshots (/grist/_build/app/server/lib/DocSnapshots.js:253:20)
    at async /grist/_build/app/server/lib/DocSnapshots.js:173:31
    at async KeyedMutex.runExclusive (/grist/_build/app/common/KeyedMutex.js:32:20)
    at async DocSnapshotInventory.uploadAndAdd (/grist/_build/app/server/lib/DocSnapshots.js:167:9)
    at async HostedStorageManager._pushToS3 (/grist/_build/app/server/lib/HostedStorageManager.js:717:13) docId=null

raph-topo avatar Feb 17 '25 11:02 raph-topo

Some hours later, one of the affected docs started showing snapshots. Maybe my manual fiddling with /assets/unversioned/_docId_/meta.json (creating it empty, then deleting it some time later) helped unblock some process?

raph-topo avatar Feb 17 '25 13:02 raph-topo

Hmm none of this rings a bell @raph-topo. Do you happen to know if these problems existed for you prior to 1.4.0 or is that the first version you tried?

paulfitz avatar Feb 17 '25 18:02 paulfitz

    at async DocSnapshotInventory.uploadAndAdd (/grist/_build/app/server/lib/DocSnapshots.js:167:9)
    at async HostedStorageManager._pushToS3 (/grist/_build/app/server/lib/HostedStorageManager.js:717:13) docId=null

The docId=null is a little surprising to me there, cc'ing @Spoffy in case it rings any bells.

paulfitz avatar Feb 17 '25 18:02 paulfitz

Just tried running Grist locally with minio+redis and things feel normal? Tried replicating B but things worked for me. The description of B sounds almost like an auth problem. Also the operation failed to become consistent points maybe towards redis. A redis issue could also cause an auth issue. Is your redis configuration good? You could try temporarily removing redis just to rule it out.

paulfitz avatar Feb 17 '25 18:02 paulfitz

The issue existed with 1.3.3 as well. I did not use earlier versions.

Well, Redis is being populated with all kinds of keys containing Grist doc UUIDs.

In this sort of "manually-fixed" doc, upon replacing a doc from a snapshot:

Unexpected error
MinIOExternalStorage.head did not get expected fields

And from then on all snapshots have disappeared, and Document History > Snapshots shows: "Snapshots are unavailable. Only owners have access to snapshots for documents with access rules." ❌

MinIO Object Browser > /assets/unversioned/_docId_/meta.json does exist but last modified shows it is not accessed anymore ❌

Hmm, upon a restart of all containers (Grist now at version 1.4.2), snapshots do show up for that one "manually-fixed" doc from scenario C & /assets/unversioned/_docId_/meta.json gets updated (Redis still ok) ✅

The issue does persist as scenario B still results in ❌

info: ext doc upload: minio://btp-gt//_docId_.grist checksum … version …
info: heartbeat email=…, userId=5, age=8, org=…, altSessionId=…, clientId=…, counter=8, url=https://…/o/…/…/copy, docId=…
error: operation failed to become consistent: versions - 4,not ready,3009,not ready,8117,not ready,16794,not ready
error: HostedStorageManager error pushing _docId_ (1): Error: operation failed to become consistent: versions - 4,not ready,3009,not ready,8117,not ready,16794,not ready
    at ChecksummedExternalStorage._retry (/grist/_build/app/server/lib/ExternalStorage.js:305:15)
    at async DocSnapshotInventory._reconstruct (/grist/_build/app/server/lib/DocSnapshots.js:287:27)
    at async DocSnapshotInventory._getSnapshots (/grist/_build/app/server/lib/DocSnapshots.js:253:20)
    at async /grist/_build/app/server/lib/DocSnapshots.js:173:31
    at async KeyedMutex.runExclusive (/grist/_build/app/common/KeyedMutex.js:32:20)
    at async DocSnapshotInventory.uploadAndAdd (/grist/_build/app/server/lib/DocSnapshots.js:167:9)
    at async HostedStorageManager._pushToS3 (/grist/_build/app/server/lib/HostedStorageManager.js:717:13) docId=null

raph-topo avatar Feb 19 '25 15:02 raph-topo

Trying with Redis removed from the stack results in this when editing the doc ❌

info: ext doc upload: minio://btp-gt//_docId_.grist checksum … version …
error: HostedStorageManager error pushing _docId_ (2): Error: MinIOExternalStorage.head did not get expected fields
    at MinIOExternalStorage.head (/grist/_build/app/server/lib/MinIOExternalStorage.js:59:23)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async MinIOExternalStorage.exists (/grist/_build/app/server/lib/MinIOExternalStorage.js:52:24)
    at async /grist/_build/app/server/lib/ExternalStorage.js:315:30
    at async ChecksummedExternalStorage._retry (/grist/_build/app/server/lib/ExternalStorage.js:283:32)
    at async DocSnapshotInventory._getSnapshots (/grist/_build/app/server/lib/DocSnapshots.js:243:22)
    at async /grist/_build/app/server/lib/DocSnapshots.js:173:31
    at async KeyedMutex.runExclusive (/grist/_build/app/common/KeyedMutex.js:32:20)
    at async DocSnapshotInventory.uploadAndAdd (/grist/_build/app/server/lib/DocSnapshots.js:167:9)
    at async HostedStorageManager._pushToS3 (/grist/_build/app/server/lib/HostedStorageManager.js:717:13) docId=null

That docId=null is still present…

on some other docs as well, including those that did not have an issue before.

Re-enabling Redis does not fix it.

raph-topo avatar Feb 19 '25 15:02 raph-topo

Something odd about this. @fflorent @georgegevoian @Spoffy are any of these symptoms familiar?

The only thing that catches my eye is minio://btp-gt//_docId_.grist, I presume the _docId_ part is redaction, but is that second // real or also redaction? Maybe there is a prefix problem? What if you try a prefix like v2/?

paulfitz avatar Feb 19 '25 15:02 paulfitz

Yes, I redacted with _docId_ or …

Indeed, the // is not from me.

As mentioned, I did set the empty string GRIST_DOCS_MINIO_PREFIX: "" (MinIO path being _bucket_/assets/unversioned/_docId_/meta.json)

Maybe some parts of the code accept an empty prefix, others not?

I tried setting GRIST_DOCS_MINIO_PREFIX: docs, but it didn't solve (I guess I'd have to manually move the objects in MinIO?)

I can confirm that updating and restarting the Grist container fixes some docs, but restarting at the same version does not. Does that help pinpoint the cause?

raph-topo avatar Feb 19 '25 15:02 raph-topo

I tried setting GRIST_DOCS_MINIO_PREFIX: docs, but it didn't solve (I guess I'd have to manually move the objects in MinIO?)

Hi @raph-topo.

~Could you try not setting GRIST_DOCS_MINIO_PREFIX in your environment, and see if it still happens? (It will default to docs/ in that case.)~ Saw your edit. Hmm...

georgegevoian avatar Feb 19 '25 15:02 georgegevoian

Yes, I do get

info: == grist.externalStorage.minio.url: minio://btp-gt/docs/

But the extra path does not appear in MinIO, working docs still get snapshots, and non-working still don't.

Ah, upon creating a new doc, the path appears in MinIO and scenario B & C are fixed!

raph-topo avatar Feb 19 '25 15:02 raph-topo

Is Redis caching MinIO access paths?

I thought Grist was loading docs from MinIO, but all content etc. does continue to appear despite the change in paths.

raph-topo avatar Feb 19 '25 15:02 raph-topo

Okay, so we have found the issue (empty GRIST_DOCS_MINIO_PREFIX… blame me for counter-productive KISS…), but not the reason why it fails only part of the time.

raph-topo avatar Feb 19 '25 15:02 raph-topo

Could you try not setting GRIST_DOCS_MINIO_PREFIX in your environment, and see if it still happens? (It will default to docs/ in that case.)

That's the reason indeed: Grist does not support empty GRIST_DOCS_MINIO_PREFIX

Unsetting and moving all objects in MinIO into _bucket_/docs/… solves all issues (& some old snapshots are back).
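The renaming involved in that move can be sketched as follows. This is a minimal sketch that only computes the old-key to new-key mapping. The function name `plan_prefix_move` is hypothetical, the copying itself is left to whatever MinIO tooling is at hand, and whether older object versions survive a move depends on that tooling.

```python
def plan_prefix_move(keys, new_prefix="docs/"):
    """Map each existing object key to its key under `new_prefix`.

    Keys that already live under `new_prefix` are left out of the plan.
    """
    plan = {}
    for key in keys:
        if key.startswith(new_prefix):
            continue  # already in place
        plan[key] = new_prefix + key.lstrip("/")
    return plan
```

Both the `.grist` objects and the `assets/unversioned/.../meta.json` objects end up under the same prefix, matching the paths logged with the default `docs/` setting.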

Two options from here:

  • [ ] prevent empty GRIST_DOCS_MINIO_PREFIX entirely
  • [ ] but does Grist really support any other prefix than default?
  • [ ] find where it bugs with empty GRIST_DOCS_MINIO_PREFIX

raph-topo avatar Feb 19 '25 16:02 raph-topo

I tried to reproduce your issue by using this docker-compose.yml example file here: https://github.com/gristlabs/grist-core/blob/f8c357a6bfef3de1e3bf78cf6a3f9dd9e939efaf/docker-compose-examples/grist-with-postgres-redis-minio/docker-compose.yml

Which is probably quite similar to the environment where the bug has been observed.

And applying the below patch:

diff --git a/docker-compose-examples/grist-with-postgres-redis-minio/docker-compose.yml b/docker-compose-examples/grist-with-postgres-redis-minio/docker-compose.yml
index 1202fd27..57182855 100644
--- a/docker-compose-examples/grist-with-postgres-redis-minio/docker-compose.yml
+++ b/docker-compose-examples/grist-with-postgres-redis-minio/docker-compose.yml
@@ -21,6 +21,7 @@ services:
       GRIST_DOCS_MINIO_BUCKET: grist-docs
       GRIST_DOCS_MINIO_ENDPOINT: grist-minio
       GRIST_DOCS_MINIO_PORT: 9000
+      GRIST_DOCS_MINIO_PREFIX: ""
 
     volumes:
       # Where to store persistent data, such as documents.
@@ -49,6 +50,8 @@ services:
 
   grist-minio:
     image: minio/minio:latest
+    ports:
+      - 9001:9001
     environment:
       MINIO_ROOT_USER: grist
       MINIO_ROOT_PASSWORD: ${MINIO_PASSWORD}

(the ports for grist-minio are added to get access to the minio admin console)

From what I have seen so far, everything works as expected. But I may have missed details in the above conversation. I'm out of time for today and need to read it more carefully a bit later.

If in the meantime @raph-topo you successfully reproduce the issue with this docker-compose example, please let us know, it would be really helpful for us to investigate and bring a fix.

And thank you for the report BTW!

fflorent avatar Feb 19 '25 17:02 fflorent

FYI, I could reproduce the issue with the above patch, that's indeed surprising 👀

fflorent avatar Feb 20 '25 08:02 fflorent

@fflorent thanks for looking into this. I haven't had a chance to look closely myself, but the first thing I'd be checking is if a prefix set explicitly to "" results in paths with a // in them, which could easily misbehave. Separately, it is possible that prefixes that don't end in / might have ugly results. However I'm just guessing based on the fact that there's been no "user-friendliness" work done on snapshot configuration.
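To illustrate that guess (this is not Grist's actual code): if a path is built by always putting a `/` between prefix and key, an empty prefix yields exactly the `//` seen in the log. The helper names below are hypothetical. The second pair of functions shows one way a prefix could be normalized so that `""`, `docs` and `docs/` all behave predictably.

```python
def naive_join(bucket, prefix, key):
    """Always insert '/' between prefix and key (hypothetical, for illustration)."""
    return f"minio://{bucket}/{prefix}/{key}"


def normalize_prefix(prefix):
    """Return '' for an empty prefix, else the prefix with exactly one trailing '/'."""
    stripped = prefix.strip("/")
    return f"{stripped}/" if stripped else ""


def safe_join(bucket, prefix, key):
    """Join using the normalized prefix, so no empty path segment can appear."""
    return f"minio://{bucket}/{normalize_prefix(prefix)}{key}"
```

With an empty prefix, `naive_join` produces `minio://btp-gt//_docId_.grist`, while `safe_join` produces `minio://btp-gt/_docId_.grist`.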

paulfitz avatar Feb 20 '25 13:02 paulfitz

FYI, I could reproduce the issue with the above patch, that's indeed surprising 👀

Great, so we are certain about the origin.

Some more experiments

absent GRIST_DOCS_MINIO_PREFIX -> ✅

info: == grist.externalStorage.minio.prefix: docs/ [default] [GRIST_DOCS_MINIO_PREFIX]
info: == grist.externalStorage.minio.url: minio://_bucket_/docs/

empty GRIST_DOCS_MINIO_PREFIX: -> ✅

info: == grist.externalStorage.minio.prefix: docs/ [default] [GRIST_DOCS_MINIO_PREFIX]
info: == grist.externalStorage.minio.url: minio://_bucket_/docs/

empty string GRIST_DOCS_MINIO_PREFIX: "" -> bug ❌

info: == grist.externalStorage.minio.prefix:  [GRIST_DOCS_MINIO_PREFIX]
info: == grist.externalStorage.minio.url: minio://_bucket_/

no ending slash GRIST_DOCS_MINIO_PREFIX: docs -> quick check, URLs ok despite missing / ✅

info: == grist.externalStorage.minio.prefix: docs [GRIST_DOCS_MINIO_PREFIX]
info: == grist.externalStorage.minio.url: minio://_bucket_/docs
info: ext meta upload: minio://_bucket_/docs/assets/unversioned/_docId_/meta.json checksum … version …
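The difference between the second and third experiment looks like the classic "absent versus empty string" distinction. A minimal sketch of two defaulting policies follows. The function names are hypothetical and this is not taken from Grist's source. The first policy matches what the logs above show, and the second is one way the "prevent empty prefix" option could behave.

```python
DEFAULT_PREFIX = "docs/"


def prefix_keep_empty(env):
    """Default only when the variable is absent; an explicit '' is kept as-is."""
    value = env.get("GRIST_DOCS_MINIO_PREFIX")
    return DEFAULT_PREFIX if value is None else value


def prefix_default_on_empty(env):
    """Treat both an absent variable and '' as 'not configured'."""
    return env.get("GRIST_DOCS_MINIO_PREFIX") or DEFAULT_PREFIX
```

Passing `os.environ` as `env` would apply either policy to the real environment. Only the first one returns `""` for an explicitly empty variable.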

raph-topo avatar Feb 20 '25 15:02 raph-topo

Hi !

We also have the MinIOExternalStorage.head did not get expected fields error but GRIST_DOCS_MINIO_PREFIX isn't set.

I observed two weird things:

  1. One of the versions that MinIO returns has a null versionId even though it's not a delete marker
[2025-05-23 15:26:02 CEST]  69MiB STANDARD null v24 PUT 2gWdJogzQBsMtctuyertYR.grist
  2. The metadata looks wrong
Metadata  :
  X-Amz-Meta-Metadata: [object Object]
  Content-Type       : binary/octet-stream

We're on v1.4.2 .

@fflorent you can ping me on matrix if you want to take a look at our prod data.

hrenard avatar Jun 27 '25 11:06 hrenard

@hrenard Hi!

What you see looks more like a bug on MinIO than on our end. What's your version of the MinIO server?

The metadata looks wrong

We also have this issue, but it is unrelated to the bug report here and probably not worth worrying about (though it should be quite easy to fix).

fflorent avatar Jun 27 '25 13:06 fflorent