
complete config reset with filled disk

Open noerw opened this issue 4 years ago • 17 comments

Steps to reproduce

  1. Fill the entire available disk space
  2. Visit the instance landing page (you might need to repeat this step a couple of times; not sure if it is instantly reproducible)

Expected behaviour

Config should never be reset, even in edge cases. Config should never ever be opened with write permission from a public / non-admin endpoint.

Actual behaviour

Find your config/config.php truncated to a minimal config file with only the instanceid key filled in. This is serious.

Landing page says something like "Looks like you're trying to reinstall nextcloud but file CAN_INSTALL was not found"

Server configuration

Operating system: debian 9

Web server: nginx

Database: mysql

PHP version: 7.3

Nextcloud version: (see Nextcloud admin page) 17.0.1

Updated from an older Nextcloud/ownCloud or fresh install: updated since NC 15

Where did you install Nextcloud from: source tarball

Signing status:

Signing status
No errors have been found.

List of activated apps:

App list
Enabled:
  - accessibility: 1.3.0
  - apporder: 0.8.0
  - audioplayer: 2.8.4
  - calendar: 1.7.1
  - cloud_federation_api: 1.0.0
  - contacts: 3.1.6
  - dav: 1.13.0
  - deck: 0.7.0
  - encryption: 2.5.0
  - federatedfilesharing: 1.7.0
  - files: 1.12.0
  - files_linkeditor: 1.0.11
  - files_markdown: 2.1.0
  - files_pdfviewer: 1.6.0
  - files_readmemd: 1.1.3
  - files_rightclick: 0.15.1
  - files_sharing: 1.9.0
  - files_videoplayer: 1.6.0
  - gallery: 18.4.0
  - logreader: 2.2.0
  - lookup_server_connector: 1.5.0
  - nextcloud_announcements: 1.6.0
  - notes: 3.0.3
  - notifications: 2.5.0
  - oauth2: 1.5.0
  - password_policy: 1.7.0
  - polls: 0.10.4
  - privacy: 1.1.0
  - provisioning_api: 1.7.0
  - recommendations: 0.5.0
  - serverinfo: 1.7.0
  - sharebymail: 1.7.0
  - sharerenamer: 2.7.2
  - survey_client: 1.5.0
  - tasks: 0.11.3
  - text: 1.1.1
  - theming: 1.8.0
  - twofactor_backupcodes: 1.6.0
  - updatenotification: 1.7.0
  - viewer: 1.2.0
  - workflowengine: 1.7.0

Nextcloud configuration:

Config report
{
    "system": {
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "***REMOVED SENSITIVE VALUE***",
            "***REMOVED SENSITIVE VALUE***",
            "***REMOVED SENSITIVE VALUE***"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "mysql",
        "version": "17.0.1.1",
        "overwrite.cli.url": "***REMOVED SENSITIVE VALUE***",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "logtimezone": "UTC",
        "installed": true,
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpmode": "smtp",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "appstore.experimental.enabled": true,
        "loglevel": 2,
        "maintenance": false,
        "default_language": "de",
        "theme": "",
        "updater.release.channel": "stable",
        "mail_smtpauthtype": "LOGIN",
        "mail_smtpauth": 1,
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "587",
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpsecure": "tls",
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "memcache.local": "\\OC\\Memcache\\APCu",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 6379,
            "timeout": 0,
            "dbindex": 0
        },
        "mysql.utf8mb4": true
    }
}

Are you using external storage, if yes which one: no

Are you using encryption: yes

Are you using an external user-backend, if yes which one: no

Client configuration

Browser: latest firefox

Operating system: linux

Logs

Web server error log

Insert your webserver log here

Nextcloud log (data/nextcloud.log)

Nextcloud log
At the point of the config reset, nothing is logged. Beforehand, Redis complains about unavailable storage.

noerw avatar Dec 31 '19 16:12 noerw

When you say "entire disk space" are you talking about server space (can't write anything) or data folder only?

solracsf avatar Jan 01 '20 13:01 solracsf

All available system storage full, though I have both on the same file system

noerw avatar Jan 01 '20 15:01 noerw

I am able to reproduce this here.

  • Mounted a WebDAV share through davfs2
  • Installed NC 17.0.2 and placed the data dir under /mnt/davfs2/data
  • Uploaded some files
  • Shut down the davfs2 mount (emulating system storage being "full")
  • Tried to access the Nextcloud URL
  • Got the message about CAN_INSTALL missing from the config folder
  • Checked config.php:
cat config.php

<?php
$CONFIG = array (
  'instanceid' => 'ocd1zu1a7a',
); 

solracsf avatar Jan 02 '20 11:01 solracsf

Is this Issue still valid? If not, please close this issue. Thanks! :)

szaimen avatar May 28 '21 09:05 szaimen

This issue has been automatically marked as stale because it has not had recent activity and seems to be missing some essential information. It will be closed if no further activity occurs. Thank you for your contributions.

ghost avatar Jun 27 '21 09:06 ghost

Gasp. This happened to me just now. I came back from vacation, the disk had filled up, and my Nextcloud is dead, saying: "it looks like you are trying to reinstall nextcloud but CAN_INSTALL is absent". My config.php is now 62 bytes and contains only the instanceid.

I don't have a backup of this config file !! How can I get it back ?

BENETNATH avatar Apr 25 '22 10:04 BENETNATH

Can we please reopen this? How can such a critical failure mode (DoS + potential data loss) be missed for years? How can the report on such a problem be closed just in case, because *waves hands* magic could have solved this in the meantime?

(To clarify, I'm not angry at anyone personally; from my own work I know well enough that important tickets can slip. Just want to draw attention to a triaging fail like this, so processes can be improved).

I don't have a backup of this config file !! How can I get it back ?

@BENETNATH You can try and see if $nc-data/updater-*/backups/nextcloud-*/config/config.php exists. If not, I guess you're out of luck.
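
For anyone in the same situation, here is a minimal sketch of how one might search for such leftover copies. It only mirrors the glob pattern from the comment above; the function name `find_config_backups` and the idea of sorting by modification time are assumptions for illustration, and whether any backup exists depends on whether the updater was ever run.

```python
import glob
import os


def find_config_backups(data_dir):
    """Return config.php copies left behind by the updater, newest first.

    `data_dir` is the Nextcloud data directory ($nc-data in the comment above).
    """
    pattern = os.path.join(
        data_dir, "updater-*", "backups", "nextcloud-*", "config", "config.php"
    )
    # glob returns an empty list when nothing matches or data_dir does not exist
    matches = glob.glob(pattern)
    return sorted(matches, key=os.path.getmtime, reverse=True)
```

Calling `find_config_backups("/path/to/data")` returns a list of candidate files; an empty list means no updater backup was found.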

noerw avatar Apr 25 '22 11:04 noerw

@BENETNATH don't want to hate here but how can you self host any service without creating a regular backup?

szaimen avatar Apr 25 '22 11:04 szaimen

I guess this would be a possible solution to this problem: https://github.com/nextcloud/server/pull/27492

szaimen avatar Apr 25 '22 11:04 szaimen

@noerw That works! Meanwhile, I had gotten back on track by creating another Nextcloud user to reach the DB. Thanks!

@szaimen backup is done on all data, not the config.php that was not supposed to crash like that :)

BENETNATH avatar Apr 25 '22 11:04 BENETNATH

@BENETNATH if you didn't know: a Nextcloud backup must also include the Nextcloud files including the config.php: https://docs.nextcloud.com/server/latest/admin_manual/maintenance/backup.html?highlight=backup#backup-folders

szaimen avatar Apr 25 '22 11:04 szaimen

I don't see how #27492 would address this problem. I'm not familiar with the code base, so I don't know about the reasons for this behavior, but the underlying problems I see are:

  • [x] config file is opened in read/write mode, even for unauthenticated page loads.

    • Changing this to read-only (except for the admin settings pages) would stop the config reset from happening in most cases.
    • config.php may not be the only file that's rewritten on page load, this ideally would be enforced on as many files as possible.
  • [x] it needs to be evaluated why the config file only contains minimal data (instance ID only), but is not truncated (as I would expect with broken filesystem writes). This points at a deeper problem with the config handling, or rather that a bad state is triggered through a full disk without being caught before writing the config file.

noerw avatar Apr 25 '22 11:04 noerw

cc @PVince81

szaimen avatar Apr 25 '22 11:04 szaimen

This just happened to me. I think this bug should be escalated. It is high severity. Losing the config file can mean hours of work, especially if it has been some time since you set up Nextcloud. And first you need to figure out that your config file has been shredded.

I find it very unusual that the config file is even opened in r/w mode and written to in the first place.

There are also a bunch of duplicates:

  • https://github.com/nextcloud/server/issues/31869
  • https://github.com/nextcloud/server/issues/25175
  • https://github.com/nextcloud/server/issues/27377
  • https://github.com/nextcloud/server/issues/18973
  • https://github.com/nextcloud/server/issues/19829

tzugen avatar Jun 13 '22 07:06 tzugen

idea: when writing config.php perhaps it first needs to be written into a part file "config.php.part" to make sure that the file was fully writable (and not truncated). and if there was no error, do a rename+overwrite onto config.php
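
The write-then-rename idea can be sketched roughly as follows. This is a language-agnostic illustration in Python, not Nextcloud's PHP code; the function name `write_config_atomically` is hypothetical. The assumption is that a full disk surfaces as an error while writing or syncing the part file, so the rename is never reached and the existing file stays intact.

```python
import os


def write_config_atomically(path, content):
    """Write `content` to `path` via a ".part" file and an atomic rename."""
    part_path = path + ".part"
    try:
        with open(part_path, "w", encoding="utf-8") as part:
            part.write(content)
            part.flush()
            # Force the data to disk so an out-of-space error shows up here,
            # before the real config file is touched.
            os.fsync(part.fileno())
    except Exception:
        # Writing failed: drop the partial file and keep the old config as-is.
        if os.path.exists(part_path):
            os.remove(part_path)
        raise
    # Only a fully written part file ever replaces the real config.
    os.replace(part_path, path)
```

The part file lives in the same directory as the target so that the rename stays on one filesystem, which is what makes it atomic on POSIX systems.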

PVince81 avatar Jun 13 '22 07:06 PVince81

@PVince81 I agree that an atomic write would be the way to go.

Here are some more issues on the forums, with people even losing access to their encrypted files:

  • https://help.nextcloud.com/t/sudden-error-it-looks-like-you-are-trying-to-reinstall-your-nextcloud-however-the-file-can-install-is-missing-from-your-config-directory-please-create-the-file-can-install-in-your-config-folder-to-continue/68897/4
  • https://help.nextcloud.com/t/recover-encrypted-files-after-lost-config-php/73297/9

tzugen avatar Jun 13 '22 07:06 tzugen

idea: when writing config.php perhaps it first needs to be written into a part file "config.php.part" to make sure that the file was fully writable (and not truncated). and if there was no error, do a rename+overwrite onto config.php

This sounds good. Might still want to add a nonce to the name. Otherwise two write operations could still create a race condition:

  • request one opens config.php.part for writing and writes its content
  • request two opens config.php.part for writing
  • request one copies the now empty file
  • request three reads the now empty file
  • request two actually writes to config.php.part
  • request two copies its final config.php.part
  • request three starts with an empty config and initializes it with instanceid
  • request three writes its config to disk via config.php.part.

This is much less likely to happen, but I think including a random string or the request id or so in the file name would avoid it entirely.
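
A rough sketch of the unique-name variant, again in Python rather than Nextcloud's PHP and with a hypothetical function name: `tempfile.mkstemp` picks a random, exclusively created file name in the config directory, so two concurrent writers never share a part file. Note that `mkstemp` creates the file with owner-only permissions, which a real implementation would have to adjust to match the web server setup.

```python
import os
import tempfile


def write_config_with_unique_part(path, content):
    """Write `content` to `path` through a uniquely named part file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Random suffix per write, created exclusively: no two requests share it.
    fd, part_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + ".", suffix=".part"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as part:
            part.write(content)
            part.flush()
            os.fsync(part.fileno())
        # Atomic on POSIX: readers see either the old or the new file in full.
        os.replace(part_path, path)
    except Exception:
        # Clean up the part file if it is still around, then re-raise.
        if os.path.exists(part_path):
            os.remove(part_path)
        raise
```

With this, the last rename still wins when two requests write at the same time, but no request can ever observe or publish a half-written file.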

max-nextcloud avatar Aug 09 '22 07:08 max-nextcloud

  • [ ] config file is opened in read/write mode, even for unauthenticated page loads.

    • Changing this to read-only (except for the admin settings pages) would stop the config reset from happening in most cases.
    • config.php may not be the only file that's rewritten on page load, this ideally would be enforced on as many files as possible.

Config file is only opened in write mode if data needs to be written, meaning a configuration var was modified and config_is_read_only is false. I am not sure why you would have a config write on an unauthenticated page load.

  • [ ] it needs to be evaluated why the config file only contains minimal data (instance ID only), but is not truncated (as I would expect with broken filesystem writes). This points at a deeper problem with the config handling, or rather that a bad state is triggered through a full disk without being caught before writing the config file.

What I can see is that if reading the configuration file fails, it will still write the configuration data if a var is set, meaning it will write an empty configuration object. An easy way to protect against this is to add a check in Config::writeData that at least version and installed are set for instance.
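
The kind of check meant here could look roughly like this. It is a Python illustration only, not the actual Config::writeData code; the names `REQUIRED_KEYS`, `IncompleteConfigError` and `check_config_before_write` are made up for the example, and the two required keys are simply the ones named above.

```python
# Keys that any installed instance is assumed to have (taken from the comment above).
REQUIRED_KEYS = ("version", "installed")


class IncompleteConfigError(Exception):
    """Raised when a config that is about to be written looks implausibly empty."""


def check_config_before_write(config):
    """Refuse to persist a config dict that lacks the required keys."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise IncompleteConfigError(
            "refusing to write config, missing keys: " + ", ".join(missing)
        )
```

A config holding only `instanceid`, like the truncated files reported in this thread, would be rejected by such a check instead of overwriting the real file.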

I was unable to reproduce the problem on master using a small tmpfs as config directory and filling it. Does the data directory need to be full to trigger the problem?

come-nc avatar Sep 06 '22 10:09 come-nc

It is the call to OC_Util::getInstanceId() that will generate and write a new instance id if missing, and write a configuration file with only this if configuration was not loaded correctly.
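
To make that mechanism concrete, here is a toy model in Python. It is not Nextcloud's code: `ToyConfig`, its `loaded_ok` flag and the way the id is generated are all assumptions for illustration. The point is only that a "generate and store if missing" helper becomes harmless once writes are refused whenever the preceding read failed.

```python
import secrets


class ToyConfig:
    """Tiny stand-in for a config store that remembers whether loading worked."""

    def __init__(self, values, loaded_ok):
        self.values = dict(values)
        self.loaded_ok = loaded_ok
        self.written = None  # what would have been written to disk, if anything

    def set_value(self, key, value):
        if not self.loaded_ok:
            # The in-memory state is not trustworthy, so never persist it.
            raise RuntimeError("config was not loaded correctly; refusing to write")
        self.values[key] = value
        self.written = dict(self.values)


def get_instance_id(config):
    """Return the instance id, generating and storing one if it is missing."""
    instance_id = config.values.get("instanceid")
    if instance_id is None:
        instance_id = "oc" + secrets.token_hex(5)
        config.set_value("instanceid", instance_id)
    return instance_id
```

Without the `loaded_ok` guard, a failed read followed by `get_instance_id` would persist a config containing nothing but the freshly generated id, which matches the files shown earlier in this thread.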

come-nc avatar Sep 06 '22 10:09 come-nc

@come-nc thanks for looking into this! :tada:

I was unable to reproduce the problem on master using a small tmpfs as config directory and filling it. Does the data directory need to be full to trigger the problem?

Yes, I think so, though I haven't reproduced it myself for a while. https://github.com/nextcloud/server/issues/18620#issuecomment-570187308 mentions that having the data dir full / unavailable made it reproducible.

noerw avatar Sep 08 '22 22:09 noerw