server icon indicating copy to clipboard operation
server copied to clipboard

Check return value and improve error handling on certificate manager

Open Messj1 opened this issue 3 years ago • 18 comments

close #33487

With S3 primary storage there was a problem with getting the CA bundle from the storage without having the CA bundle for the connection which causes that the CertificateManager was throwing an Error. (handled in https://github.com/nextcloud/server/commit/bffa67c48beced2147af196a5b63414c113aaad4) This commit improves the handling in CertificateManager and log unexpected behaviors.

this issue is releated to #31605 so it should also be backported to Nextcloud v22

Messj1 avatar Nov 11 '22 02:11 Messj1

/backport to stable25

szaimen avatar Nov 12 '22 01:11 szaimen

/backport to stable24

szaimen avatar Nov 12 '22 01:11 szaimen

/backport to stable23

szaimen avatar Nov 12 '22 01:11 szaimen

Hi and thanks for your pull request :+1:

I would prefer not to use Throwable here. getLocalFile may return false and your suggestion to handle this case properly is valid.

Would you mind updating the PR to throw an exception (one from https://www.php.net/manual/en/spl.exceptions.php should do it) when bundlePath is false.

kesselb avatar Nov 30 '22 10:11 kesselb

@kesselb I don't see the reasons to make those changes since they don't improve or fix something. Maybe unofficial guideline? The developer manual itself has contains no chapter how to handle exception only how to throw them.

There are 2 change requests:

  1. Exception or Throwable

I would prefer not to use Throwable here. getLocalFile may return false and your suggestion to handle this case properly is valid.

Reason?

I think it is more robust if we left the Throwable since it doesn't matter if there is an Exception or an Error. There is a fallback which work in both cases. The goal should be to give the application a chance to finish the work since there is a fallback. This was the root cause which lead to the issue #33487
2) Throwing an Exception

Would you mind updating the PR to throw an exception (one from https://www.php.net/manual/en/spl.exceptions.php should do it) when bundlePath is false.

Your Idea is to throw an Exception which gets catched in the same function?

				if ($bundlePath === false) {
+					throw(new \RuntimeException("Unable to get certificate bundle '" . $certificateBundle . "'. Fallback to default ca-bundle.crt"));
-					$this->logger->warning("Unable to get certificate bundle '" . $certificateBundle . "'. Fallback to default ca-bundle.crt");
-					return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
				}
				$this->bundlePath = $bundlePath;
			}
			return $this->bundlePath;
		} catch (\Throwable $e) {
			$this->logger->error("Error occurred during fetch certificate bundle. Fallback to default ca-bundle.crt", ['exception' => $e]);
			return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
		}

Reason?

The idea was to handle the "exception" directly since it is a known return value. IMHO this it is not directly an error since the source of this exception is unknown and it could be something simple like a poor connection. So the difference is that waning is used instead of an error. The error should occur in the underlying function.

https://www.php-fig.org/psr/psr-3/

error Runtime errors that do not require immediate action but should typically be logged and monitored. warning Exceptional occurrences that are not errors. Example: Use of deprecated APIs, poor use of an API, undesirable things that are not necessarily wrong.

Messj1 avatar Dec 02 '22 22:12 Messj1

I think it is more robust if we left the Throwable since it doesn't matter if there is an Exception or an Error. There is a fallback which work in both cases.

https://stackoverflow.com/questions/6083248/is-it-a-bad-practice-to-catch-throwable is a good read.

You need to be as specific as possible. Otherwise unforeseen bugs might creep away this way. Besides, Throwable covers Error as well and that's usually no point of return. You don't want to catch/handle that, you want your program to die immediately so that you can fix it properly.

The actual bug, that we are not handling getLocalFile properly, would be much harder to spot with catch throwable.

Your Idea is to throw an Exception which gets catched in the same function?

I find it easier to read when the error handling is in one place, especially for the same error handling. But that's not a hill I want to die on.

kesselb avatar Dec 03 '22 20:12 kesselb

https://stackoverflow.com/questions/6083248/is-it-a-bad-practice-to-catch-throwable is a good read.

You need to be as specific as possible. Otherwise unforeseen bugs might creep away this way. Besides, Throwable covers Error as well and that's usually no point of return. You don't want to catch/handle that, you want your program to die immediately so that you can fix it properly.

First of all. It's an opinion to let the script die. Then the end user see an error. => But this behavior seems not to be user friendly. IMHO since there is a legal fallback, the important thing is to inform the administrator with a hint in the log file. (it is the main reason to have a log file ;)

The actual bug, that we are not handling getLocalFile properly, would be much harder to spot with catch throwable.

Yes, the bug was occurred cause of incorrect handling. And also no, the issue was created cause of the missing error/fallback handling.

:+1: Will change it. But to be clear, there is no hidden information at all (since it get logged) and the use of Throwable is legal to use since it was developed for such purposes.

Your Idea is to throw an Exception which gets catched in the same function?

I find it easier to read when the error handling is in one place, especially for the same error handling. But that's not a hill I want to die on.

Thought the same as I started to write the code. Then I asked myself: Is this an error since it is legal to have false as return value?

@kesselb Simple question: Is this an error or warning?

  • [ ] Warning: Code stay as is
  • [ ] Error: return points can be merged (throw error)

Messj1 avatar Dec 03 '22 23:12 Messj1

There is another unhandled case with null as return value from getLocalFile.

View.php

	public function getLocalFile($path) {
		...
		if (Filesystem::isValidPath($parent) and $storage) {
			return $storage->getLocalFile($internalPath);
		} else {
			return null;
		}
	}

@kesselb Should this lead to fallback? Currently the function returns null.

So the new if condition would be something like this:

+				if ($bundlePath === null || $bundlePath === false) {
-				if ($bundlePath === false) {

Messj1 avatar Dec 03 '22 23:12 Messj1

One thing I would change is the logic with has Certificates. Currently if there are no certificates it caches the "system" ca bundle. And then it tries to write it in the storage. Is there any reason for that behavior since the files are then the same.

The try .. catch statement could also be changed inside the if block since it isn't needed to check a variable. And so we don't need it if the file is cached.

public function getAbsoluteBundlePath(): string {
-	try {
	if ($this->bundlePath === null) {
+		try {
-			if (!$this->hasCertificates()) {
-				$this->bundlePath = \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
-			}
+			if ($this->hasCertificates()) {

			if ($this->needsRebundling()) {
				$this->createCertificateBundle();
			}

			$certificateBundle = $this->getCertificateBundle();
			$bundlePath = $this->view->getLocalFile($certificateBundle);
			if ($bundlePath === null) {
				throw(new \RuntimeException("Unable to get certificate bundle cause of invalid path or nonexistent storage '" . $certificateBundle . "'."));
			} else if ($bundlePath === false) {
				throw(new \RuntimeException("Unable to get certificate bundle '" . $certificateBundle . "'."));
			}
			$this->bundlePath = $bundlePath;
+			} else {
+				$this->bundlePath = \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
+			}
-		}
-		return $this->bundlePath;
		} catch (\Exception $e) {
			$this->logger->error("Error occurred during fetch certificate bundle. Fallback to default ca-bundle.crt", ['exception' => $e]);
			return \OC::$SERVERROOT . '/resources/config/ca-bundle.crt';
		}
+	}
+	return $this->bundlePath;
}

Another Task is to handle the following error since it isn't honest to return an invalid fallback.

	public function createCertificateBundle(): void {
		...
		$defaultCertificates = file_get_contents(\OC::$SERVERROOT . '/resources/config/ca-bundle.crt');
		if (strlen($defaultCertificates) < 1024) { // sanity check to verify that we have some content for our bundle
			// log as exception so we have a stacktrace
			$e = new \Exception('Shipped ca-bundle is empty, refusing to create certificate bundle');
			$this->logger->error($e->getMessage(), ['exception' => $e]);
			return;
		}

Last but not least there are plenty of incomplete phpdoc for Example in the View::getLocalFile

-	 * @return string
+	 * @return string|null|false
+	 * @throws InvalidPathException
+	 * @throws NotFoundException if no mount for path existing
+	 * @throws HintException
+	 * @throws ServerNotAvailableException
+	 * @throws \Exception if no user home storage got configured
+	 * @throws \Exception if mount provider name exceeds the limit of 128 characters
+	 * @throws \Exception if the result from storage wrapper was invalid
+	 * @throws \Exception in general if storage got problem with plenty of concrete exceptions
+	 * @throws \Exception if the root storage could not be initialized

But I think those task aren't fit into this pull request.?

Messj1 avatar Dec 04 '22 23:12 Messj1

Thanks for your feedback :+1:

Some context why we ship a ca-bundle copy: https://github.com/nextcloud/server/pull/32963

Documentation for occ security:certificates command: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/occ_command.html#security

Simple question: Is this an error or warning?

I don't have an opinion here.

One thing I would change is the logic with has Certificates. Currently if there are no certificates it caches the "system" ca bundle. And then it tries to write it in the storage. Is there any reason for that behavior since the files are then the same.

It's possible to import certificates with occ security:certificates:import. Uploaded certificates are stored in $datadirectory/files_external/uploads. Uploaded certificates and default ca-bundle are merged to $datadirectory/files_external/rootcerts.crt.

https://github.com/nextcloud/server/blob/f95aa233560b4234ee14f975d49ff19ac989b40e/lib/private/Security/CertificateManager.php#L246-L248

I believe this block is mostly relevant for an update situation. Updater replaced the server files and we have a new /resources/config/ca-bundle.crt bundle. Upgrade (occ upgrade) will initialize the CertificateManager and trigger a rebuild (cf. https://github.com/nextcloud/server/issues/33487, stack trace in the first post).

https://github.com/nextcloud/server/blob/f95aa233560b4234ee14f975d49ff19ac989b40e/lib/private/Security/CertificateManager.php#L241-L252

You are right, this looks weird. hasCertificates() is true when a certificate was imported with occ security:certificates:import. It looks good to me to use the default path when hasCertificates() is false. hasCertificates() is expensive and we should cache the result. I guess it's a bug that we run needsRebundling without user certificates.

getCertificateBundle and getAbsoluteBundlePath can return a different bundle. Luckily getCertificateBundle() is not called externally and we should make the function private (in a different PR because that's an API change and not backportable).

Another Task is to handle the following error since it isn't honest to return an invalid fallback.

True. https://github.com/nextcloud/server/issues/2910#issuecomment-270618653 sounds unusual. I tend to ignore the related code block. There is not a single bug report for the error message on GitHub.

But I think those task aren't fit into this pull request.?

Yes. Should be done in another one.

kesselb avatar Dec 05 '22 22:12 kesselb

@kesselb So it is like it now is. An Error :smile:

The CI pipeline errors are not mine:

  • Still problem with cloning (this time the submodul)
  • Performance test is hanging on some crypto (openssl_seal) problem and uncleaned testing base (locking exceptions)

So there are following tasks left:

  1. Open issue to change the behavior as commented above.

I guess it's a bug that we run needsRebundling without user certificates.

I think the idea was to regenerate certificate bundle if it is missing. But is it nonsense since the getCertificateBundle() is not delivering the real path and has also no fallback in it. So in some race condition it will fail :scream:

Will open new Issue with that.

  1. Add deprecated warning for external call of getCertificateBundle()

getCertificateBundle and getAbsoluteBundlePath can return a different bundle. Luckily getCertificateBundle() is not called externally and we should make the function private (in a different PR because that's an API change and not backportable).

I think we should at least backport an deprecated warning. So user will be informed if they use this API. Any suggestions?

  1. Open issue to extend missing php docs

Messj1 avatar Dec 05 '22 23:12 Messj1

@kesselb I forgot to sign all my commits. They are not verified in github. Should I sign them with rebase and use force push to correct it?

Messj1 avatar Dec 06 '22 15:12 Messj1

I forgot to sign all my commits. They are not verified in github. Should I sign them with rebase and use force push to correct it?

Go ahead if you want to sign them but it's not necessary for us. We require Signed-off-by: for every commit message, and that's already done.

kesselb avatar Dec 06 '22 17:12 kesselb

Hi, please rebase onto master to pull the latest changes to fix some CI failures.

kesselb avatar Dec 07 '22 22:12 kesselb

@pserwylo What went wrong on performance testing? https://github.com/nextcloud/server/actions/runs/3644023086/jobs/6158316871

If works local on my computer without warnings and error if I try to reproduce the error. :unamused:

Messj1 avatar Jan 04 '23 00:01 Messj1

@Messj1 Performance testing is not a required check yet. It's fine to ignore it for now.

kesselb avatar Jan 04 '23 17:01 kesselb

/rebase

kesselb avatar Jan 04 '23 17:01 kesselb

rebased and solved merge conflict.

Messj1 avatar Apr 06 '23 21:04 Messj1

Thanks for your first pull request and welcome to the community! Feel free to keep them coming! If you are looking for issues to tackle then have a look at this selection: https://github.com/nextcloud/server/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

welcome[bot] avatar May 04 '23 19:05 welcome[bot]

/backport to stable26

solracsf avatar May 04 '23 20:05 solracsf

Just wanted to report that I am still hitting an error while upgrading from 26.0.0 to 26.0.1:

Nextcloud or one of the apps require upgrade - only a limited number of commands are available
You may use your browser or the occ upgrade command to do the upgrade
Setting log level to debug
Turned on maintenance mode
Updating database schema
Updated database
An unhandled exception has been thrown:
TypeError: fwrite(): Argument #1 ($stream) must be of type resource, bool given in /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php:161
Stack trace:
#0 /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php(161): fwrite()
#1 /usr/share/webapps/nextcloud/lib/private/Security/CertificateManager.php(247): OC\Security\CertificateManager->createCertificateBundle()
#2 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(129): OC\Security\CertificateManager->getAbsoluteBundlePath()
#3 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(76): OC\Http\Client\Client->getCertBundle()
#4 /usr/share/webapps/nextcloud/lib/private/Http/Client/Client.php(226): OC\Http\Client\Client->buildRequestOptions()
#5 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/Fetcher.php(120): OC\Http\Client\Client->get()
#6 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/AppFetcher.php(86): OC\App\AppStore\Fetcher\Fetcher->fetch()
#7 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/Fetcher.php(190): OC\App\AppStore\Fetcher\AppFetcher->fetch()
#8 /usr/share/webapps/nextcloud/lib/private/App/AppStore/Fetcher/AppFetcher.php(187): OC\App\AppStore\Fetcher\Fetcher->get()
#9 /usr/share/webapps/nextcloud/lib/private/Installer.php(421): OC\App\AppStore\Fetcher\AppFetcher->get()
#10 /usr/share/webapps/nextcloud/lib/private/Updater.php(420): OC\Installer->isUpdateAvailable()
#11 /usr/share/webapps/nextcloud/lib/private/Updater.php(281): OC\Updater->upgradeAppStoreApps()
#12 /usr/share/webapps/nextcloud/lib/private/Updater.php(140): OC\Updater->doUpgrade()
#13 /usr/share/webapps/nextcloud/core/Command/Upgrade.php(225): OC\Updater->upgrade()
#14 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Command/Command.php(255): OC\Core\Command\Upgrade->execute()
#15 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(1009): Symfony\Component\Console\Command\Command->run()
#16 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(273): Symfony\Component\Console\Application->doRunCommand()
#17 /usr/share/webapps/nextcloud/3rdparty/symfony/console/Application.php(149): Symfony\Component\Console\Application->doRun()
#18 /usr/share/webapps/nextcloud/lib/private/Console/Application.php(215): Symfony\Component\Console\Application->run()
#19 /usr/share/webapps/nextcloud/console.php(100): OC\Console\Application->run()
#20 /usr/share/webapps/nextcloud/occ(11): require_once('...')
#21 {main}%

Applying the workaround from https://github.com/nextcloud/server/issues/33487#issuecomment-1273453676 fixed allowed the update to complete.


edit: I guess that is exxpected until https://github.com/nextcloud/server/pull/38091 finally gets merged. 😅

mschilli87 avatar May 11 '23 09:05 mschilli87

Just to follow up: The 26.0.1 -> 26.0.2 update was the first one since 24.0.5 to complete successfully without the need for any workarounds/hacks. Thanks to everybody contributing to fixing this.

mschilli87 avatar May 29 '23 08:05 mschilli87