flow-development-collection icon indicating copy to clipboard operation
flow-development-collection copied to clipboard

PersistentResource gets persisted if a attached validation fails

Open sorenmalling opened this issue 9 years ago • 13 comments

Description

If a model has a PersistentResource as a property and a custom validator attached, validating ex. the Media Type (allow only jpg and png) the PersistentResource is persisted and stored even though the validation fails.

Steps to Reproduce

  1. Create a model with a PersistentResource as property
  2. Create a validator that returns false
  3. Submit a form with a resource
  4. See in the log
16-12-09 11:35:14 67587      DEBUG     Flow                 Successfully imported the uploaded file "8196730941.webm" into the resource collection "persistent" (storage: "defaultPersistentResourcesStorage", a Neos\Flow\ResourceManagement\Storage\WritableFileSystemStorage. SHA1: f3fa2e994fb478dfc9eac52c5addc51de1f7fb38)
16-12-09 11:35:14 67587      ERROR     Flow                 Validation failed while trying to call Application\RestApi\Controller\ImageController->uploadAction().
Error for data:  The media type "video/webm" is not allowed for this image.

16-12-09 11:35:14 67587      DEBUG     Flow                 FileSystemSymlinkTarget: Published file. (target: localWebDirectoryPersistentResourcesTarget, file: f3fa2e994fb478dfc9eac52c5addc51de1f7fb38/8196730941.webm)

Expected behavior

As the validation of the property returns false, the whole object is considered "invalid" and therefore the resource should not be persisted.

As the import is managed inside the ResourceTypeConverter and it's applied before the validation, the resource is already persisted at that point.

It could be argumented that the validator could clean up, but that's not the task for a validator that only returns true/false.

Somehow a way to interact with the ResourceTypeConverter and be able to have validator(s) validate the input

Actual behavior

The resource is persisted

Affected Versions

Neos: *

Flow: *

sorenmalling avatar Dec 09 '16 13:12 sorenmalling

A idea of how validation of resources could be configured

	/**
	 * Initialize upload action
	 */
	public function initializeUploadAction()
	{
		if (isset($this->arguments['image'])) {
			/** @var PropertyMappingConfiguration $propertyMappingConfiguration */
			$propertyMappingConfiguration = $this->arguments['image']->getPropertyMappingConfiguration();
			$propertyMappingConfiguration->setTypeConverterOption(ResourceTypeConverter::class, 'validators', [
				ImageTypeValidator::class => [
					'allowedTypes' => 'jpeg,png'
				]
			]);
		}

	}

sorenmalling avatar Dec 09 '16 13:12 sorenmalling

Related to #770

kitsunet avatar Feb 07 '17 12:02 kitsunet

@kdambekalns @robertlemke this is a rather problematic behaviour (anyone can spam the server with files) and very core - is this something we should tackle maybe for 4.1 in an effort?

albe avatar Mar 23 '17 10:03 albe

Ok, so how do we proceed? I have a feeling this is pretty tricky to get right.

We could try to somehow postpone the handleFileUploads call until after validation. My first idea would be to move file upload logic into the ResourceRepository (its responsible for persisting Resources and storing the file on storage is part of that), then somehow trigger ResourceRepository->add() only after validation.

Edit: The following is a huge text of my thinking process going off-topic. Sorry for that. Feel free to ignore and directly jump to "END OF OFF-TOPIC" further below:

However, I think we have a generic issue with how we deal with Validation and Type Conversion/Property Mapping:

Currently we first convert raw input values through TypeConverters, which possibly fail or have side effects. After that, we try to Validate the outcoming result with Validators such as IntegerValidator or MediaTypeValidator, which then kick in too late to do anything useful. See also https://github.com/neos/flow-development-collection/issues/835 for a related issue.

I somehow feel we are mixing different kinds of validation. One is input value validation in order to reject invalid (and possibly malicious) input. The other is more domain validation, i.e. keeping some rules inside domain models intact. The latter is the one that necessarily needs to happen AFTER Type Conversion, because it can not work on raw data. The first however should only work on raw data and act as a kind of early filter prior to Type Conversion. Since those two are mixed, both are done after Type Conversion. So my gut feeling tells me, we should move (simple types') Validation before Type Conversion and then apply object/model validations in a second step. Of course this comes with a buckload of issues:

  • it is highly breaking if we get that done at all
  • it is not easy to separate some validators into "input" vs. "domain" validation (e.g. RegexValidator)
  • how do we make the separation explicit? We currently only have the concept of Validation Annotations, so if we only have a unique way of declaring validations it will always end up with mixing
  • it adds a lot of additional complexity to the validation system with two distinct validation chains per argument
  • we need to properly document and explain what the difference in the two validations is

Regarding the 3rd point I could imagine that we keep Annotations only for input validation and use the convention based model validators as domain validation, possibly extended with a way to convert exceptions thrown inside the domain model into validation errors (maybe inside the ObjectTypeConverter - in DDD there should be no invalid models instanciable at all).

Edit: Ok, some more random thoughts after a bit of talk on slack:

The "input value validation" I mention above, is actually a form of "Type" validation. So basically it makes sure that the string input we receive in PHP is in the proper format for a specific Type, like integer, float, DateTime, etc. That however, is the actual concern of the concrete TypeConverters - i.e. they contain the logic to accept a wide range of simple type input values and result in a single return type. For cases such as RegularExpressionValidator validating that some input only takes specific sets of values, the more correct thing would actually be to define a Data Type that only accepts values matching the regular expression and then write a Type Converter for that.

That leaves the job of the actual Validators to further restrict the allowed values inside the Type boundaries, e.g. DateTimeRange, NumberRange, BooleanValue, etc. which in turn leaves the question:

What are the actual use cases for IntegerValidator, FloatValidator, NumberValidator, DateTimeValidator, StringValidator?

The only thing I can think of is to have a string property and make sure that is in the format of an Integer/Float/Number/DateTime/String(?) - but those validators also allow the actual types, which would then break persistence (or if PHP7.x typehints are used). As it looks they only lead to confusion over what the Validators' job is actually - restrict input values into a specific range/set, NOT validate types!

So what does this mean for this case?

END OF OFF-TOPIC

We currently have the Type of PersistentResource, which we try to convert to. That of course implicates that the Resource is already persistent after conversion and hence Validating the media types afterwards is backwards. So there's the ways I see this working out:

  • add an ResourceUpload Type, which is not yet persistent and will somehow be transformed into a PersistentResource after validation (ugh)
  • make the actual persistence of a Resource happen lazily on shutdown, with a failing Validation preventing that (see also #770).
  • make Resources work like normal Domain Models (Aggregates) and shift responsibility to actually persist them to the developer (i.e. they have to call $this->resourceRepository->add($myEntity->image); for the uploaded file to be moved to storage and be persisted in database)

Any thoughts on this @kdambekalns @robertlemke @bwaidelich @kitsunet ?

albe avatar Jul 02 '17 19:07 albe

I'd most like option 2 in the list above… (make the actual persistence of a Resource happen lazily on shutdown, with a failing Validation preventing that.) The other options look like too heavy a change and/or impose work on everyone using this (and of the type we cannot easily code into a core migration…)

kdambekalns avatar Jul 17 '17 08:07 kdambekalns

On the off-topic part above…

  • Yes, for things like RegularExpressionValidator or EmailAddressValidator a proper (value) object ensuring correctness would be the way to go. Just back then we didn't (yet) know…).
  • For model properties the "scalar validators" still make sense, since no property mapping might be involved and reinventing validation would be a waste.

kdambekalns avatar Jul 17 '17 08:07 kdambekalns

@albe wrote in Slack:

Regarding "For model properties the "scalar validators" still make sense, since no property mapping might be involved" - I'm highly interested in where/when that would be the case. I think I'm missing something fundamental

Well, if you work with a model in PHP and, say, set some integer property to a string… then validation will complain when triggered during persistence. (Not needed if working with PHP7 type hints throughout the code, but …)

kdambekalns avatar Jul 17 '17 09:07 kdambekalns

Good point! Thanks! I always forget that objects can actually be changed in the code and not just via request arguments B)

albe avatar Jul 17 '17 09:07 albe

I think I have found a good approach now on how to tackle this finally.

albe avatar Mar 08 '20 14:03 albe

Since my last "good approach" was still a bit... controversial and not so straightforward, maybe my other solution #2348 is something we can get in. It also introduces a kind of feature (hook into after finished object validation), which might be useful in other cases too, but it wasn't implemented for that. It just wasn't hardcoded onto the PersistentResource to keep it a bit more decoupled.

albe avatar Dec 12 '20 23:12 albe

I stumbled upon this when investigating a (failed) attack through a Neos Form using a File Upload field. I think that this is highly problematic. The default component (File Upload field in Form Builder and FileTypeValidator) allows uploading any file type. Even if it can't be executed (depending on server configuration following best practises) and do any direct harm in the application, you can e.g. post a fake Neos login page and send this link to someone. Or if Neos is running a shop, you could deliver some fake checkout page.

lorenzulrich avatar Jan 09 '22 20:01 lorenzulrich

I once tried a different approach during the development of fusion forms. Did not follow this path yet as i it was easier to aim for feature parity first and improve later.

What i tried was not using a persistent resource as target-type for the field but a custom variant of UploadedDocument that stored the actual data in a Cache and serializes as an object with cache-identifier. That way the uploaded file would be persisted across multiple steps. The conversion to persisted resources would then only be done by the finisher / form-action if really required. That would allow to send files via mail-finisher or pass them to backend services without ever importing them into Neos resources.

mficzel avatar Jan 17 '22 16:01 mficzel

I took some time to dig out my old experiments ... https://github.com/PackageFactory/PackageFactory.CachedFileUploads ... totally experimental. The cache approach actually works nicely.

mficzel avatar Jan 19 '22 22:01 mficzel

A different idea: The typeconverter should not be responsible of persisting data at all - but return a resource object, that can be worked with by whatever is next step in the process.

So the (Persistent)Resource instance is carruing the stream, instead of the resource manager being responsible for that, while in this typeconverter context

On the future run, make it possible to remove the PersistentResource ORM model, but instead work with the UploadedFileInterface from the Psr\Http\Message message directly

sorenmalling avatar Feb 27 '23 12:02 sorenmalling

Actually if your controller / forms expect \Neos\Http\Factories\FlowUploadedFile instead of \Neos\Flow\ResourceManagement\PersistentResource the Automatic conversion des not take place at all.

The only problem with that approach is that the files have to be selected again if the validation failed and the form is rendered again. That is basically what my CachedUploadPackage tries to fix.

mficzel avatar Feb 27 '23 14:02 mficzel