neos-development-collection icon indicating copy to clipboard operation
neos-development-collection copied to clipboard

Split up `NodeAddress` into dedicated classes

Open bwaidelich opened this issue 2 years ago • 15 comments

Currently the NodeAddress in the Neos\Neos\FrontendRouting namespace has a couple of issues:

  • It encapsulates contentStreamId and workspaceName even though the latter can be derived from the former (there are reasons for this, but it makes this object a little bit quirky)
  • To support linking across sites/CRs (#4441) we would have to add yet another property contentRepositoryId to the object even though that information is not relevant for its most important usecase (serializing a node)

Both not major issues, but it turns out that this object is used in many many places already for completely different purposes which it was never meant to be used for:

  • As DTO just to pass contentStreamId, DSP and node id around (~30 cases like this one)
  • To determine whether the node is in live workspace (~10 cases like this one
  • To serialize the node address (~40 cases like this one)

IMO only the last usecase fits the object and it would be worth to introduce dedicated DTOs for each:

  • One for addressing a single node globally
  • One for (de)serializing a node from/to a string/URI

bwaidelich avatar Sep 27 '23 14:09 bwaidelich

Suggestion

Both usecases (addressing and (de)serializing nodes) seem to be very common, so I would suggest to add those new DTOs to the Neos.ContentRepository.Core package:

namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeIdentity {
    public function __construct(
        public ContentRepositoryId $contentRepositoryId,
        public ContentStreamId $contentStreamId,
        public DimensionSpacePoint $dimensionSpacePoint,
        public NodeAggregateId $nodeAggregateId,
    ) {}
}
namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeSerializer {
    public function __construct(
        private WorkspaceName $workspaceName,
        private DimensionSpacePoint $dimensionSpacePoint,
        private NodeAggregateId $nodeAggregateId,
    ) {}

    public static function fromString(string $value): self { /* ... */ }

    public static function fromUri(UriInterface $uri): self { /* ... */ }

    public function toString(): string { /* ... */ }

    public function toQueryString(UriInterface $uri): UriInterface { /* ... */ }
}

And maybe a third one to be able to (de)serialize nodes with their CR id – Alternatively the NodeSerializer could have parameters for this

bwaidelich avatar Sep 27 '23 14:09 bwaidelich

Since the NodeAdress is a concept that will be used frequently by the user we should definitely polish this concept.

Main usecase would be to serialize a node into the html so a javascript application can used it as get parameter and on the server we deserialize it.

mhsdesign avatar Nov 10 '23 13:11 mhsdesign

Main usecase would be to serialize a node into the html so a javascript application can used it as get parameter and on the server we deserialize it.

Just a little side note: It might be tempting to add corresponding methods to the Value Object itself, but I would suggest to keep URL/HTML/JSON,.... specifics out of it and have a separate class for (de)serialization (as suggested above)

bwaidelich avatar Nov 10 '23 14:11 bwaidelich

yes separating this makes perfect sense!

i think we need two (or three) main serialization formats that we would use ourselves:

1. uri parameter serialization ?node={workspace}__{dimensionBase64}__{nodeId}

  • example: ?node=user-admin__eyJsYW5ndWFnZSI6ImVuX1VTIn0%3D__995c9174-ddd6-4d5c-cfc0-1ffc82184677
  • used for routing, as only live nodes have speaking url paths
  • could be used by neos integrators when serializing nodes into the frontend js application
  • deserialization needs crId (which is known through the host http://127.0.0.1:8081 because it maps to as site entity which maps to a content repository id)

2. custom node protocol serialization node://{nodeId}

  • example node://f0eb464e-4701-d909-7023-34f06056bf7c
  • used for simple linking in ck editor
  • deserialization needs crId, dsp, csid

3. NEW fully qualified serialization format possibly json with the fields contentRepositoryId, workspaceName, dimensionSpacePoint and nodeAggregateId

  • example {"contentRepositoryId":"default","workspaceName":"live","dimensionSpacePoint":{"language":"en_US"},"nodeAggregateId":"995c9174-ddd6-4d5c-cfc0-1ffc82184677"}
  • could be used for fusion uncached @cache de(serialization) https://github.com/neos/neos-development-collection/issues/4732
  • could be used by neos integrators when serializing nodes into the frontend js application, especially needed when working with multiple cress
  • deserialization doesnt need anything

4. NEW fully qualified custom node protocol serialization node://{crId}__{workspace}__{dimension}__{nodeId}

  • example node://default__user-admin__eyJsYW5ndWFnZSI6ImVuX1VTIn0%3D__995c9174-ddd6-4d5c-cfc0-1ffc82184677
  • could be used for cross dimension & content repository linking see https://github.com/neos/neos-development-collection/issues/4441
  • deserialization doesnt need anything

mhsdesign avatar Nov 12 '23 11:11 mhsdesign

I don't see why 1, 3 and 4 are separate things, you can transport json via uri perfect fine, and why would you ever want a non fully qualified format. That said 4 (but please more sensible separator character) could work too for me, just that json also works in other "contexts" than a URI.

I alson wonder about 2, the stored uris here need to be fully addressable in terms of the cr, so it needs to contain the nodeAggregate and ContentRepositoryId. How else would you cross link to a node in another cr?

kitsunet avatar Nov 12 '23 11:11 kitsunet

I don't see why 1, 3 and 4 are separate things

Yes, I was wondering the same; can't we just come up with a common serialization format that we use in URIs (and potentially in other places)?

Some central "serializer" class that can convert node addresses to JSON vice versa. For example (slightly different from my suggestion above):

final readonly class NodeSerializer {
    private function __construct() {}

    public static function nodeAddressFromJson(string $json): NodeAddress { /* ... */ }

    public static function nodeAddressToJson(NodeAddress $nodeAddress): string { /* ... */ }
}

By the way: With #4708 we might not need the contentStreamId at all in the NodeAddress! And, related to that: I think we should never have contentStreamId and workspaceName in one DTO.

Question: What about the CR Id that is part of 3. and 4. but not of 1. (NodeAddress)? It could be two separate methods on that serializer above, along the lines of:

// ...

    public static function contentRepositoryIdFromJson(string $json): ContentRepositoryId { /* ... */ }

    public static function NodeAddressAndContentRepositoryIdToJson(NodeAddress $nodeAddress, ContentRepositoryId $contentRepositoryId): string { /* ... */ }

...but I'm not sure about that.

I alson wonder about 2, the stored uris here need to be fully addressable in terms of the cr, so it needs to contain the nodeAggregate and ContentRepositoryId. How else would you cross link to a node in another cr?

The idea was to keep support for: node://<node-id> to resolve nodes of the current CR and add support for node://<cr-id>/<node-id> or node://<node-id>@<cr-id> for cross CR links, but the ConvertUris implementation does not yet support this (see #4441)

bwaidelich avatar Nov 12 '23 12:11 bwaidelich

This suggests to me the NodeAddress should be a more internal thing that is localised to hand addresses around within a defined content repository. for any more "public" use it seems important to know the content repository as well?

As in it should never be serializable because it is incomplete and would always need further context to work?

kitsunet avatar Nov 12 '23 12:11 kitsunet

To make a further point, we already have an absolute serialisation format that is highly specialised, the live URI. This can include the CR hidden in the domain, the dimensions, the nodeAggregate and by virtue of being specialised the workspace. So it's a fine serialization format as it includes all necessary information to deserialize into a specific node.

kitsunet avatar Nov 12 '23 12:11 kitsunet

After our discussion today, I would suggest to replace the contentStreamId from the DTO and refer to the workspaceName instead:

namespace Neos\ContentRepository\Core\SharedModel\Node;

final readonly class NodeIdentity {
    public function __construct(
        public ContentRepositoryId $contentRepositoryId,
        public WorkspaceName $workspaceName,
        public DimensionSpacePoint $dimensionSpacePoint,
        public NodeAggregateId $nodeAggregateId,
    ) {}
}

because we probably don't want to expose the contentStreamId most of the cases (we'll probably still need a separate DTO for this for internal node addressing)

One of the goals would be to replace the Neos\Neos\FrontendRouting\NodeAddress in almost all places.

Another goal is to make mapping of nodes in controllers easier, for example:

final class SomeController extends AbstractController {
  public function __construct(private readonly NodeSerializer $nodeSerializer) {}

  public function someAction(NodeIdentity $nodeIdentity): void {
    $node = $this->nodeSerializer->nodeFromIdentity($nodeIdentity);
  }
}

Note: We won't be able to allow a mapping to the Node read model because it does not contain all required information (e.g. workspace name) – but mapping entities is a very bad practice anyways *g

bwaidelich avatar Feb 09 '24 11:02 bwaidelich

Just to add another note from the previous weekly (9.2.). We discussed that a simple json presentation of the NodeIdentity should replace the current NodeAdress->serializeToUri. That means we do not need any special formats but simple json.

The reasons for this were that delimiting by special characters like _ is always tedious to work with and might even cause odd errors like: https://github.com/neos/neos-development-collection/issues/4345

Instead a simple json string representation like {"contentRepositoryId":"default","workspaceName":"live","dimensionSpacePoint":{"language":"en_US"},"nodeAggregateId":"995c9174-ddd6-4d5c-cfc0-1ffc82184677"} contains all informations and is self describing.

About the possible implications that the contentRepositoryId is over-specified once by NodeIdentity and by being inferred to the urls domain, we agreed that this is merely and aesthetic problem with no technical downsides.

mhsdesign avatar Feb 16 '24 13:02 mhsdesign

@grebaldi and me discussed this matter. We want to keep the encoding short as query param space is limited and also to reduce overhead in the redux store where we load all node identities.

For our main backend neos/content/ for a clear intention we propose that we use actual slashes and routing:

pro

  • easy to read (as developers) and easy to possibly modify (goto node x)

con

  • hard to use (custom flow routing configuration with routing arguments)
localhost:8080/neos/content/default/live/language=en_US&audience=intelligent+people/995c9174-ddd6-4d5c-cfc0-1ffc82184677

alternative as root level query params

localhost:8080/neos/content?contentRepositoryId=default&workspaceName=live&dimensionSpacePoint%5Blanguage%5D=en_US&diemension%5Baudience%5D=nice+people&nodeAggregateId=995c9174-ddd6-4d5c-cfc0-1ffc82184677

And for service endpoints where we want to serialise the node identity as query parameter for simplicity

pro

  • easy to use, simple decode as query parameter into any endpoint

con

  • hard to read and debug, due to the escaping of special characters (only _-.; are allowed)
localhost:8080/neos/service-get-title-of-node?node=default_live_GFuZ3VhZ2U9ZW5fVVMmYXVkaWVuY2U9aW50ZWxsaWdlbnQrcGVvcGxl_995c9174-ddd6-4d5c-cfc0-1ffc82184677

The dimensionspacepoint will be calculated via this expression:

base64_encode(http_build_query(['language' => 'en_US', 'audience' => 'nice people']));

which results in a shorter string than if json_encode is used instead of http_build_query

mhsdesign avatar Mar 15 '24 12:03 mhsdesign

does anyone really need to read this? Also the latter variant somehow contains space characters? implicit ordering might also be dangerous-ish? aaand base64 can add = as padding wihch then is not allowed in get parameters. so I guess rather urlencode is needed....

kitsunet avatar Mar 15 '24 12:03 kitsunet

Actually why ont combine both to some degree?

default/live/language%3Den_US%26audience%3Dintelligent%2Bpeople/995c9174-ddd6-4d5c-cfc0-1ffc82184677

would work both as path part as well as node= GET param. The url encode is strictly not necessary when used as path segment but doesn't harm either. (and I guess the http framework takes care of that anyways so we don't have to).

Yes we need a route part handler for this, but that seems somewhat trivial?

kitsunet avatar Mar 15 '24 12:03 kitsunet

i like the idea of being compatible for both.

what do you say about it:

yield 'no dimensions' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('default'),
        WorkspaceName::forLive(),
        DimensionSpacePoint::createWithoutDimensions(),
        NodeAggregateId::fromString('marcus-heinrichus')
    ),
    'serialized' => 'default/live//marcus-heinrichus'
];

yield 'one dimension' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('default'),
        WorkspaceName::fromString('user-mh'),
        DimensionSpacePoint::fromArray(['language' => 'de']),
        NodeAggregateId::fromString('79e69d1c-b079-4535-8c8a-37e76736c445')
    ),
    'serialized' => 'default/user-mh/language%3Dde/79e69d1c-b079-4535-8c8a-37e76736c445'
];

yield 'two dimensions' => [
    'nodeAddress' => NodeAddress::create(
        ContentRepositoryId::fromString('second'),
        WorkspaceName::fromString('user-mh'),
        DimensionSpacePoint::fromArray(['language' => 'en_US', 'audience' => 'nice people']),
        NodeAggregateId::fromString('my-node-id')
    ),
    'serialized' => 'second/user-mh/language%3Den_US%26audience%3Dnice%2Bpeople/my-node-id'
];

this is only bit weird:

default/user-mh//79e69d1c-b079-4535-8c8a-37e76736c445

dsp is encoded via:

parse_str(urldecode($encoded), $parsed);
return self::instance($parsed);
return urlencode(http_build_query($this->coordinates));

PR: -> https://github.com/neos/neos-development-collection/pull/5067

mhsdesign avatar May 18 '24 15:05 mhsdesign

So we decided for json (see https://github.com/neos/neos-development-collection/issues/5072). And now back to the leftover tasks. With https://github.com/neos/neos-development-collection/pull/4892 the Neos\Neos\FrontendRouting\NodeAddress will be deprecated. Following things will need to be adjusted:

  • [ ] Neos.Neos testing framework. The behat BrowserTrait provides some helpers based on the old node address but is fully unused in 9.0 and not working
  • [ ] Neos.Neos NodeIdentityConverterAspect, will be fixed with https://github.com/neos/neos-development-collection/issues/5069
  • [x] Neos.Neos in a comment of the AssetUsageNodeAddress
  • [x] Neos.Neos in the NodeAddressToStringConverter, but why do we have it either way?
  • [x] Neos.Workspace.Ui (formally Neos.Neos and its WorkspaceController) for node de- and serialisation
  • [x] Neos.Neos content element wrapping ContentElementWrappingService and ContentElementEditableService
  • [x] Neos.Neos data sources DataSourceController
  • [x] Neos.Neos Backend\ContentController::uploadAssetAction
  • [x] Neos.Neos NodesController (what is the usecase???!, for the ui and the link editor to list possible targets?)
  • [x] Neos.Neos.Ui node de- and serialisation, see for example the NeosUiNodeService

mhsdesign avatar Jun 17 '24 18:06 mhsdesign

Thanks for all the discussions over the past year. With the merge of https://github.com/neos/neos-development-collection/pull/5267 we can finally resolve the topic! The old node address is dead long live the node address!

mhsdesign avatar Oct 16 '24 21:10 mhsdesign