phpcr-odm
phpcr-odm copied to clipboard
allow to add uuid as id to proxies
As described in:
https://gist.github.com/ElectricMaxxx/68a7d033a6357e757329
i would wish to create proxies (after calling $dm->getReference('MyClassName', 'uuid');
) with the uuid as the id. This PR will make it possible that the request on the backend is able to handle the uuid.
Normally $session->getNode($id);
will fetch the node. Now there will be a check if the $id
is an uuid and fetch the node by $session->getNodeByIdentifier($id);
, which is able to look for a node by its uuid.
I tried to create more tests for those documents that live as proxies in the UoW, but i do not get it to solve the mocking and i do not know if it is clever to do so, cause if i would mock some implementation depending, we would lose the decoupled view on the backend.
So when the tests are green, i would say that's it! I testet that changes in my library an now it works, I would be so happy if somebody would merge it.
Should i add a chanLog entry?
Cause: @dantleech is right, this new feature can be confusing. But not more confusing then using find()
with the absolute path and uuid.
What does that really change?
DocumentManager::getReference(className, id)
is usable now with className and absolutePath (works fine before) and className and Uuid now. When the document is refreshed this mapping is redone for the identityMap registration.
Ping @dbu time to review or merge?
I will try and have a proper look tomorrow.
i think your idea makes sense. however, i fear its a bit more complicated:
- i could create the proxy with uuid, then find a document by path or uuid. i think in that case i now end up with duplicated documents.
- i could find a document by path and then create the proxy with uuid. this would also create two documents for the same path.
i think we need:
- a map of uuid => path
- a uuidProxy list or something where we keep the uninitialized proxies by uuids so they don't mix up with documents in identityMap
- whenever we create a document and have the node, we check if the uuid of the node already is in uuidProxy list and if so pick the proxy from there, initialize and stuff into identityMap
- whenever we create a proxy with uuid, we check if its already in map of uuid => path and if so create it for the path instead.
i think there is still room for an accident, which is creating a proxy with uuid and then a proxy with path that are both the same node. i don't think we can avoid this without loading the node whenever creating a proxy by uuid, which defies the purpose of the proxy.
@dbu you mean a duplicate document would occur, when somebody creates a proxy by
$dm->getReference(..., uuid);
and will try to fetch a document later by $dm->find(..., $id)
, or in the other direction?
The problem is that the referenced document by uuid can't be detected in the identityMap. OK i understand, and will fix that.
the situation currently would occur for both directions: first getReference, then find. but als first find, later getReference. both can be fixed with the approach i proposed (and maybe there is something better).
the last case probably can't be avoided, that when we getReference twice, once with uuid and once with path, and the uuid ref is not resolved yet, we end up with duplicates. we could detect it when the second reference is resolved and throw an exception to tell the user something is going wrong here.
You mean trying to resolve a proxy on both ways in the same UoW shouldn't be allowed? Ok that makes sense to and is an other reason for proxyMap. Question would be: When to bring its entries to the normal identityMap? When fetching the orig. document, maybe? So do not take care on processes like commit later. I just would need to take care on that map while find()/getReference() on the document manager. But instead of adding an other if-clause into that $dm->find()` i think i will refactor that method complete by a fetching method on UoW. Reason: the find method makes one query to much:
if (UUIDHelper::isUUID($id)) {
try {
$id = $this->session->getNodeByIdentifier($id)->getPath();
} catch (ItemNotFoundException $e) {
return null;
}
} elseif (strpos($id, '/') !== 0) {
$id = '/'.$id;
}
$document = $this->unitOfWork->getDocumentById($id);
when that document is still scheduled, just to know the id.
you are right yes. or you could just create a getPathForUuid to UoW and call that. maybe with a flag parameter to tell if you want to load the node or not if its not known yet.
i think the case that does not work is when you create first both types of proxy (without first resolving one, as then you would detect on the second that you already have it) and then resolve both proxies, you will notice the clash only on the second resolution. and then you can not merge them anymore as they are separate objects that can be referenced from any place.
Will try it that evening.
As reminder for me: Found an other place where $this->session->getNode($id);
is used where the $id can be both: inside the executeRemove()
, there the node is fetched to delete it.
Will change that one to, but does the session have got no removeNode()
or removeNodeByIdentifier
as pendant to the getters?
the phpcr session does $node->remove(), or you can use Workspace::removeItem but only with a path.
i guess we really want methods on the uow to map between uuid and path and to be sure you have the path instead of all the if isUUID all over the place.
Btw: Why does the PHPCR does not have that persister mechanism the ORM has? I know that we do the persist operation on the session and the session is a kind of that, but would be able to hide all those stuff in a persister like that:
class Persister
{
public function getNode($identifier){}
public function removeNode($identfier){}
}
By doing that we would get rid of all decision $id/$uuid inside of the UoW. Either a document is stored by id or by uuid inside of the identityMap. The array doesn't matter about the different meanings of the key. We just need to train getDocumentById(), registerDocument(), ... to handle both. I think the UoW is consistent with calling those methods
does orm 2.1 already have this Persister? if not then probably this got refactored after people built phpcr-odm inspired by the orm. if we can refactor a sane part out of the UoW i am very +1 to do that, the UoW is an unwieldy beast and very hard to read and does way too many things.
this might turn into a larger undertaking, but the proxy by uuid is only going to make it into 1.2 anyways.
If somebody just wanna see the reason for this PR just have a look at my failing tests at: https://github.com/ElectricMaxxx/DoctrineOrmOdmAdapter/pull/15
But i will go on my work to fix that now :-)
Soooo, .. i think (and hope) i got it:
- did a little refactoring on
DocumentManager::find()
. One part is just code duplication thing the other one should improve performance. Still mapped documents will be returned first. - replaced all occurrences of
$this->session->getNode($this->getDocumentId($document));
by the givengetNodeForDocument($document)
, which accepts objects and its hashes now. They still have to be mapped. - in that method i switched uuid-mapped documents to id-mapped documents only for sure
I will shortly look for session calls, which don't like uuid's too (removeNode()).
Got them all. Almost all getNode($id) are supported by getNodeByIdentifier($id) for possible uuid. Most of them (13 of 17 occurrences) are done by `DocumentManager::getNodeForDocument' 2 more are done in UoW directly. I will be unrelevant if a document is mapped by its id or its uuid.
i think you are missing the logic to juggle the uuid around. in getOrCreateProxy, you would need to add the proxy to this list of proxies by uuid. that would also avoid you to unregister and re-register. you would simply move the proxy from the proxy by uuid map to the normal map, but checking first if there already is a document at that place which would mean things went wrong.
at the same time you need to keep track of all uuids for documents we know about, so that you can detect if an existing document is asked for in getOrCreateProxy and return that instead of creating a proxy.
Yea i am lost in the jungle .... :-)
The reason why i don't wanted to map the uuid referenced documents in an extra map was:
The getNodeForDocument($document)
method makes me completely independent from the question if a document is references by uuid or the path.
I just think at the moment when everything is flushed or need to be scheduled only. I would have to look for some uuid referenced and path referenced documents. As all documents are referenced by its oid in the schedule array they are independent from the decision.
I would like to have a clear interface structure to fetch/register/unregister Documents from the identityMap which should hide the decision for the identifier and that one method to fetch the node by a document, which uses that methods i said before. I would say :-1: for introducing a second map and try to sync them. When using those methods the referencing key can be everything (i.e. birthday of my grandmother) we just need to take care of it, when doing stuff like $this->session->getNode()
. And then we are in that situation to say: :+1: for the persister first. I would just need a chance to fetch that code here for my project. Maybe you can merge this on an own branch for me? Cause the implementation of persisters would take some days more.
The major problem is: UoW and DocumentManager arn't covert with test, so i can do lots of stuff with green tests (as you can see here) and would never recognize it.
the UoW has 89.06% of the lines covered. DocumentManager has 80%. thats not that bad i think, but i am sure we miss important bits. and especially the error handling and edge condition tests are seriously lacking.
i am still unsure if stuffing uuids into the map that is by document id is a good idea, but lets refactor to this persistor first, following orm as closely as possible. once we have that, the whole code should be more readable and we can see what makes sense or not.
i guess step 1 would be to make sure we have functional tests that cover the methods we are going to touch, to be sure we don't break things. sorry, this is kind of growing and growing. i can try to help with the functional tests if you can list me the methods of UoW that will move to the Persister.
regarding your project needing the code, you can do https://getcomposer.org/doc/05-repositories.md#loading-a-package-from-a-vcs-repository to point composer to your fork, then alias your branch to 1.1.0. (just take care to update composer once this is merged, before deleting the branch in your fork ;-)
Ok, i will do so with my project.
I will try to get an overview tonight and have a closer look what the ORM had done.
@ElectricMaxxx do you want to pick up on this again? seems there are still quite a few loose ends. and it would need a rebase on master, its in conflict.
@dbu thougt we wanted to refactor the complete document-/identiymap storage mechanism first, to get more control about the document.
Think at the end you had some bad feelings if we do not loose a document when it is referenced by its uuid or path. But i think that changes i made in this PR touch a very little part of the paths a document can go through the UoW, that we can use it as it is (just with a rebase to solve merge conflicts). I use that way since weeks in my project (referenced my fork) and never had issues.
The other think I wrote above: lets have a meeting for a UoW-Hack-Weekend to refactor it in one of the future versions. I have many ideas but that less time atm.
yeah, its very very tricky if the same document can be identified with both the uuid and the path. if you want to rebase, lets be sure if we can address all i say in https://github.com/doctrine/phpcr-odm/pull/493#issuecomment-43854355 first.
the hack-weekend sounds like a fun project. only i am very short of weekends... will send you a personal email to discuss this further.
so +1 for wrapping this up. but the builds for jackrabbit fail (seems a confusion between simple and full versionable) and there is a lot of open comments. do you want to pick this up again @ElectricMaxxx
Yes. This would need a rebase too.
Mit freundlichen Grüßen
Maximilian Berghoff
Maximilian Berghoff Wiesenstraße 44 91617 Oberdachstetten
Mail: [email protected] Mobile: +49 151 64825096
Am 18.12.2014 um 10:16 schrieb David Buchmann [email protected]:
so +1 for wrapping this up. but the builds for jackrabbit fail (seems a confusion between simple and full versionable) and there is a lot of open comments. do you want to pick this up again @ElectricMaxxx
— Reply to this email directly or view it on GitHub.
I think instead of rebasing 100 commits from master i will try to extract my changes in a new commit.
you could squash those commits git rebase -i HEAD~14
and then
cherry-pick that commit to a new branch you start from master.
@ElectricMaxxx hm, seems we never wrapped this one upny chance you can clean it up?
@ElectricMaxxx any chance we can wrap this one up?
@ElectricMaxxx do you think we can wrap this one up for 2.0 as well?