Document AI: invoice parser result method to retrieve fields
Context: I am using the Document AI PHP library to parse an invoice and retrieve the fields detected by the API's OCR engine with the invoice parser.
Issue: I am not able to extract the fields from the response. Either no method is available or the method isn't documented.
I have checked the documentation here: https://googleapis.github.io/google-cloud-php/#/docs/cloud-document-ai/v0.1.2/documentai/readme
But I can't find the right method to extract the fields described here: https://cloud.google.com/document-ai/docs/processors-list#processor_invoice-processor
I have used: $result = $operationResponse->getResult();
but $result then contains a huge amount of data, including the document itself, and I can't find what I am looking for, even with a var_dump. Is there a method available for this, or a description of the API result structure in the documentation?
Hello,
It would be good to understand whether this is just a documentation issue or a limitation in the current library. Is this library actively supported?
Thank you
Hi @Firebird75, apologies for the late reply on this issue.
I checked the Document AI samples from the second link that you provided.
In the quickstart I can see the following snippet:
$response = $client->processDocument($name, [
    'rawDocument' => $rawDocument
]);
# Print Document Text
printf('Document Text: %s', $response->getDocument()->getText());
You have likely already resolved or worked around this issue, but I am keeping this here for anyone else who comes across it.
Closing this for now, but please feel free to reopen it if there is still something you need clarity on.
Thanks
Hello @saranshdhingra
Thank you for taking the time to reply to this issue.
What I am missing is a more detailed explanation of how to get the fields found by the Document AI invoice processor.
Basically, the invoice processor must have some way to retrieve fields like invoice number, total price, quantity, etc. But I don't know which method to use or how to parse the results. The structure of the result isn't documented (or I haven't found it).
Thank you very much !
Sorry @saranshdhingra, but I can't reopen the issue because you closed it and I am not allowed to do that. Do you want me to open a new issue to track this?
Hi @Firebird75, I checked a little, but I couldn't find custom getters for specific fields.
What I did find is that you can get all the entities that a processor has parsed, using something like this:
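$response = $client->processDocument($name, [
    'rawDocument' => $rawDocument
]);
// Each entity is one field detected by the processor.
foreach ($response->getDocument()->getEntities() as $entity) {
    $type = $entity->getType();       // field name assigned by the processor
    $txt = $entity->getMentionText(); // text of the field as found in the document
    printf("%s: %s\n", $type, $txt);
}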
The full list of getters for an Entity can be found here.
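Building on the loop above, here is a minimal sketch of how the entities could be collected into an array keyed by field type, so that a field can be looked up by name. The helper `collectEntityFields` and the `StubEntity` class are hypothetical names made up for this example and are not part of the library. The type names (`invoice_id`, `line_item`, `line_item/quantity`) are only assumed examples, so print `getType()` on your own documents to see what the processor actually returns. As far as I can tell, `getConfidence()` and `getProperties()` are available on the Entity class alongside `getType()` and `getMentionText()`, and nested fields such as the parts of a line item come back through `getProperties()`.

```php
<?php
// Hypothetical helper, not part of google/cloud-document-ai.
// Groups entities by type. Every type maps to a list, because a type such
// as a line item can occur several times in one invoice.
function collectEntityFields(iterable $entities): array
{
    $fields = [];
    foreach ($entities as $entity) {
        $item = [
            'text' => $entity->getMentionText(),
            'confidence' => $entity->getConfidence(),
        ];
        // Nested entities (e.g. the parts of a line item) come back from
        // getProperties(); recurse so they stay attached to their parent.
        $children = collectEntityFields($entity->getProperties());
        if ($children !== []) {
            $item['properties'] = $children;
        }
        $fields[$entity->getType()][] = $item;
    }
    return $fields;
}

// Stand-in for the library's Entity class so the sketch runs without the API.
class StubEntity
{
    private $type;
    private $mentionText;
    private $properties;

    public function __construct(string $type, string $mentionText, array $properties = [])
    {
        $this->type = $type;
        $this->mentionText = $mentionText;
        $this->properties = $properties;
    }

    public function getType(): string { return $this->type; }
    public function getMentionText(): string { return $this->mentionText; }
    public function getConfidence(): float { return 1.0; }
    public function getProperties(): array { return $this->properties; }
}

// The type names below are assumed examples, not a documented list.
$demo = collectEntityFields([
    new StubEntity('invoice_id', 'INV-001'),
    new StubEntity('line_item', '2 Widget', [
        new StubEntity('line_item/quantity', '2'),
    ]),
]);

echo $demo['invoice_id'][0]['text'], PHP_EOL; // prints INV-001
echo $demo['line_item'][0]['properties']['line_item/quantity'][0]['text'], PHP_EOL; // prints 2
```

With a real response, the same helper should accept the entity list directly, for example `collectEntityFields($response->getDocument()->getEntities())`, since it only needs something it can iterate over.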
If you have seen any reference to custom getters for each processor, please let me know.
Thanks.
Hi. I'll be closing this for now.
Please feel free to reopen this if more conversation is needed.