
Fix issue #3991 to be able to import French Transactions Degiro PDF

Open couclock opened this issue 9 months ago • 3 comments

Issue: https://github.com/portfolio-performance/portfolio/issues/3991

Describe the bug

Using the latest release of Portfolio Performance, I run into an issue when I try to import the Degiro PDF of my latest transactions. A few transactions are reported as invalid, so I am not able to import my transactions.

To Reproduce

  • Export Transactions from Degiro in PDF format
  • Import the PDF file in Portfolio Performance: File > Import > PDF Banks Document
  • The PDF is not imported because of some invalid lines (see attached screenshot)

Here is what I get when I use File > Import > Debug: Create text from PDF ... :

PDFBox Version: 1.8.17
Portfolio Performance Version: 0.68.4
-----------------------------------------
Transactions du 01-03-2024 au 03-05-2024
Date Heure Produit Code ISIN Place Lieu Quantit Cours Montant devise Montant Taux de Frais deboursiè d'exécution é locale change courtage Montant négocié
16-04-2024 21:56 ADR ON AMBEV US02319V1035 NSY CDED -150 2,265 USD 339,75 USD 319,95 EUR 1,0619 -2,00 EUR 317,95 EUR
03-04-2024 10:21 ISHARES PHYSICAL GOLD IE00B4ND3602 XET XETA 25 40,986 EUR -1 024,65 EUR -1 024,65 EUR -3,00 EUR -1 027,65 EUR
ETC
02-04-2024 21:04 EMERSON ELECTRIC COMPA US2910111044 NSY XNAS 4 113,05 USD -452,20 USD -420,11 EUR 1,0764 -2,00 EUR -422,11 EUR
02-04-2024 21:03 VISA INC. US92826C8394 NSY BATS 3 278,35 USD -835,05 USD -775,79 EUR 1,0764 -2,00 EUR -777,79 EUR
02-04-2024 21:02 ALPHABET INC. - CLASS A US02079K3059 NDQ CDED 3 153,96 USD -461,88 USD -429,14 EUR 1,0763 -2,00 EUR -431,14 EUR
02-04-2024 20:58 MASTERCARD US57636Q1040 NSY BATS 1 478,9 USD -478,90 USD -444,91 EUR 1,0764 -2,00 EUR -446,91 EUR
INCORPORATE
26-03-2024 10:53 ISHARES CORE S&P 500 IE00B5BMR087 XET XETA 6 507,64 EUR -3 045,84 EUR -3 045,84 EUR -1,00 EUR -3 046,84 EUR
UCITS ETF USD (ACC)
26-03-2024 10:27 ISHARES PROP US IE00B1FZSF77 EAM XAMS -4 25,25 EUR 101,00 EUR 101,00 EUR EUR 101,00 EUR
26-03-2024 10:17 SPDR RUSSELL 2000 US IE00BJ38QD84 XET XETA -34 54,95 EUR 1 868,30 EUR 1 868,30 EUR -1,00 EUR 1 867,30 EUR
SMALL CAP UCITS ETF
26-03-2024 10:13 ISHARES PROP US IE00B1FZSF77 EAM XAMS -12 25,25 EUR 303,00 EUR 303,00 EUR -3,00 EUR 300,00 EUR
26-03-2024 10:05 VANGUARD USDTRBOND IE00BZ163M45 EAM XAMS -32 19,925 EUR 637,60 EUR 637,60 EUR -1,00 EUR 636,60 EUR
13-02-2024 20:49 AGNICO EAGLE MINES LIM CA0084741085 NSY CDED -10 44,71 USD 447,10 USD 417,62 EUR 1,0706 -2,00 EUR 415,62 EUR
07-02-2024 09:05 ISHARES PHYSICAL GOLD IE00B4ND3602 XET XETA 27 36,774 EUR -992,90 EUR -992,90 EUR -3,00 EUR -995,90 EUR
ETC
22-01-2024 17:18 PAYPAL HOLDINGS INC. US70450Y1038 NDQ SOHO 3 65,475 USD -196,43 USD -180,42 EUR 1,0887 -2,00 EUR -182,42 EUR
22-01-2024 16:56 ADR ON BRITISH AMERICAN US1104481072 NSY CDED 7 29,91 USD -209,37 USD -192,35 EUR 1,0885 -2,00 EUR -194,35 EUR
TOBACCO PLC
22-01-2024 16:51 UNITEDHEALTH GROUP INC US91324P1021 NSY XNAS 1 508,4 USD -508,40 USD -466,94 EUR 1,0888 -2,00 EUR -468,94 EUR
22-01-2024 16:31 ELEVANCE HEALTH INC US0367521038 NSY BATS -1 469 USD 469,00 USD 430,59 EUR 1,0892 -2,00 EUR 428,59 EUR
22-01-2024 16:30 MICRON TECHNOLOGY INC US5951121038 NDQ XNAS -4 88,31 USD 353,24 USD 324,22 EUR 1,0895 -2,00 EUR 322,22 EUR
flatexDEGIRO Bank Dutch Branch, qui opère sous le nom de www.degiro.fr Compte
DEGIRO, est la succursale néerlandaise de flatexDEGIRO Bank AG. [email protected]
flatexDEGIRO Bank AG est principalement supervisée par le
régulateur financier allemand (BaFin). Aux Pays-Bas, flatexDEGIRO Amstelplein 1 1096 HA 2024-05-03
Bank Dutch Branch est enregistrée auprès de la DNB et supervisée Page 1 / 1
par l'AFM et la DNB.

After analyzing the errors against the source code:

1st error with lines

02-04-2024 20:58 MASTERCARD US57636Q1040 NSY BATS 1 478,9 USD -478,90 USD -444,91 EUR 1,0764 -2,00 EUR -446,91 EUR
22-01-2024 16:31 ELEVANCE HEALTH INC US0367521038 NSY BATS -1 469 USD 469,00 USD 430,59 EUR 1,0892 -2,00 EUR 428,59 EUR

The value right after the share count (the price: 478,9 and 469) does not have the expected number of decimals ({2,6}): it has only one, or even none.
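To illustrate the first error, here is a minimal sketch of how a mandatory `{2,6}` decimal quantifier rejects values such as 478,9 and 469. Both patterns and the class name `DecimalCountSketch` are hypothetical and simplified for illustration; they are not the regexes used in the Degiro extractor.

```java
import java.util.regex.Pattern;

class DecimalCountSketch {
    // Hypothetical strict pattern: a comma followed by 2 to 6 decimals is mandatory.
    static final Pattern STRICT = Pattern.compile("\\d+,\\d{2,6}");

    // Hypothetical relaxed pattern: the decimal part is optional and may be a single digit.
    static final Pattern RELAXED = Pattern.compile("\\d+(,\\d{1,6})?");

    static boolean matches(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Prices taken from the debug output above.
        for (String price : new String[] { "113,05", "478,9", "469" }) {
            System.out.println(price
                    + " strict=" + matches(STRICT, price)
                    + " relaxed=" + matches(RELAXED, price));
        }
    }
}
```

With these assumed patterns, the strict variant accepts only 113,05, while the relaxed variant accepts all three values.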

2nd error with line

26-03-2024 10:27 ISHARES PROP US IE00B1FZSF77 EAM XAMS -4 25,25 EUR 101,00 EUR 101,00 EUR EUR 101,00 EUR

The fee amount is not set; only the fee currency appears. This is because the transaction has been split and the fee has been charged only once.
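The second error can be sketched in the same way: the fee amount becomes an optional group, and a missing amount is read as zero. The pattern below only covers the tail of a line (fee, fee currency, total, total currency), and the names `OptionalFeeSketch` and `feeOf` are hypothetical. Treating a missing fee as zero is an assumption made for this sketch, not necessarily what the PR does.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalFeeSketch {
    // Hypothetical pattern for the end of a line: "[fee] feeCurrency total totalCurrency".
    // The fee amount is wrapped in an optional non-capturing group.
    static final Pattern TAIL = Pattern.compile(
            "^(?:(?<fee>-?\\d+,\\d{2}) )?(?<feeCurrency>[A-Z]{3}) "
            + "(?<total>-?\\d+,\\d{2}) (?<totalCurrency>[A-Z]{3})$");

    static BigDecimal feeOf(String tail) {
        Matcher m = TAIL.matcher(tail);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line tail: " + tail);
        }
        String fee = m.group("fee"); // null when the optional group did not participate
        // Assumption: a missing fee amount means no fee on this partial execution.
        return fee == null ? BigDecimal.ZERO : new BigDecimal(fee.replace(',', '.'));
    }

    public static void main(String[] args) {
        System.out.println(feeOf("-3,00 EUR 300,00 EUR")); // fee present
        System.out.println(feeOf("EUR 101,00 EUR"));       // fee amount missing
    }
}
```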

This PR fixes all of the issues mentioned above.

Expected behavior

I expect every Transactions PDF generated from the Degiro website to import smoothly into Portfolio Performance.

Screenshots

The attached screenshot shows the error I get. (screenshot_error)

The second screenshot shows the PDF as exported by Degiro. (screenshot_PDF_file_to_import)

Desktop (please complete the following information):

  • OS: Linux Ubuntu
  • Version: 0.68.4

couclock avatar May 04 '24 18:05 couclock

Hello @couclock Thank you for the pull request. That looks good. 👍🏻

We always check all transactions in a PDF debug as well as all securities that are determined. Could you please add these?

One more question... what is the "id" responsible for or how should it be processed further?

Regards Alex

Nirus2000 avatar May 07 '24 10:05 Nirus2000

We always check all transactions in a PDF debug as well as all securities that are determined. Could you please add these?

Do you mean in UT (name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/pdf/degiro/DegiroPDFExtractorTest.java) ? If so, sure, I'll do it.

One more question... what is the "id" responsible for or how should it be processed further?

This "id" attribute on Section is the only way I found to identify which section (and especially its regex) is applied on a line. Without it, it's almost impossible to find the right section by comparing only regex that are so close.

Thanks a lot for your first feedback.

PS: I noticed "Account statement" PDF is not yet supported in French, do you know if anyone is working on it ?

couclock avatar May 07 '24 11:05 couclock

Nice one. I think the "id" would be helpful in the future to refer to a regex. As you mentioned, finding the exact regex is like looking for a needle in a haystack.

Nonnonnoo avatar May 07 '24 15:05 Nonnonnoo

Hello @couclock

Do you mean in UT (name.abuchen.portfolio.tests/src/name/abuchen/portfolio/datatransfer/pdf/degiro/DegiroPDFExtractorTest.java) ? If so, sure, I'll do it.

Yes, please. 👍🏻 After adding the missing test cases, we can transfer them to the master branch.

Nirus2000 avatar May 13 '24 17:05 Nirus2000

@couclock writes:

This "id" attribute on Section is the only way I found to identify which section (and especially its regex) is applied on a line. Without it, it's almost impossible to find the right section by comparing only regex that are so close.

I like the idea of the "id". If included in failure messages, it lets you jump quickly to the relevant code.

However, at the moment

  • the 'id' is never used anywhere in the code
  • there are duplicate 'id' values (for example withoutExchangeRate-withoutFee-withoutStockExchangePlace), which defeats the purpose

My proposal is to a) print the id as part of the exception and b) make sure the id is unique. I'll add another commit.
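A rough sketch of what a) and b) could look like. The `Section` class here is a hypothetical stand-in, not the parser's real `Section` type, and the method names `requireUniqueIds` and `failure` are invented for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

class SectionIdSketch {
    // Hypothetical stand-in for a parser section: an id plus the regex it applies.
    static final class Section {
        final String id;
        final Pattern pattern;

        Section(String id, String regex) {
            this.id = id;
            this.pattern = Pattern.compile(regex);
        }
    }

    // b) Fail fast if two sections share the same id.
    static void requireUniqueIds(List<Section> sections) {
        Set<String> seen = new HashSet<>();
        for (Section section : sections) {
            if (!seen.add(section.id)) {
                throw new IllegalStateException("Duplicate section id: " + section.id);
            }
        }
    }

    // a) Build an exception whose message names the section that failed.
    static IllegalArgumentException failure(Section section, String line) {
        return new IllegalArgumentException(
                "Section '" + section.id + "' could not parse line: " + line);
    }
}
```

With the id in the message, a failed import points directly at the section to inspect instead of requiring a comparison of near-identical regexes.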

buchen avatar May 25 '24 07:05 buchen

merged

buchen avatar May 25 '24 07:05 buchen