Create BankAccount entities from valid IBANs

Open catileptic opened this issue 1 year ago • 0 comments

TODO:

  • [ ] Mentions seem to be missing; investigate

As per alephdata/aleph#3908 and #2066, this is an attempt to create BankAccount FTM entities out of valid IBANs.

In the analysis stage, an IBAN is identified by the existing regex. It is added to the list of Mentions.
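
A minimal sketch of that extraction step, for illustration only. The pattern `IBAN_RE` and the helper `find_iban_candidates` are hypothetical names and are not the regex that ingest-file actually uses; the sketch only shows the idea of collecting IBAN-shaped strings before any validation happens.

```python
import re
from typing import List

# Hypothetical pattern, NOT the regex used by ingest-file: two letters,
# two check digits, then 11 to 30 uppercase alphanumeric characters.
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def find_iban_candidates(text: str) -> List[str]:
    """Return IBAN-shaped strings found in text, in order, without duplicates."""
    seen: List[str] = []
    for match in IBAN_RE.finditer(text):
        candidate = match.group(0)
        if candidate not in seen:
            seen.append(candidate)
    return seen


if __name__ == "__main__":
    sample = "Pay to GB33BUKB20201555555555 or NL63INGB5198491756."
    print(find_iban_candidates(sample))
```

A pattern like this only checks the shape of the string, so every candidate still needs to be validated.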

Then, the IBANs that have been collected as Mentions are validated using schwifty. openiban was also considered, but on the test cases listed below it got 8 of 18 IBANs wrong, against 3 of 18 for schwifty.
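
To show what one part of IBAN validation involves, here is a standalone sketch of the mod-97 checksum test. It is not schwifty's implementation and the function name `iban_checksum_ok` is hypothetical. A full validator would typically check more than this (for example the length and structure expected for each country), so this sketch alone will accept strings a real validator rejects.

```python
def iban_checksum_ok(iban: str) -> bool:
    """Check only the mod-97 checksum (no length or bank-code checks)."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    # Move the country code and the two check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Replace each letter with a number: A=10, B=11, ..., Z=35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    # The IBAN passes if the resulting integer leaves remainder 1 mod 97.
    return int(digits) % 97 == 1


if __name__ == "__main__":
    for value in ("GB33BUKB20201555555555", "GB34BUKB20201555555555"):
        print(value, iban_checksum_ok(value))
```

The second value differs from the first only in its check digits, so it fails the checksum.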

BankAccount entities are created for each valid IBAN, and the IBAN string is added to the iban FTM attribute.
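
As a rough illustration of the resulting data, the sketch below builds a BankAccount entity as a plain dict with `id`, `schema` and `properties` keys. It deliberately avoids the followthemoney library, and both the helper `make_bank_account` and the SHA-1 based ID scheme are assumptions made for this example, not what the branch does.

```python
import hashlib


def make_bank_account(iban: str) -> dict:
    """Build a BankAccount entity as a plain dict (illustrative only)."""
    normalized = iban.replace(" ", "").upper()
    # Hypothetical ID scheme: hash the normalized IBAN so that the same
    # IBAN always maps to the same entity ID.
    entity_id = hashlib.sha1(f"iban:{normalized}".encode("utf-8")).hexdigest()
    return {
        "id": entity_id,
        "schema": "BankAccount",
        # Property values are stored as lists of strings.
        "properties": {"iban": [normalized]},
    }


if __name__ == "__main__":
    print(make_bank_account("GB33 BUKB 2020 1555 5555 55"))
```

Deriving the ID from the IBAN means that the same IBAN found in several documents would resolve to a single entity, which may or may not be the desired behaviour.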

When running Aleph locally, after I ingested a test document (attached here), the BankAccount FTM entities appear in FTM-store, but they don't show up in the Aleph UI. This should be investigated further.

Workaround: re-index the investigation containing the IBANs document. Then, go to Entities (in the sidebar) > Add a new entity type > Bank Account. The newly created entities will appear in the table. They do not appear in the sidebar, though.

IBANs.pdf

(iban) ➜  ingest-file git:(feature/iban) ✗ python test_validator.py
V [schwifty] (0) GB33BUKB20201555555555
V [schwifty] (1) GB94BARC10201530093459
V [schwifty] (2) GB94BARC20201530093459
V [schwifty] (3) GB96BARC202015300934591
X [schwifty] (4) GB02BARC20201530093451
X [schwifty] (5) GB68CITI18500483515538
X [schwifty] (6) GB24BARC20201630093459
V [schwifty] (7) GB12BARC20201530093A59
V [schwifty] (8) GB78BARCO0201530093459
V [schwifty] (9) GB2LABBY09012857201707
V [schwifty] (10) GB01BARC20714583608387
V [schwifty] (11) GB00HLFX11016111455365
V [schwifty] (12) US64SVBKUS6S3300958879
V [schwifty] (13) NL63INGB5198491756
V [schwifty] (14) RO65PORL4435312861931963
V [schwifty] (15) SA7439228561548156293899
V [schwifty] (16) AE560335386651248739596
V [schwifty] (17) ES7401283747341413374686

schwifty got 3 / 18 IBANs wrong
V [openiban] (0) GB33BUKB20201555555555
V [openiban] (1) GB94BARC10201530093459
V [openiban] (2) GB94BARC20201530093459
V [openiban] (3) GB96BARC202015300934591
X [openiban] (4) GB02BARC20201530093451
X [openiban] (5) GB68CITI18500483515538
X [openiban] (6) GB24BARC20201630093459
X [openiban] (7) GB12BARC20201530093A59
X [openiban] (8) GB78BARCO0201530093459
V [openiban] (9) GB2LABBY09012857201707
X [openiban] (10) GB01BARC20714583608387
X [openiban] (11) GB00HLFX11016111455365
X [openiban] (12) US64SVBKUS6S3300958879
V [openiban] (13) NL63INGB5198491756
V [openiban] (14) RO65PORL4435312861931963
V [openiban] (15) SA7439228561548156293899
V [openiban] (16) AE560335386651248739596
V [openiban] (17) ES7401283747341413374686

openiban got 8 / 18 IBANs wrong

catileptic, Aug 02 '23 12:08