opentaxforms

New IRS transcripts are xhtml and so much more parseable than PDFs

Open llorracc opened this issue 4 years ago • 1 comments

In principle, it has been possible for several years to get a "transcript" of basically everything the IRS knows (that is, has received electronically) about your tax info.

In practice, the system seems to have been created mostly for the convenience of tax preparation professionals; having created it, the IRS apparently felt they had to make it at least possible for taxpayers to get their own information. But the process was so painful (the IRS invented its own security protocols, like checking whether an address you provided matched something they could download from Equifax) that nobody did it.

Since last year (I don't know exactly when), they have modernized the system. Not only have they moved to standard 2FA methods, but they have also made the resulting documents XHTML rather than (bitmapped) PDFs, so it should now be possible for someone ambitious to use standard Python tools to calculate your taxes for you, without the pain of TurboTax and its competitors.
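As a rough illustration of the "standard Python tools" idea, here is a minimal sketch that pulls label/amount pairs out of an XHTML document using only the standard library. The sample markup, the field labels, and the helper names `parse_amount` and `extract_fields` are all assumptions made up for this example; the actual layout of an IRS transcript may well differ.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical transcript fragment. The real IRS layout is not known here,
# so this table structure and these labels are assumptions for illustration.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><td>Wages, tips and other compensation</td><td>$52,000.00</td></tr>
      <tr><td>Federal income tax withheld</td><td>$6,100.00</td></tr>
    </table>
  </body>
</html>"""

# XHTML elements live in this namespace, so lookups must be namespace-qualified.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def parse_amount(text):
    """Convert a string such as '$52,000.00' to a Decimal."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def extract_fields(xhtml_text):
    """Return {label: amount} for every table row with exactly two cells."""
    root = ET.fromstring(xhtml_text)
    fields = {}
    for row in root.iterfind(".//x:tr", XHTML_NS):
        cells = ["".join(td.itertext()).strip()
                 for td in row.iterfind("x:td", XHTML_NS)]
        if len(cells) == 2:
            fields[cells[0]] = parse_amount(cells[1])
    return fields


fields = extract_fields(SAMPLE_XHTML)
for label, amount in fields.items():
    print(f"{label}: {amount}")
```

Because XHTML is well-formed XML, a plain XML parser is enough here, which is exactly what a bitmapped PDF cannot offer. One caveat: `xml.etree.ElementTree` rejects undefined entities such as `&nbsp;`, so a real document might need preprocessing or a more lenient parser.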

Maybe this will (finally) get the US to a point where you can feed your tax info (downloaded directly from the IRS) into an open-source package that will calculate your taxes for you (up to the same degree of accuracy the IRS can obtain, because you have the same info that they do).

llorracc, Jul 15 '20 02:07

Thanks for this info; I was not aware. I found that all of my available transcript PDFs are bitmapped, maybe because I don't e-file.

jsaponara, Aug 11 '21 08:08