fast_xbrl_parser
[BUG] Example cannot be reproduced: "dimensions" output is missing data
Let's take the link given in the official example:
import fast_xbrl_parser as fxp
url = "https://www.sec.gov/Archives/edgar/data/1326380/000132638021000129/gme-20211030_htm.xml"
xbrl_dict = fxp.parse(
url,
output=["json", "facts", "dimensions",],
email="...", ### Adjust this to reflect your email address. This is required by the SEC Edgar system when passing a URL.
)
print([
fact
for fact in xbrl_dict["json"]
if fact["context_ref"] == "i12efa46789384fd39ac941da8b7b0f1a_D20200202-20201031"
])
Output:
[{'id': 'id3VybDovL2RvY3MudjEvZG9jOjUzYmQ2ZjI2MzlhMDQ4NjFiYWIwZTVhM2Q2OWFhZjUzL3NlYzo1M2JkNmYyNjM5YTA0ODYxYmFiMGU1YTNkNjlhYWY1M18xMTI0L2ZyYWc6NmJiZGRiOTdhMTExNDdlMmI5ZTc3NDk0N2Y3NGRjOWYvdGV4dHJlZ2lvbjo2YmJkZGI5N2ExMTE0N2UyYjllNzc0OTQ3Zjc0ZGM5Zl8yNDAz_bae81877-b251-4a3e-a936-b94169023ae8',
'prefix': 'us-gaap',
'name': 'DisposalGroupIncludingDiscontinuedOperationGeneralAndAdministrativeExpense',
'value': '1200000',
'decimals': '-5',
'context_ref': 'i12efa46789384fd39ac941da8b7b0f1a_D20200202-20201031',
'unit_ref': 'usd',
'dimensions': [{'key_ns': 'us-gaap',
'key_value': 'DisposalGroupClassificationAxis',
'member_ns': 'us-gaap',
'member_value': 'DiscontinuedOperationsDisposedOfBySaleMember'},
{'key_ns': 'us-gaap',
'key_value': 'IncomeStatementBalanceSheetAndAdditionalDisclosuresByDisposalGroupsIncludingDiscontinuedOperationsAxis',
'member_ns': 'gme',
'member_value': 'SpringMobileMember'}],
'units': [{'unit_type': 'unit', 'unit_value': 'iso4217:USD'}],
'periods': [{'period_type': 'startDate', 'period_value': '2020-02-02'},
{'period_type': 'endDate', 'period_value': '2020-10-31'}]},
{'id': 'id3VybDovL2RvY3MudjEvZG9jOjUzYmQ2ZjI2MzlhMDQ4NjFiYWIwZTVhM2Q2OWFhZjUzL3NlYzo1M2JkNmYyNjM5YTA0ODYxYmFiMGU1YTNkNjlhYWY1M18xMTI0L2ZyYWc6NmJiZGRiOTdhMTExNDdlMmI5ZTc3NDk0N2Y3NGRjOWYvdGV4dHJlZ2lvbjo2YmJkZGI5N2ExMTE0N2UyYjllNzc0OTQ3Zjc0ZGM5Zl8yNDEw_29acdbfa-37af-4a68-93de-33d588e14dc2',
'prefix': 'us-gaap',
'name': 'DiscontinuedOperationTaxEffectOfDiscontinuedOperation',
'value': '-300000',
'decimals': '-5',
'context_ref': 'i12efa46789384fd39ac941da8b7b0f1a_D20200202-20201031',
'unit_ref': 'usd',
'dimensions': [{'key_ns': 'us-gaap',
'key_value': 'DisposalGroupClassificationAxis',
'member_ns': 'us-gaap',
'member_value': 'DiscontinuedOperationsDisposedOfBySaleMember'},
{'key_ns': 'us-gaap',
'key_value': 'IncomeStatementBalanceSheetAndAdditionalDisclosuresByDisposalGroupsIncludingDiscontinuedOperationsAxis',
'member_ns': 'gme',
'member_value': 'SpringMobileMember'}],
'units': [{'unit_type': 'unit', 'unit_value': 'iso4217:USD'}],
'periods': [{'period_type': 'startDate', 'period_value': '2020-02-02'},
{'period_type': 'endDate', 'period_value': '2020-10-31'}]}]
Actual data
import pandas as pd
dimensions_df = pd.DataFrame(xbrl_dict["dimensions"])
print(dimensions_df[
dimensions_df.context_ref == "i12efa46789384fd39ac941da8b7b0f1a_D20200202-20201031"
])
Output:
cik accession_number xml_name \
54 0001326380 000132638021000129 gme-20211030_htm
context_ref axis_prefix \
54 i12efa46789384fd39ac941da8b7b0f1a_D20200202-20... us-gaap
axis_tag member_prefix \
54 DisposalGroupClassificationAxis us-gaap
member_tag
54 DiscontinuedOperationsDisposedOfBySaleMember
Issue 1: missing rows
The row for IncomeStatementBalanceSheetAndAdditionalDisclosuresByDisposalGroupsIncludingDiscontinuedOperationsAxis (member gme:SpringMobileMember) is entirely missing.
Issue 2: missing columns
The 'member_ns': 'us-gaap' and 'member_value': 'DiscontinuedOperationsDisposedOfBySaleMember' fields are entirely missing.
The documentation example uses different column names, member_prefix and member_tag. The mismatched column names may be the culprit.
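To make the comparison concrete, here is a minimal sketch that diffs the dimensions attached to the "json" facts against the rows of the "dimensions" output for the context above. It only uses values copied from the two outputs in this report and does not call the parser. The JSON_TO_TABLE mapping between the two naming schemes (key_ns to axis_prefix, and so on) is my assumption, not something the documentation confirms.

```python
# Values copied from the two outputs in this report; nothing is fetched or parsed here.
CONTEXT_REF = "i12efa46789384fd39ac941da8b7b0f1a_D20200202-20201031"

# Dimensions attached to the facts in xbrl_dict["json"] for this context.
json_dimensions = [
    {
        "key_ns": "us-gaap",
        "key_value": "DisposalGroupClassificationAxis",
        "member_ns": "us-gaap",
        "member_value": "DiscontinuedOperationsDisposedOfBySaleMember",
    },
    {
        "key_ns": "us-gaap",
        "key_value": "IncomeStatementBalanceSheetAndAdditionalDisclosuresByDisposalGroupsIncludingDiscontinuedOperationsAxis",
        "member_ns": "gme",
        "member_value": "SpringMobileMember",
    },
]

# The single row found in xbrl_dict["dimensions"] for the same context.
dimension_rows = [
    {
        "context_ref": CONTEXT_REF,
        "axis_prefix": "us-gaap",
        "axis_tag": "DisposalGroupClassificationAxis",
        "member_prefix": "us-gaap",
        "member_tag": "DiscontinuedOperationsDisposedOfBySaleMember",
    },
]

# Assumed correspondence between the two naming schemes (a guess, not documented).
JSON_TO_TABLE = {
    "key_ns": "axis_prefix",
    "key_value": "axis_tag",
    "member_ns": "member_prefix",
    "member_value": "member_tag",
}

# Normalise both sides to (axis prefix, axis tag, member prefix, member tag) tuples.
expected = {tuple(dim[key] for key in JSON_TO_TABLE) for dim in json_dimensions}
actual = {tuple(row[col] for col in JSON_TO_TABLE.values()) for row in dimension_rows}

# Axis/member pairs present in "json" but absent from "dimensions".
missing = sorted(expected - actual)
for pair in missing:
    print(pair)
```

With the data above, the only pair reported as missing is the gme:SpringMobileMember one. Under the assumed mapping the member columns do match, so the difference between the two outputs comes down to the missing row.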
Thanks for the bug report! It has been a while since I've worked on this package, so it'll take me a bit longer to get to the bottom of this. But I'll have a look and get back to you.