py-xbrl

parse_ixbrl does not close the file it opens

Open cdm-analytics opened this issue 2 years ago • 4 comments

The function "parse_ixbrl" in xbrl/instance.py opens a file, binds the handle to the variable "instance_file", and never closes it.

I am looping through a large number of XBRL documents and hitting the system limit on open files.

As a work-around, I increased the system limit on open files. That helped but now I am running into memory issues.

I wonder if using "with" or explicitly closing the file would eliminate the issue I'm having.
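For reference, a minimal sketch of what a "with" block guarantees, independent of py-xbrl: the handle is closed as soon as the block exits, even if reading raises. The temporary file and the name `demo_path` are assumptions made for illustration only; whether this alone resolves the limit in parse_ixbrl depends on the rest of that function.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical data).
fd, demo_path = tempfile.mkstemp(suffix=".xhtml")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write("<html/>")

# The handle is closed automatically when the block exits,
# so it does not count against the open-file limit afterwards.
with open(demo_path, "r", encoding="utf-8") as instance_file:
    contents = instance_file.read()

was_closed = instance_file.closed  # the context manager closed the handle

os.remove(demo_path)
```

An explicit `instance_file.close()` has the same effect on the success path; the difference is that `with` also closes the handle when an exception interrupts the read.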

Great module! Thanks!

cdm-analytics, Nov 22 '23 18:11

OK, I think I fixed both problems I was having with the following code change to the "parse_ixbrl" function in xbrl/instance.py:

    instance_file = open(instance_path, "r", encoding=encoding)
    contents = instance_file.read()
    instance_file.close()
    pattern = r'<[ ]*script.*?\/[ ]*script[ ]*>'
    contents = re.sub(pattern, '', contents, flags=(re.IGNORECASE | re.MULTILINE | re.DOTALL))
    with StringIO(contents) as contents_object:
        root: ET.ElementTree = parse_file(contents_object)

The "close" line fixed the issue I had with opening too many files and the "with" code fixed the issue I had with maxing out my system's memory.
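As a small illustration of the second half of that change: leaving a "with StringIO(...)" block closes the in-memory buffer, after which it can no longer be read. This sketch only demonstrates that standard-library behaviour; it is not a measurement of py-xbrl's memory use, and whether it fully explains the improvement reported here is an assumption.

```python
from io import StringIO

with StringIO("<html></html>") as contents_object:
    text = contents_object.getvalue()

# After the block, the buffer is closed and its contents are discarded.
closed_after = contents_object.closed

try:
    contents_object.read()
    read_failed = False
except ValueError:
    # Reading a closed StringIO raises ValueError.
    read_failed = True
```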

cdm-analytics, Nov 22 '23 22:11

Ah, so you added the with block, right? I will investigate this for the next major release. This is probably also relevant for the parse_xbrl function, not just parse_ixbrl. Thanks for the suggestion.

manusimidt, Nov 26 '23 07:11

I also added the instance_file.close() line, which could alternatively be done with a with block.

Thanks for the follow up! It's a great module that has been really helpful to me.

cdm-analytics, Nov 26 '23 20:11

@cdm-analytics how about using with for the file as well, instead of calling instance_file.close()?
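A rough sketch of what that could look like, using the same regex as the snippet above. This is not the library's code: `load_without_scripts` is a hypothetical helper name, and the standard library's `ET.parse` stands in for py-xbrl's own `parse_file`, so treat it as an outline under those assumptions.

```python
import os
import re
import tempfile
import xml.etree.ElementTree as ET
from io import StringIO

# Same pattern as in the snippet posted earlier in this thread.
SCRIPT_PATTERN = r'<[ ]*script.*?\/[ ]*script[ ]*>'


def load_without_scripts(instance_path, encoding="utf-8"):
    """Hypothetical helper: read a file, strip <script> blocks, parse the rest."""
    # The file handle is closed as soon as this block exits.
    with open(instance_path, "r", encoding=encoding) as instance_file:
        contents = instance_file.read()
    contents = re.sub(SCRIPT_PATTERN, '', contents,
                      flags=(re.IGNORECASE | re.MULTILINE | re.DOTALL))
    # The in-memory buffer is closed once parsing has finished.
    with StringIO(contents) as contents_object:
        # Assumption: ET.parse stands in for py-xbrl's parse_file.
        return ET.parse(contents_object)


# Self-contained demo on a throwaway file (hypothetical content).
fd, sample_path = tempfile.mkstemp(suffix=".xhtml")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write("<html><head><script>var a = 1;</script></head>"
              "<body><p>hi</p></body></html>")

tree = load_without_scripts(sample_path)
os.remove(sample_path)
```

With both resources managed by with blocks, neither needs an explicit close() call, and both are released even if re.sub or the parser raises.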

s-kust, Jan 07 '24 16:01