smart_open

ResourceWarning: unclosed file

Open kenahoo opened this issue 4 years ago • 10 comments

Problem description

I'm seeing a ResourceWarning: unclosed file warning when using context managers to open files/streams with smart_open.

Note: if I don't run under unittest, or if I don't gzip the file, resources seem to be closed correctly. I'm guessing unittest and smart_open are somehow not coordinating correctly on closing the layers.
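One possible explanation for the unittest-only behavior, not confirmed in this report: Python's default warning filters ignore ResourceWarning, while unittest's runner enables warnings when no -W option is given. Under that assumption, the handle could be leaking in both cases and only be reported under unittest. The sketch below uses only the standard library (no smart_open); the helper names leak_file and count_resource_warnings are made up for illustration.

```python
import gc
import os
import tempfile
import warnings


def leak_file(path):
    """Open a file and drop the only reference without closing it."""
    handle = open(path, "rb")
    del handle
    gc.collect()  # CPython finalizes on del; collect() is a safety net


def count_resource_warnings(filter_action):
    """Count ResourceWarnings recorded while leaking one file handle."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with warnings.catch_warnings(record=True) as caught:
            # The filter added here takes precedence inside this block only.
            warnings.simplefilter(filter_action, ResourceWarning)
            leak_file(path)
        return sum(issubclass(w.category, ResourceWarning) for w in caught)
    finally:
        os.remove(path)


# "ignore" mirrors the default filter for ResourceWarning in a plain run;
# "always" approximates what a test runner that enables warnings would show.
hidden = count_resource_warnings("ignore")
shown = count_resource_warnings("always")
print(f"reported with 'ignore': {hidden}, reported with 'always': {shown}")
```

If this is what is happening, running the non-unittest script with python -W default::ResourceWarning should surface the same warning.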

Steps/code to reproduce the problem

My test script:

import unittest
from smart_open import open as smart_open


class MyTestCase(unittest.TestCase):
    def test_load(self):
        with smart_open('input.csv.gz') as fh:
            print("opened file")


unittest.main()

Invocation:

% echo -e 'col1,col2\nval1,val2\nval3,val4' | gzip > input.csv.gz 

% PYTHONTRACEMALLOC=1 python test_load.py
opened file
/Users/kwilliams/miniconda3/lib/python3.7/unittest/case.py:615: ResourceWarning: unclosed file <_io.BufferedReader name='input.csv.gz'>
  testMethod()
Object allocated at (most recent call last):
  File "/Users/kwilliams/git/dispatcher/rush-springs-simulations/venv/lib/python3.7/site-packages/smart_open/smart_open_lib.py", lineno 548
    fobj = io.open(parsed_uri.uri_path, mode)
.
----------------------------------------------------------------------
Ran 1 test in 0.007s

OK

Versions

Darwin-18.0.0-x86_64-i386-64bit
Python 3.7.3 (default, Mar 27 2019, 16:54:48) 
[Clang 4.0.1 (tags/RELEASE_401/final)]
smart_open 1.9.0

kenahoo avatar Nov 26 '19 20:11 kenahoo

I should add - I'm not sure whether the warning is correct and the filehandle isn't being closed properly, or it's a spurious warning.
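One way to check whether the warning is genuine, independent of smart_open: the standard library documents that closing a gzip.GzipFile does not close a fileobj that was passed in to it. The traceback above shows the raw handle being created with io.open before any decompression layer is applied, so a wrapper that closes only the gzip layer would leave that handle open. Whether smart_open 1.9.0 actually does this is an assumption, not something verified here. A minimal stdlib-only sketch (the file name sample.csv.gz is arbitrary):

```python
import gzip
import os
import tempfile

# Build a small gzip file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.csv.gz")
with gzip.open(path, "wb") as out:
    out.write(b"col1,col2\nval1,val2\n")

# Open the raw file ourselves and hand it to GzipFile, mimicking a
# wrapper that layers decompression on top of an existing handle.
raw = open(path, "rb")
with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
    data = gz.read()

# The context manager closed the gzip layer only; the raw handle that
# was passed in as fileobj is still open at this point.
raw_left_open = not raw.closed
print("raw handle still open after the with block:", raw_left_open)

# Closing it explicitly avoids the ResourceWarning.
raw.close()
os.remove(path)
os.rmdir(tmp_dir)
```

If the same pattern applies inside the library, the warning would be correct rather than spurious, and the fix would be to close the underlying handle when the outer wrapper is closed.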

kenahoo avatar Nov 26 '19 20:11 kenahoo

Hi, any thoughts on this?

kenahoo avatar Dec 04 '19 05:12 kenahoo

@kenahoo thanks for the clear and detailed report. I agree context managers should be closing handles, so that looks like a bug.

@mpenkov is busy ATM – any chance you could take a stab at this yourself?

piskvorky avatar Dec 04 '19 08:12 piskvorky

Hi @piskvorky - I'm afraid I probably won't be able to tackle this, mostly because I had a look at the guts of smart_open and I think I'm not up to the task at this point, but also because this is coming up in my "day job" and the deadlines are pretty tight, so I'm not able to commit the necessary time, at least in the short term.

kenahoo avatar Dec 10 '19 18:12 kenahoo

I had the same issue when opening a gz file: the same warning is shown when I run unit tests. I'm using smart-open==4.2.0 and Python 3.8.7.

The warning message:

/mnt/c/Users/<user>/workspace/codes/tests/testXX.py:242: ResourceWarning: unclosed <ssl.SSLSocket fd=5,
 family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('yyy.yyy.yyy.yyy', XXXX),
 raddr=('zzz.zzz.zzz.zzz', 8080)>
ResourceWarning: Enable tracemalloc to get the object allocation traceback

danielpazeto avatar Mar 02 '21 16:03 danielpazeto

Has anyone been able to fix this?

Pim-Claessens avatar Aug 03 '22 10:08 Pim-Claessens

Cannot reproduce on Linux with Python 3.10.6 and smart_open 6.1.0.

mpenkov avatar Aug 21 '22 12:08 mpenkov

I had the same issue when opening a file on S3: the same warning is shown when I run unit tests. My test script:

import smart_open
import unittest
 
class RunTest(unittest.TestCase):
    def test_load_pickle_s3(self):
        path = "s3://my_test_direcotry/test.pkl"
        with smart_open.open(path, "wb") as fh:
            print("open file")

    def test_load_pickle_local(self):
        path = "test.pkl"
        with smart_open.open(path, "wb") as fh:
            print("open file")

Package versions: smart_open[s3] 6.3.0, Python 3.9.16

art12-3ds avatar Feb 28 '23 09:02 art12-3ds

Are you able to work out the cause?

mpenkov avatar Feb 28 '23 11:02 mpenkov

Not at the moment

art12-3ds avatar Mar 01 '23 13:03 art12-3ds