smart_open

expected output of write() method

Open thoroc opened this issue 4 years ago • 1 comments

Problem description

I expected write() to return a value when writing to the local filesystem, as it does with open() from the standard library. This was raised before (see: https://github.com/RaRe-Technologies/smart_open/issues/231), but my tests show that write() still returns None.
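For reference, here is a minimal sketch of the standard library behaviour I am comparing against. It uses only the built-in open(); the file name and sample string are made up for illustration. In text mode write() returns the number of characters written, and in binary mode it returns the number of bytes, so the two differ when the text contains non-ASCII characters.

```python
import os
import tempfile

# Sample text with two non-ASCII characters (em dash, curly apostrophe).
text = "unknown unknowns\u2014the ones we don\u2019t know"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Text mode: write() returns the number of characters written.
    with open(path, mode="w", encoding="utf-8") as fo:
        chars_written = fo.write(text)

    # Binary mode: write() returns the number of bytes written.
    with open(path, mode="wb") as fo:
        bytes_written = fo.write(text.encode("utf-8"))

print(chars_written == len(text))
print(bytes_written == len(text.encode("utf-8")))
```

This means the "Byte Count" label in the script below is really a character count, since all files are opened in text mode.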

Steps/code to reproduce the problem

Run the following script. It writes the same content three ways: with pathlib, with the built-in open(), and with smart_open's open().

from pathlib import Path
import smart_open as so

# declaring variables
FILE_CONTENT = """As we know, there are known knowns; there are things we know we know. 
We also know there are known unknowns; that is to say we know there are some things we do not know. 
But there are also unknown unknowns—the ones we don’t know we don’t know."""
DIR_PATH = 'file_io'
FILE_INPUT = 'input.txt'
FILE_OUTPUT = 'output.txt'

# create the input/output dir
Path(DIR_PATH).mkdir(exist_ok=True)

# (re)create input file
pi = Path(DIR_PATH, FILE_INPUT)
if pi.exists():
    pi.unlink()

pi.touch()

print('\nCreate input file with content\n')

# write_text() opens, writes and closes the file itself
pi.write_text(FILE_CONTENT)

text = pi.read_text()

for line in text.splitlines():
    print(line)

print(f'Text length={len(text)}')

# create output file
po = Path(DIR_PATH, FILE_OUTPUT)
if po.exists():
    po.unlink()

po.touch()

print('\nUsing pathlib')

# use a context manager so the file is flushed and closed deterministically
with po.open(mode='a') as fo:
    fo.write('PathLib\n\n')
    byte_count = fo.write(pi.read_text())

print(f'Byte Count={byte_count}')

print('\nUsing open')

byte_count = 0
with open(po, mode='a') as fo:
    fo.write('\n\nStd Open\n\n')
    with open(pi, mode='r') as fi:
        for line in fi.readlines():
            byte_count += fo.write(line)

print(f'Byte Count={byte_count}')

print('\nUsing smart_open')

byte_count = 0
with so.open(po, mode='a') as fo:
    fo.write('\n\nSmart-Open\n\n')
    with so.open(pi, mode='r') as fi:
        for line in fi.readlines():
            fo.write(line)
            # workaround: fo.write() returns None here, so count the characters ourselves
            byte_count += len(line)

print(f'Byte Count={byte_count}')
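The workaround in the smart_open loop above could be factored into a small wrapper. This is only a sketch: `CountingWriter` and `NoneReturningWriter` are hypothetical names I made up, not part of smart_open, and the second class merely stands in for a writer whose write() returns None so that the example runs without smart_open installed.

```python
import io


class CountingWriter:
    """Hypothetical helper: wraps a writable file-like object and tallies what was written."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self.count = 0

    def write(self, data):
        written = self._fileobj.write(data)
        # Fall back to len(data) when the underlying write() returns None.
        if written is None:
            written = len(data)
        self.count += written
        return written


class NoneReturningWriter(io.StringIO):
    """Stand-in for a writer whose write() returns None (the behaviour reported here)."""

    def write(self, data):
        super().write(data)
        return None


lines = ["first line\n", "second line\n"]
sink = NoneReturningWriter()
writer = CountingWriter(sink)
for line in lines:
    writer.write(line)

print(writer.count)
```

Assuming the wrapped object accepts the same arguments as a regular file, the wrapper could be placed around the object returned by so.open() in the same way, but it would be nicer if write() returned the count directly.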

Versions

smart-open==1.9.0

Output


Create input file with content

As we know, there are known knowns; there are things we know we know. 
We also know there are known unknowns; that is to say we know there are some things we do not know. 
But there are also unknown unknowns—the ones we don’t know we don’t know.
Text length=245

Using pathlib
Byte Count=245

Using open
Byte Count=245

Using smart_open
Byte Count=245

Checklist

Before you create the issue, please make sure you have:

  • [x] Described the problem clearly
  • [x] Provided a minimal reproducible example, including any required data
  • [x] Provided the version numbers of the relevant software

thoroc, Feb 26 '20 14:02

Thank you for creating such a well-documented issue. Are you able to make a PR?

mpenkov, Mar 08 '20 07:03