dimod icon indicating copy to clipboard operation
dimod copied to clipboard

`BQM.into_file` or similar

Open arcondello opened this issue 4 years ago • 4 comments

Right now this can be done with

with open(.., 'wb') as f:
    shutil.copyfileobj(bqm.to_file(), f)

but it would be nice to be able to just do

with open(.., 'wb') as f:
    bqm.into_file(f)

and avoid the creation and copy of the intermediate file.

arcondello avatar Jul 15 '21 18:07 arcondello

Alternatively, extend .to_file() to accept optional fd where it would write instead of to a SpooledTemporaryFile.

randomir avatar Jul 27 '21 22:07 randomir

I like the explicitness, plus it avoids the weirdness of

f = open('test.bqm')
f2 = bqm.to_file(f)
assert f is f2

though of course we do use that pattern everywhere.

One last thing, I might expect into_file to not seek back to the beginning, while to_file I would expect the current position to be the start.

Nothing that documentation cannot solve but :shrug:

arcondello avatar Jul 27 '21 23:07 arcondello

Relevant to this discussion, worth noting that unlike BQM.to_file(), json.dump does not seek back to the beginning. So

import io
import json

with io.StringIO() as f:
    json.dump({'a': 1}, f)
    print(f.tell())

prints 8. Whereas

import dimod

bqm = dimod.BQM.from_ising({'a': 1}, {})
with bqm.to_file() as f:
    print(f.tell())

prints 0.

On the one hand this inconsistency bothers me. On the other hand, we use it for nice syntax like https://github.com/dwavesystems/dwave-cloud-client/blob/b0d3182507d6309d3e531cf95f5b8e90ee6c06f3/dwave/cloud/coders.py#L397 which is nicer than needing to do

data = bqm.to_file()
data.seek(0)

before upload.

This becomes especially relevant soon for when we add the dual of https://github.com/dwavesystems/dimod/pull/958, CQM.to_lp_file() which I would likely expect to also seek back to the beginning.

arcondello avatar Nov 22 '21 23:11 arcondello

It's perfectly consistent, IMO.

json.dump takes an object and a file, and writes the object to that file. No reason it should rewind. Actually, I see a lot of reasons not to rewind -- user wants to get object length just dumped, user wants to write after the object, f.e. next object or next binary file section...

bqm.to_file is bqm's method that merely returns a "file view" into the bqm's binary representation (according to our serialization format) -- it does not actually write to disk/file, at least according to its API (the fact it might use disk for temporary storage is an implementation detail). I don't see a reason to start at the end. Actually, as your example shows, that would mean every time user wants to read the bqm, they need to rewind!


In light of the issue at hand, I still think naming this method as_file (as initially proposed in #599) would minimize confusion with a dump method (that we might introduce to mean into_file).

randomir avatar Nov 23 '21 13:11 randomir