
Writing an array of quantities?

Open rieder opened this issue 8 years ago • 9 comments

If I have an array of quantities y:

print(y)
quantity<[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] kms>

and want to save it to disk, how do I do this while preserving units?

If I save it as a numpy array, it saves the individual elements as quantities, which is not what I want:

np.save('y.npy', y)
y = np.load('y.npy')
print(y)
array([quantity<0.0 kms>, quantity<0.0 kms>, quantity<0.0 kms>,
       quantity<0.0 kms>, quantity<0.0 kms>, quantity<0.0 kms>,
       quantity<0.0 kms>, quantity<0.0 kms>], dtype=object)

This way, the file also becomes much larger than it needs to be...

Is there an Amuse way of saving arrays so that they can be retrieved correctly, without taking much more space than a 'regular' numpy array of scalars?

rieder avatar Feb 19 '18 17:02 rieder
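
The size difference described in the question can be reproduced without AMUSE. The sketch below is an illustration only: it uses (value, unit name) tuples as a hypothetical stand-in for quantity objects, and in-memory buffers instead of files. It relies on NumPy storing object arrays by pickling each element, and on np.load needing allow_pickle=True to read such a file back in current NumPy releases.

```python
import io

import numpy as np


def saved_size(arr):
    """Return the number of bytes np.save writes for arr."""
    buf = io.BytesIO()
    np.save(buf, arr)
    return buf.getbuffer().nbytes


n = 1000

# Plain float64 array: raw data plus a small .npy header.
plain = np.arange(n, dtype=float)

# Object array: every element is a separate Python object.
# The tuples are only a stand-in for AMUSE quantities (assumption).
boxed = np.empty(n, dtype=object)
for i in range(n):
    boxed[i] = (float(i), "kms")

plain_size = saved_size(plain)
boxed_size = saved_size(boxed)
print(plain_size, boxed_size)

# Reading an object array back requires opting in to pickle.
buf = io.BytesIO()
np.save(buf, boxed)
buf.seek(0)
restored = np.load(buf, allow_pickle=True)
print(restored.dtype)
```

The restored array still has dtype=object, which matches the behaviour shown in the question: the elements come back one by one rather than as a single numeric array with one unit.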

Hi Steven,

Quantities can be pickled, so if you are working with strings:

import pickle

S = pickle.dumps(y)
print(S)
x = pickle.loads(S)
print(x)

You could also use some of the internal functions in amuse to make a custom format. If you want, I can make an example for that as well...

arjenve avatar Feb 22 '18 12:02 arjenve

Hi Arjen, that would be very helpful. Saving pickled data helps restore the array correctly, but the file is unfortunately still much larger than the unit-less file...

rieder avatar Feb 23 '18 13:02 rieder

The most efficient way to store seems to be to save the data unit-less, and then manually re-add the unit. This is exactly what I want to prevent, since it introduces the risk of not using correct units after loading the data...

rieder avatar Feb 23 '18 13:02 rieder

The problem with pickle seems to be that a pickled numpy array is quite big compared to raw binary; the overhead from the unit itself is small.

ipelupessy avatar Feb 23 '18 19:02 ipelupessy

On the other hand, you can use pickle with a binary protocol (1 or 2):

S = pickle.dumps(y, 1)

This is quite compact: 8726 bytes for an array of 1000 doubles with units.kms.

ipelupessy avatar Feb 23 '18 19:02 ipelupessy
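
The overhead of a binary pickle can be checked with a short measurement. This is a sketch under assumptions: it pickles a plain NumPy array together with a unit name (a stand-in for an AMUSE quantity, which is not imported here) and uses pickle.HIGHEST_PROTOCOL rather than the protocol numbers quoted in the thread, so the byte count will not match the 8726 figure exactly.

```python
import pickle

import numpy as np

# 1000 doubles = 8000 bytes of raw data.
values = np.random.rand(1000)

# Stand-in for a quantity: the numbers plus the unit name (assumption).
payload = (values, "kms")

blob = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# Everything beyond the raw array bytes is pickle bookkeeping.
overhead = len(blob) - values.nbytes
print(len(blob), overhead)

# The round trip restores both the numbers and the unit name.
restored_values, restored_unit = pickle.loads(blob)
```

The printed overhead is the cost of the pickle framing, the array metadata and the unit name combined.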

Is this solution (pickle with a binary protocol) OK? If so, we can close the issue.

ipelupessy avatar Mar 07 '18 08:03 ipelupessy

I don't think this is solved. The binary protocol makes the file more compact and properly restorable, but:

  • the file is still larger, compared with the 'plain' array, than I would expect, and
  • it is much more complicated than simply doing np.save and np.load operations.

Maybe we should write an Amuse-aware version of these functions...

rieder avatar Mar 09 '18 03:03 rieder

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 365 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Nov 18 '25 21:11 stale[bot]

I don't know if this is a good solution, but what about np.savez? From my test, it seems like the file size is the same as np.save.

quantity = np.random.rand(1000000) | units.kms
np.savez('y.npz', number=quantity.number, unit=str(quantity.unit))

Accessing the data:

y = np.load('y.npz')
data = y['number']
unit = y['unit']

elkogerville avatar Nov 27 '25 10:11 elkogerville
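
The np.savez approach can be wrapped in a pair of helpers so that the unit always travels with the numbers. This is a minimal sketch, not an AMUSE API: save_with_unit and load_with_unit are hypothetical names, and they take the bare array and the unit name (quantity.number and str(quantity.unit) in the snippet above). Turning the returned unit name back into an AMUSE unit is left out, since the thread does not show how to do that.

```python
import io

import numpy as np


def save_with_unit(target, number, unit_name):
    """Write the numeric array and its unit name into one .npz archive."""
    np.savez(target, number=np.asarray(number), unit=np.asarray(unit_name))


def load_with_unit(source):
    """Return (numeric array, unit name) from an archive written above."""
    with np.load(source) as data:
        number = data["number"]
        # The unit comes back as a 0-d string array; .item() unwraps it.
        unit_name = data["unit"].item()
    return number, unit_name


# Round trip through an in-memory buffer; a filename such as 'y.npz' works too.
buf = io.BytesIO()
values = np.random.rand(1000)
save_with_unit(buf, values, "kms")
buf.seek(0)

number, unit_name = load_with_unit(buf)
print(number.dtype, unit_name)
```

Because the numbers are stored as a regular float array and the unit as a short string, no pickling is involved and the archive stays close to the size of the plain array.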