reciprocalspaceship
`stack_anomalous` inside `groupby` breaks `as_index=False`
When called inside a `groupby.apply` context, `stack_anomalous` overrides the `as_index=False` setting and adds the grouping column to the index of the returned dataset as a level named `None`:
import numpy as np
import reciprocalspaceship as rs
cell = [34., 45., 98., 90., 90., 90.]
spacegroup = 19
dmin = 4.
repeats = 10  # the output below shows repeat values 0 through 9
h,k,l = rs.utils.generate_reciprocal_asu(cell, spacegroup, dmin, anomalous=True).T
ds = None
for i in range(repeats):
    _ds = rs.DataSet({
        "H" : h,
        "K" : k,
        "L" : l,
        "I" : np.random.random(len(h)),
        "SIGI" : np.random.random(len(h)),
    }, cell=cell, spacegroup=spacegroup, merged=True).infer_mtz_dtypes()
    _ds['repeat'] = i
    if ds is not None:
        ds = rs.concat((ds, _ds))
    else:
        ds = _ds
ds = ds.set_index(['H', 'K', 'L'])
print(f"Before: {ds.index}")
# Somehow calling `stack_anomalous` overrides `as_index=False`
result = ds.groupby('repeat', as_index=False).apply(lambda x: x.stack_anomalous())
print(f"After: {result.index}")
which gives the following output:
Before: MultiIndex([(-8, -3, -5),
(-8, -3, -4),
(-8, -3, -3),
(-8, -3, -2),
(-8, -3, -1),
(-8, -2, -7),
(-8, -2, -6),
(-8, -2, -5),
(-8, -2, -4),
(-8, -2, -3),
...
( 8, 2, 4),
( 8, 2, 5),
( 8, 2, 6),
( 8, 2, 7),
( 8, 3, 0),
( 8, 3, 1),
( 8, 3, 2),
( 8, 3, 3),
( 8, 3, 4),
( 8, 3, 5)],
names=['H', 'K', 'L'], length=24470)
After: MultiIndex([(0, -8, -3, -5),
(0, -8, -3, -4),
(0, -8, -3, -3),
(0, -8, -3, -2),
(0, -8, -3, -1),
(0, -8, -2, -7),
(0, -8, -2, -6),
(0, -8, -2, -5),
(0, -8, -2, -4),
(0, -8, -2, -3),
...
(9, -8, -2, -3),
(9, -8, -2, -4),
(9, -8, -2, -5),
(9, -8, -2, -6),
(9, -8, -2, -7),
(9, -8, -3, -1),
(9, -8, -3, -2),
(9, -8, -3, -3),
(9, -8, -3, -4),
(9, -8, -3, -5)],
names=[None, 'H', 'K', 'L'], length=44610)
The `repeat` column still persists in the result dataset.

Is this a bug? Or is this just what pandas does when `groupby.apply` returns a DataFrame of a different length?
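
To help narrow this down, here is a minimal pandas-only sketch that does not use reciprocalspaceship at all. `double_rows` is a hypothetical stand-in for `stack_anomalous`: it only mimics the fact that the applied function returns more rows than it receives. If plain pandas also adds an unnamed leading index level here, the behavior probably comes from `groupby.apply` rather than from `stack_anomalous`. The `group_keys=False` variant is a possible workaround, not a confirmed fix, and the exact behavior may depend on the pandas version.

```python
import pandas as pd

# Toy frame: two "repeat" groups that share the same index values
df = pd.DataFrame(
    {"repeat": [0, 0, 1, 1], "I": [1.0, 2.0, 3.0, 4.0]},
    index=pd.Index([10, 11, 10, 11], name="H"),
)


def double_rows(group):
    # Hypothetical stand-in for stack_anomalous:
    # returns a frame with twice as many rows as the input group
    return pd.concat([group, group])


# Default group_keys=True: a per-group key level is added to the index
with_keys = df.groupby("repeat", as_index=False).apply(double_rows)

# group_keys=False: the original index is kept as-is
without_keys = df.groupby("repeat", as_index=False, group_keys=False).apply(double_rows)

print(f"group_keys=True:  {with_keys.index.names}")
print(f"group_keys=False: {without_keys.index.names}")
```

If this reproduces the extra level, passing `group_keys=False` to the `groupby` call in the example above may be worth trying with `stack_anomalous` as well.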