root_pandas

Flattening two branches produces only one `__array_index`

Open natfarleydev opened this issue 8 years ago • 5 comments

I have a tree with DecayTreeFitter. When I retrieve two array variables and use the flatten=True option of the read_root function, the resulting dataframe contains only a single __array_index column, so it's not possible to tell which index belongs to which column.

Side note: Addressing #28 would fix this; I may look into implementing it.

natfarleydev avatar Aug 19 '16 10:08 natfarleydev

From examining the tests, it looks like using flatten=True is deprecated (please correct me if I'm wrong) and should produce a warning.

For some reason the warning does not come up for me.

natfarleydev avatar Aug 19 '16 10:08 natfarleydev

Only one __array_index is produced at the moment because we only support flattening arrays of the same length, in which case __array_index is valid for both variables at the same time. It's strange that it doesn't give you a warning. Pinging @KonstantinSchubert.

I should also update the documentation to reflect the changed API.
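
To illustrate why a single index column is enough for equal-length arrays, here is a minimal pure-Python sketch. It is not the root_pandas implementation; `flatten_rows`, the row layout, and the column names `event`, `mass` and `chi2` are hypothetical and chosen only for the example.

```python
def flatten_rows(rows, array_columns):
    """Emit one output row per array element, with a shared __array_index.

    Illustrative only: assumes every column in `array_columns` holds a list
    and that all of them have the same length within one entry.
    """
    flat = []
    for row in rows:
        lengths = {len(row[name]) for name in array_columns}
        if len(lengths) > 1:
            raise ValueError("array columns differ in length within one entry")
        n = lengths.pop() if lengths else 0
        for i in range(n):
            # Copy the scalar columns unchanged.
            out = {k: v for k, v in row.items() if k not in array_columns}
            # Take element i from every array column.
            for name in array_columns:
                out[name] = row[name][i]
            # One index describes the position in all flattened arrays.
            out["__array_index"] = i
            flat.append(out)
    return flat


# Hypothetical data: two entries, each with two equal-length arrays.
rows = [
    {"event": 1, "mass": [5.1, 5.3], "chi2": [0.9, 1.4]},
    {"event": 2, "mass": [5.2], "chi2": [2.0]},
]
flat = flatten_rows(rows, ["mass", "chi2"])
```

Because `mass[i]` and `chi2[i]` always end up in the same output row, the one `__array_index` value applies to both columns.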

ibab avatar Aug 19 '16 10:08 ibab

I will have a look tonight.

KonstantinSchubert avatar Aug 19 '16 10:08 KonstantinSchubert

After looking at the code, it looks like the warning is not in the current pip version. Should be fixed as soon as the new version is pushed :)

Re: __array_index flattening, I think updating the docs is enough to fix this problem for now, but #28 would be a much nicer way to fix it imo.

natfarleydev avatar Aug 19 '16 10:08 natfarleydev

@nasfarley88: The functionality suggested in #28 doesn't really overlap with the flatten kwarg, because flatten is mainly designed to work with variable-length arrays, which #28 can't handle.

Btw, in the current pip version, we're still using flatten=True/False, so the documentation is currently up to date for that. But we should be clearer about what happens when you flatten multiple arrays. I'll also make a new release soon, because the list argument to flatten fixes several important issues. For example, flatten=True/False is too greedy: it tries to flatten all arrays, regardless of whether they are compatible in length, which leads to an error.
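
As a rough sketch of the difference between greedy flattening and flattening a named list of columns, here is a pure-Python illustration. It does not call root_pandas; `flatten_selected`, the row layout, and the column names (`mass`, `chi2`, `hits`) are hypothetical.

```python
def flatten_selected(rows, flatten):
    """Flatten only the columns named in `flatten`; copy the rest as-is.

    Illustrative only: the named columns must share one length per entry.
    """
    flat = []
    for row in rows:
        lengths = {len(row[name]) for name in flatten}
        if len(lengths) != 1:
            raise ValueError("columns to flatten must share one length per entry")
        for i in range(lengths.pop()):
            out = dict(row)  # unflattened columns keep their full arrays
            for name in flatten:
                out[name] = row[name][i]
            out["__array_index"] = i
            flat.append(out)
    return flat


# Hypothetical entry: "hits" has a different length from "mass" and "chi2".
rows = [{"event": 1, "mass": [5.1, 5.3], "chi2": [0.9, 1.4], "hits": [7, 8, 9]}]

# Naming the compatible columns works and leaves "hits" untouched.
selected = flatten_selected(rows, ["mass", "chi2"])

# Flattening every array column, as a greedy flatten would, fails.
try:
    flatten_selected(rows, ["mass", "chi2", "hits"])
    greedy_failed = False
except ValueError:
    greedy_failed = True
```

In this sketch the caller decides which arrays belong together, so an unrelated array of a different length no longer triggers the error.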

ibab avatar Aug 19 '16 11:08 ibab

As has been explicitly stated in the README for a while, root_pandas, and root_numpy on which it depends, have been deprecated and effectively unmaintained for quite some time. We decided to close anything outstanding as "won't do" and archive the package at this point.

eduardo-rodrigues avatar Jan 09 '23 09:01 eduardo-rodrigues