
LightCurve.from_pandas() does not return a LightCurve

Open bmorris3 opened this issue 3 years ago • 6 comments

Thanks so much for developing this incredible tool.

Problem description

The current implementation of LightCurve.from_pandas() does not return a LightCurve, but an astropy.timeseries.sampled.TimeSeries object.

Example

import pandas as pd
from astropy.time import Time
import lightkurve as lk

df = pd.read_pickle('data/example.pkl')
# Correct TESS timestamps with JD offset
df.index = pd.DatetimeIndex(
    Time(df.index + 2457000, format='jd').datetime
)

# I had expected this next line to return a LightCurve:
astropy_timeseries = lk.LightCurve.from_pandas(df)

# To get a LightCurve object, you need to call LightCurve a second time:
lc = lk.LightCurve(astropy_timeseries)

Expected behavior

The last line in the example above shouldn't (?) be necessary.
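One plausible cause of this kind of behavior, sketched below with hypothetical stand-in classes (`Base`, `Child`, and both `from_rows*` methods are invented for illustration and are not lightkurve or astropy code), is an inherited alternate constructor that names the parent class explicitly instead of using `cls`:

```python
class Base:
    """Hypothetical stand-in for a parent time-series class."""

    def __init__(self, values):
        self.values = list(values)

    @classmethod
    def from_rows_hardcoded(cls, rows):
        # Names the parent class explicitly, so a subclass gets a Base back.
        return Base(rows)

    @classmethod
    def from_rows(cls, rows):
        # Uses cls, so the result has the class the method was called on.
        return cls(rows)


class Child(Base):
    """Hypothetical stand-in for a subclass such as a light curve."""


hardcoded = Child.from_rows_hardcoded([1, 2, 3])
via_cls = Child.from_rows([1, 2, 3])

print(type(hardcoded).__name__)
print(type(via_cls).__name__)
```

If the real cause is similar, calling the constructor on the subclass would return the parent type, which matches the symptom reported here. Whether that is what happens inside from_pandas() is an assumption, not something this thread confirms.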

Environment

  • platform (e.g. Linux, OSX, Windows): OS X
  • lightkurve version (e.g. 1.0b6): 2.0.9
  • installation method (e.g. pip, conda, source): pip

bmorris3 avatar May 06 '21 13:05 bmorris3

Hi there! 👋 Thank you for opening your first Lightkurve issue! 🙏 One of our maintainers will get back to you as soon as possible. 👩‍🚀 You can expect a response within 7 days. 📅 If you haven’t heard anything by then, feel free to ping this thread. 🛎️ We love that you are using Lightkurve and appreciate your feedback. 👍

github-actions[bot] avatar May 06 '21 13:05 github-actions[bot]

Oops, that's definitely a bug. Thank you for taking the time to report this @bmorris3!

barentsen avatar May 07 '21 16:05 barentsen

Additionally, we should be able to take care of the BTJD conversion, something like the following:

# given a df
lc = lk.TessLightCurve.from_pandas(df)

@bmorris3 In your use case, what is the source of the DataFrame? Does it come from a LightCurve.to_pandas() export, or from some other source?

I'm asking because, if the DataFrame comes from LightCurve.to_pandas(), we could conceivably enhance the logic further so that from_pandas() detects its format automatically (without explicitly specifying TessLightCurve):

# given a df
lc = lk.LightCurve.from_pandas(df)  # detects that the time is in BTJD and returns a TessLightCurve

orionlee avatar May 07 '21 17:05 orionlee

@orionlee That's correct, the dataframe here came from LightCurve.to_pandas(), and this was a "round-trip exercise" which didn't quite work out. It would be super neat if the logic could be improved so my dataframe hacking wasn't necessary!

bmorris3 avatar May 12 '21 22:05 bmorris3

@bmorris3 I wonder whether the round-trip exercise was about exploring the API, or whether it has a concrete use case.

Conceivably, we could make it work by changing LightCurve.to_pandas() so that the resulting dataframe carries some additional metadata (e.g., that it's from TESS data). LightCurve.from_pandas() could then use that metadata to set the proper time scale and return the appropriate LightCurve subclass.

The change would be a tad involved: technically, safely attaching additional metadata to the dataframe would require us to create a special subclass of pandas DataFrame. It can also be argued that the work (attaching metadata) belongs upstream, at the astropy table level.

orionlee avatar May 13 '21 19:05 orionlee

My two cents: it's not too much work to manage the time coordinate manually, and it will likely prevent any accidental misinterpretations of time coordinates in DataFrames, so I don't mind the extra line of code to get the times correct. That said, I don't want to cramp your style, since lightkurve is so incredibly user friendly.
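For reference, the manual time handling amounts to adding the 2457000 offset used in the example at the top of this issue. Below is a minimal standard-library sketch of that arithmetic. The function name `btjd_to_datetime` is hypothetical, and the sketch treats the Julian Date as UTC, ignoring the difference between the TDB and UTC time scales, so it is only an approximation of what astropy's Time does.

```python
from datetime import datetime, timedelta, timezone

BTJD_OFFSET = 2457000.0    # BTJD = JD - 2457000, as in the example above
UNIX_EPOCH_JD = 2440587.5  # Julian Date of 1970-01-01T00:00:00
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def btjd_to_datetime(btjd):
    """Convert a BTJD value to a datetime, treating the JD as UTC.

    Assumption: time-scale differences (TDB vs. UTC) are ignored here.
    """
    jd = btjd + BTJD_OFFSET
    return UNIX_EPOCH + timedelta(days=jd - UNIX_EPOCH_JD)


# BTJD 0 corresponds to JD 2457000.0
print(btjd_to_datetime(0.0).isoformat())
```

Keeping the offset as an explicit, named constant is in the spirit of the comment above: the time convention stays visible in user code rather than being inferred from the DataFrame.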

bmorris3 avatar May 17 '21 15:05 bmorris3