pynapple
pynapple copied to clipboard
Mapping time arrays on each other
Hi, I am adopting the very convenient Tsd class in my code. I have this issue I encounter often when synching data: using a synching signal to map times from one set of data to the other.
Example: I digitise a 1Hz TTL pulse from two devices: my ephys recorder (30k Hz, interval t0-t1), and have results in the Tsd object TsdEphysSynchTTL
; and my experiment log signals recorder (1kHz, interval t2-t3), having the results in the Tsd object TsdIntanSynchTTL
. In principle knowing the sampling rates should be enough, in practice 2h long experiments are enough for the series to diverge up to couple hundreds ms by the end of the acquisition. So, I find TTLs in the two series, and I would like to use them to map event times from one timebase onto the other.
Is there already something like this in the project? I could not find it. I have reasons for preferring keeping this synch happening flexibly at runtime instead of solving synching once at the beginning and resaving data with an homogenous timebase, which would be the obvious alternative.
If there is not, there could be something like a SynchTs
subclass containing synching events that could implement this functionality, with a map_to()
method that could be used to map data to an other SynchTs
object provided that the two objects contain mappable events (eg, testing if there's same number of things and there's a reasonable linear fit between the two). If this could be useful, happy to contribute the feature
Hi Luigi, This sounds like a classic time alignment problem. Basically, if you have say acquisition systems 1 and 2, what you want is a an array of time values for the system 2 that is expressed in the same time reference as the 1st system. You can do this knowing a common clock indeed. Do we agree that the pseudo-code below is the kind of processing you do?
Assuming you have N1 and N2 samples from the two systems, acquired at sampling frequency f1 and f2 You can define two time value arrays: t1 = np.linspace(0, N1-1, 1/f1) t2 = np.linspace(0, N2-1, 1/f2) Assuming also that you have (hopefully) detected the same number of TTLs on the two systems, at times TTL1 and TTL2 (in the time reference of each system, using t1 and t2) then, you can simply do an interpolation: t2_1 = np.interp(TTL2,TTL1,t2) (note that it might be necessary to restrict t2 from the first to the last TTL times to prevent any erroneous interpolation beyond the bounds of the TTL data) t2_1 is the time of the 2nd system expressed in the time reference of the first one.
disclaimer, I'm writing this without testing it...
Now, the question is whether we should implement in Pynapple with map_to() as you suggest. That's an interesting idea. Let's discuss!
Indeed, very simple problem, that I end up solving a million times over again in a million places :D
Looking at it your pseudocode is exactly what I had in mind. There can be some minor refinements for whether eg events at the beginning/end should be disregarded if found in only one of the two arrays; or whether events have identities that could be matched (in my lab we use numerical barcodes instead of simple pulses); but those are just bells and whistles.
In terms of implementation / interface, this is the idea I would pitch:
A "timebase" A is represented by a SynchTs
object. I imagine the bare-bones of the SynchTs
object being something like this:
class SynchTs(Ts):
def __init__(self, *args, **kwargs):
... # nothing crucial
def _find_map_to_timebase(self, other_timebase):
# your pseudocode here, returning the interpolation coefficients, in case a new time interval to restrict signal
def map_to_other_timebase(self, other_timebase, any_ts_signal):
# map an arbitrary Ts, Tsd, or TsGroup to the new timebase by applying the interpolation to times (optionally cached)
With this, if I have a timebase A-described by TTL events set SynchTsA
, and a second timebase B described by SynchTsB
, and I want to map the signal TsdA
from timebase A to timebase B, I would have to write:
TsdB = SynchTsA.map_to_other_timebase(SynchTsB, TsdA)
This feels very natural to me, as I have been seeing this syntax working with anatomical spaces registration - but maybe you have other takes on it? Not sure how it feels for others!
Another option could be to allow for a Ts object to be endowed with synch events, to be provided in the initialisation. In this way, the transformation could happen under the hood mapping directly one Ts to the other. In this case, the interface would be:
cool_ts_obj_a = Ts(..., synch_events_timebase1)
cool_ts_obj_b = Ts(..., synch_events_timebase2)
# This should fail if there's no synching events available in both Tss:
cool_ts_obj_a_in_timebase_of_b = cool_ts_obj_a.map_to(cool_ts_obj_b)
Even in this second case, the code for the transformation could be kept in a specialised SynchTs
obj not to overload the Ts class.
Maybe I'm missing something, but isn't this implemented with .value_from()?
Hi @dlevenstein, thanks for joining in! Actually no, as far as my understanding of value_from()
goes: that function assumes that the two time series already share the same timebase (t=0 or t=1000 means the same for both); the case I am mentioning is when you have not-synched acquisitions, where the starting point might be different or the sampling accuracy could make the timebases slowly diverge over time. From what I understand value_from()
assumes that what is at t=100 in one time series is very close to what is at t=100.0001 in the other, which is not the case in what I am describing.
Imagine digitising two signals on two different programs, started at slightly different times, with one of the two programs counting time in a slightly unreliable way. You end up with two signals that are in different timebases, and the only way to match them is by acquiring synching signals in both timebases - eg, a TTL clock. You then need to compare the times of occurrence of those events in both timebases, and interpolate the times (which will have an offset, but also a slightly different coefficient if your program/acquisition device clock is not perfect). Does this make sense?
Ah got it - for some reason I thought value_from() had the option to interpolate to the nearest timestamp. I know we had discussed this at one point but guess it never made it into the code.
Could this be solved with adding an 'interpolate' option to value_from()?
jk, you still need the aligned timebase with the TTL pulse. Got it 😅
my only concern is to mix up objects having different time base. It's a slippery slope IMO. Pineapple assumes that all times are in the same time base. Basically, if time is re-referenced, it should be done for all objects in the workspace... One solution would to integrate this feature in the I/O instead, like a low-level wrapper loading data and re-referencing times. Something similar exists already when loading position files. We could imagine to generalize this feature. What do you guys think?
I see your point and I would understand that decision - but in that case maybe would not cover my usage, as I mentioned I have reasons not to re-reference everything only in one place at the beginning.
ATM I am using a lot the time classes independently of the rest of pyneapple, that's the angle I was coming from! Was about to write a little other package to deal with signals with this feature, and as it would have imported time classes from here, I thought it might as well be here. But I understand your point, could be a bit out of scope. Let me know!
It seems to me like a small package on its own. Interpolation is something missing in pynapple and I was thinking of wrapping np.interp with something like this
def linear_interpolate(self, ts, left=None, right=None):
"""Wrapper of the numpy interp.
See https://numpy.org/doc/stable/reference/generated/numpy.interp.html
for an explanation of the parameters.
The argument ts should be Ts, Tsd, TsdFrame to ensure interpolating from
sorted timestamps in the right unit,
Parameters
----------
ts : Ts, Tsd or TsdFrame
The object holding the timestamps
left : None, optional
Value to return for ts < tsd[0], default is tsd[0].
right : None, optional
Value to return for ts > tsd[-1], default is tsd[-1].
"""
if not isinstance(ts, (Ts, Tsd, TsdFrame)):
raise RuntimeError("First argument should be an instance of Ts, Tsd or TsdFrame")
y = np.interp(ts.index.values, self.index.values, self.values)
return Tsd(t=ts.index.values, d=y, time_support = self.time_support)
Would this be enough if you were to subclass Tsd in your own time mapping package?
Not really! From what I get your interpolation assumes that the two signals share the two timebase; by timebase, I do not mean the index of the object; I mean the fact that the value 0 s and the value 100 s means the same in the two objects, which is not the case in my situation.
My scenario has 3 different signals:
- sync signal SynchTs in timebase A, with N detected events
- same synch signal SynchTs in timebase B (eg acquired by other acquisition sustem), with N detected events
- data signal SynchTs(d) in timebase B, with n_datapoints datapoints
From which I want to find eg. SynchTs in timebase A
I need to use 1 and 2 to find a mapping between timebases (an offset and a coefficient in a linear mapping); and then apply the mapping to the time array of the signal. Below, the first iteration of the code that I ended up implementing; if you want to include it anywhere, I can open a PR, otherwise I'll just keep it in my package:
import numpy as np
from pynapple import Ts, Tsd, TsGroup
class TimeBaseTs(Ts):
"""TimeBase class for mapping signals between timebases.
The data that should populate this object is detected events from a clock
signal that gets acquired by multiple acquitision devices, eg device A and B.
The synch events can then be used to move data acquired in one of the two systems
to the timebase of the other.
"""
def _map_to(self, other_timebase: "TimeBaseTs") -> tuple[float, float]:
# TODO maybe get intersection first
# TODO deal with time support
# linear regression using least squares in numpy:
coef, offset = np.polyfit(self.index, other_timebase.index, 1)
return coef, offset
def map_times_to(
self, other_timebase: "TimeBaseTs", times: np.ndarray
) -> int | np.ndarray:
"""Core index conversion function"""
coef, off = self._map_to(other_timebase)
return (times * coef) + off
def map_ts_to(self, other_timebase: "TimeBaseTs", tsd_obj: "Ts") -> "Ts":
"""Map a Ts object to another timebase."""
new_tsd_obj = tsd_obj.copy()
new_index = self.map_times_to(other_timebase, new_tsd_obj.index)
new_tsd_obj.index = new_index
return new_tsd_obj
if __name__ == "__main__":
import numpy as np
test_signal_a = TimeBaseTs(np.array([0, 1, 2, 3]))
test_signal_b = TimeBaseTs(np.array([1.5, 2, 2.5, 3]))
new_timesig_times = np.arange(2, 3, 0.1)
signal_timebase_a = Tsd(new_timesig_times, np.random.rand(len(new_timesig_times)))
# map signal from one timebase to the other:
test_signal_a.map_ts_to(test_signal_b, signal_timebase_a)
By the way, have you considered packaging the Tsd and related products in a separate smaller repo? I would see a point there, their utility goes definitively beyond the rest of the analyses that pineapple implements!
Ok I understand now. This assume that there is a near-constant jitter being added and also that you will detect the same number of events. I think it could be included in fact. What about something like this directly into the Tsd class :
def map_linearly_ts(self, origin_ts, target_ts) -> Tsd:
# Check len(origin_ts) == len(target_ts)
# Do the linear regression
# Create a new_tsd with new timestamps
return new_tsd
And your mapping becomes :
signal_timebase_b = signal_timebase_a.map_linearly_ts(test_signal_a, test_signal_b)
I changed my whole pipeline to do the time mapping before creation of the NWB file, so I personally won't use this feature. Still happy to contribute it in the future if anyone thinks it might be useful!