Fix PositionGroup.fetch_position_info() returning empty DataFrame when merge IDs are non-chronological
PositionGroup.fetch_position_info() returns an empty DataFrame when position merge IDs are fetched in non-chronological order (e.g., alphabetically by UUID rather than by time). Pandas .loc[min:max] slicing on an unsorted index returns empty results.
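A minimal pandas reproduction of the underlying behavior, independent of spyglass (the frame contents and times here are illustrative):

```python
import pandas as pd

# Two sessions concatenated out of chronological order:
# the later (sleep) block lands before the earlier (run) block.
sleep = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=[10.0, 15.0, 20.0])
run = pd.DataFrame({"x": [4.0, 5.0, 6.0]}, index=[0.0, 2.5, 5.0])
df = pd.concat([sleep, run], axis=0)

# Label slicing on a non-monotonic index locates the start label (position 3)
# after the stop label (position 2), so the slice comes back empty.
empty = df.loc[0.0:20.0]

# Sorting the index first recovers all rows.
full = df.sort_index().loc[0.0:20.0]

print(len(empty), len(full))  # 0 6
```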
## Changes
- **Core fix**: Add `.sort_index()` before the `.loc[]` slice in `fetch_position_info()` (line 232)
- **Test coverage**: Add `test_position_group_non_chronological_order()` to verify sorted output
## Example
Without the fix, slicing concatenated non-chronological position data returns an empty result:
```python
# Sleep session (UUID: 141c4c08-..., time: 10-20s) fetched first
# Run session (UUID: a7585350-..., time: 0-5s) fetched second
# Concatenated index: [10, 15, 20, 0, 2.5, 5]
df.loc[0:20]               # Returns empty - index not sorted
df.sort_index().loc[0:20]  # Returns all data
```
The fix ensures correct behavior regardless of fetch order:
```python
position_info = (
    pd.concat(position_info, axis=0).sort_index().loc[min_time:max_time]
)
```
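A pure-pandas sketch of what the added test verifies (the real `test_position_group_non_chronological_order()` exercises the spyglass tables; the helper `fetch_concat` and the frames below are illustrative stand-ins for the fixed line):

```python
import pandas as pd

def fetch_concat(frames, min_time, max_time):
    # Mirrors the fixed line: sort by the time index before label slicing.
    return pd.concat(frames, axis=0).sort_index().loc[min_time:max_time]

def test_non_chronological_order():
    sleep = pd.DataFrame({"x": [1, 2, 3]}, index=[10.0, 15.0, 20.0])
    run = pd.DataFrame({"x": [4, 5, 6]}, index=[0.0, 2.5, 5.0])
    # Fetch order should not matter: both orders yield the same sorted frame.
    a = fetch_concat([sleep, run], 0.0, 20.0)
    b = fetch_concat([run, sleep], 0.0, 20.0)
    assert len(a) == 6
    assert a.index.is_monotonic_increasing
    pd.testing.assert_frame_equal(a, b)

test_non_chronological_order()
print("ok")
```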
## Original prompt

This section details the original issue you should resolve:
<issue_title>PositionGroup.fetch_position_info() breaks if merge ids order does not match chronological order of the positions</issue_title> <issue_description>I'm running into an issue where PositionGroup.fetch_position_info() returns an empty dataframe and have narrowed it down to what I think is a bug at the end of the fetch_position() function.
The two `pos_merge_ids` that I'm trying to combine into a position group are as follows:

- run session 04_r2 (happens first): `UUID('a7585350-6b4b-8f46-b4c9-e350ecffc1b6')`
- sleep session 05_s3 (happens second): `UUID('141c4c08-ec23-114f-ebbf-0168fc3b2b42')`

Although I pass these keys into `PositionGroup().create_group()` via a list of dicts in the above order, they end up getting iterated through in alphabetically-sorted order in `PositionGroup.fetch_position()`, which results in the `position_info` dataframe containing the sleep position info (chronologically later) first and the run position info (chronologically earlier) second. Because of this, once the dataframes are concatenated and indexed from `[min_time:max_time]` (see line 231 in `spyglass/decoding/v1/core.py`), this returns an empty dataframe.
Instead, I think the concatenated dataframe should be sorted by the time index prior to doing the min_time, max_time indexing.
I've replaced the original line 231 with `position_info = pd.concat(position_info, axis=0).sort_index().loc[min_time:max_time]` in my own code and that solved the issue for me!</issue_description>

<agent_instructions>Use TDD.</agent_instructions>
## Comments on the Issue (you are @copilot in this section)
@edeno: That makes sense. Thanks for filing the issue!
- Fixes LorenFrankLab/spyglass#1471
## Codecov Report
:x: Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 71.11%. Comparing base (7d50c81) to head (d355ab2).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/spyglass/decoding/v1/sorted_spikes.py | 0.00% | 1 Missing :warning: |
Additional details and impacted files
```
@@           Coverage Diff           @@
##           master    #1475   +/-   ##
=======================================
  Coverage   71.11%   71.11%
=======================================
  Files         114      114
  Lines       13307    13307
=======================================
  Hits         9463     9463
  Misses       3844     3844
```
| Flag | Coverage Δ | |
|---|---|---|
| full-tests | 71.11% <66.66%> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.