
Fix: Show in-progress coalesced tool calls in the transcript overlay

Open Chriss4123 opened this pull request 2 weeks ago • 3 comments

Fixes #7998

Problem

The TUI coalesces “exploring” commands (reads/searches/lists) into an in-flight active_cell so the main view shows a single grouped • Exploring / • Explored block. That cell is not committed into App::transcript_cells until a later flush boundary, but the Ctrl+T transcript overlay was rendered only from App::transcript_cells.

Result: mid-turn, the transcript view can appear to drop the “Explored” tool calls until something later forces a flush.

(Screenshots before the fix: pr-issue-1, pr-issue-2)

What changed

Correctness: render the in-flight cell in the transcript overlay

  • Expose the current in-flight history cell’s transcript render identity + output:
    • ChatWidget::active_cell_transcript_key() returns a cheap key describing the current active_cell transcript tail (revision, is_stream_continuation, and optional animation metadata).
    • ChatWidget::active_cell_transcript_lines(width) returns the Vec<Line> transcript lines for the current active_cell.
  • Add a render-only “live tail” to the transcript overlay:
    • TranscriptOverlay::sync_live_tail(...) appends the active cell’s transcript lines after committed transcript cells, matching existing blank-line separation rules and preserving “pinned to bottom” scrolling when applicable.
  • Keep the live tail synchronized while the overlay is open:
    • On each TuiEvent::Draw while Overlay::Transcript is active, App::overlay_forward_event syncs the live tail from the current active_cell.
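A minimal sketch of the render-only live tail described above, using simplified stand-in types (the real code works with ratatui `Line` values and the actual `ChatWidget` state; `TailKey` and the `String` lines here are illustrative assumptions):

```rust
// Sketch only: `TailKey` and the String-based lines are simplified
// stand-ins for the real transcript key and ratatui `Line` values.
#[derive(Clone, Copy, PartialEq, Eq)]
struct TailKey {
    revision: u64,
    is_stream_continuation: bool,
}

struct TranscriptOverlay {
    committed: Vec<String>,  // cells already flushed into transcript_cells
    live_tail: Vec<String>,  // render-only; never committed
    tail_key: Option<TailKey>,
}

impl TranscriptOverlay {
    /// Refresh the live tail only when its identity changed; `lines` is
    /// the (potentially expensive) render closure.
    fn sync_live_tail(&mut self, key: TailKey, lines: impl FnOnce() -> Vec<String>) {
        if self.tail_key != Some(key) {
            self.live_tail = lines();
            self.tail_key = Some(key);
        }
    }

    /// What the overlay draws: committed cells, a blank-line separator,
    /// then the in-flight tail.
    fn render(&self) -> Vec<String> {
        let mut out = self.committed.clone();
        if !self.live_tail.is_empty() {
            out.push(String::new());
            out.extend(self.live_tail.iter().cloned());
        }
        out
    }
}

fn main() {
    let mut overlay = TranscriptOverlay {
        committed: vec!["• Explored".into()],
        live_tail: Vec::new(),
        tail_key: None,
    };
    let key = TailKey { revision: 1, is_stream_continuation: false };
    overlay.sync_live_tail(key, || vec!["• Exploring".into()]);
    assert_eq!(overlay.render(), ["• Explored", "", "• Exploring"]);
}
```

Because the tail lives only in the overlay's render path, `transcript_cells` is never mutated mid-turn.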

Efficiency + side-effect fixes

  • Avoid recomputing expensive tail transcript lines on every redraw:
    • ChatWidget maintains an active_cell_revision counter that bumps when the active cell changes in ways that affect transcript output.
    • TranscriptOverlay caches the live tail’s already-rendered Line list and only recomputes when the cache key changes (revision/width/animation tick).
  • Preserve animated in-flight visuals under caching (no “frozen spinner/shimmer”):
    • HistoryCell::transcript_animation() allows a cell to opt into time-based transcript updates while in-flight.
    • The live tail cache key incorporates the current animation tick when present, and the overlay schedules the next frame only when the tail is visible.
  • Fix the one-frame wrapping mismatch on terminal resize:
    • Tui::screen_width_hint() provides an up-to-date width hint (updated on resize events and during draw) so the live tail wraps correctly on the first post-resize frame.

Parity + tests

  • Implemented for both codex-rs/tui and codex-rs/tui2 to keep feature parity.
  • Added tests (including snapshot coverage) for live-tail rendering, separator rules, caching no-ops, visibility tracking, and animation tick invalidation behavior.

(Screenshots after the fix: pr-fixed-1, pr-fixed-2)

https://github.com/user-attachments/assets/0ec2b585-34f5-41b4-a309-63a2c89bce71

Why this is the best approach

  • Preserves the intentional “exploring” coalescing behavior (no change to when/why exec groups are flushed).
  • Avoids correctness hazards from forcing a flush while a command is in-flight (which can split one logical exec group across multiple history cells).
  • Keeps backtrack/highlighting semantics stable by not injecting the live tail into transcript_cells (it’s render-only).
  • Avoids a redraw-time performance cliff: scrolling/key-repeat redraws don’t repeatedly re-highlight/rewrap the active exec transcript unless the underlying tail actually changes.

Testing

  • cargo test -p codex-tui
  • cargo test -p codex-tui2

Chriss4123 avatar Dec 14 '25 03:12 Chriss4123

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

github-actions[bot] avatar Dec 14 '25 03:12 github-actions[bot]

I have read the CLA Document and I hereby sign the CLA

Chriss4123 avatar Dec 14 '25 03:12 Chriss4123

@codex review

Chriss4123 avatar Dec 16 '25 03:12 Chriss4123

Thanks for the contribution. The change you've proposed here affects some tricky parts of the codex logic, and there's a higher-than-normal regression risk with a change like this. That means we need to apply a high bar for a change like this.

The PR is quite large and touches a lot of files. I'm guessing that it was AI generated because it doesn't seem to be based on a deep understanding of the code. For those reasons, I'm going to pass on this fix.

If you'd like to contribute a more surgical fix, we'd be willing to review that.

etraut-openai avatar Dec 17 '25 19:12 etraut-openai

> Thanks for the contribution. The change you've proposed here affects some tricky parts of the codex logic, and there's a higher-than-normal regression risk with a change like this. That means we need to apply a high bar for a change like this.
>
> The PR is quite large and touches a lot of files. I'm guessing that it was AI generated because it doesn't seem to be based on a deep understanding of the code. For those reasons, I'm going to pass on this fix.
>
> If you'd like to contribute a more surgical fix, we'd be willing to review that.

@etraut-openai really appreciate the feedback.

First: "this looks AI generated". For transparency: ~55-60% of this commit was generated with GPT‑5.2 (xhigh) via Codex. Obviously it sucks when you spend a lot of time on a commit for it to be called AI slop but I completely understand with the current times we're living in it's not worth it to waste your time reviewing potential AI code (I deal with these exact same issues).

I did spend a lot of time reading the architecture and planning (around five days in total). So I'm really sorry my code wasn't up to standard; there must have been things I missed, especially in the huge monolithic TUI files.

Part of the PR being so big was me trying to be a perfectionist. My original fix in 9a60868c1372bdcb80aa45d83b12aed1e0c3a273 was pretty surgical but it had some side-effects.

That naive approach recomputed and rebuilt the tail on every Draw tick, including during scroll/page navigation, highlight updates, etc. That meant re-running the wrapping and highlighting work for the active exec cell even when nothing had changed. On my machine it was unnoticeable, but it's still wasted work in the hottest interactive loop.

Then I got a bit carried away: for example, implementing optimizations to skip rendering animations when off screen, fixing a one-frame render mismatch during resizing, among others. I failed to realize I was trying to be hyper-efficient and perfect in an imperfect world, and that's what blew the PR up.

I still believe in my code's quality via countless re-reads and cross-checking by both GPT-5.2 xhigh and GPT-5.2 Pro. However, I completely understand and agree with your judgement that it's doing too much and touching too much surface area.

I went back and made this fix far more surgical, addressing the necessary performance side-effects (the unnecessary re-renders and the one-frame mismatch) while touching much less of the codebase and keeping the changes as constrained as possible.

Once again, I've analyzed and re-read this new implementation countless times and cross-checked it via GPT-5.2 xhigh and GPT-5.2 Pro numerous times in a ruthless manner.

Again, I apologize for wasting your time with this current PR. I've submitted a new one here that I think you'd enjoy much more.

If you have the time please take a look.

Chriss4123 avatar Dec 18 '25 07:12 Chriss4123

Thanks for taking another swing at this. The new PR does look better.

As I mentioned, this is a sensitive part of the code — one that has a history of regressions, so any change here is going to require significant review and testing. The devs on the team who are most familiar with this code are busy with high-priority work right now, so it might be some time before we have a chance to review your new PR. Just wanted to set expectations.

etraut-openai avatar Dec 18 '25 18:12 etraut-openai

> Thanks for taking another swing at this. The new PR does look better.
>
> As I mentioned, this is a sensitive part of the code — one that has a history of regressions, so any change here is going to require significant review and testing. The devs on the team who are most familiar with this code are busy with high-priority work right now, so it might be some time before we have a chance to review your new PR. Just wanted to set expectations.

No problem waiting. This isn't really an issue in a blocking sense, but to me it's just really sick to see all the commands the agent is running in realtime. Thanks once again for your feedback.

Chriss4123 avatar Dec 18 '25 22:12 Chriss4123