
Assistant: When Data Viewer is active, include its data and metadata as context

Open jmcphers opened this issue 5 months ago • 3 comments

When a text file is open in the active editor, it is automatically included as context in Assistant.

Image

However, if the Data Viewer is the active editor, nothing is automatically included as context.

Image

This makes it difficult to ask Assistant about the data that's right in front of you. When a data viewer is the active editor, we could attach an implicit context object (as we do for files and consoles). This object could contain:

  • information for the model about the shape of the data
  • some of the data itself
  • if the data is in an R or Python session, the identifier of the session and the variable name
  • if the data is from a connection, information about the connection
  • if the data is from a CSV file, the location of the CSV file and perhaps a sample thereof
  • (and so on, i.e. the provenance of the data)
  • enough information to perform tool calls to get more complete data (e.g. getTableSummary)
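
A rough TypeScript sketch of what such a context object might look like. Every type, field, and function name here is hypothetical and not taken from the Positron codebase; only the `getTableSummary` tool name comes from the list above.

```typescript
// Hypothetical sketch: none of these names exist in Positron as written.

// Where the data came from (its provenance).
type DataProvenance =
  | { kind: "session"; sessionId: string; variableName: string }
  | { kind: "connection"; connectionName: string }
  | { kind: "file"; path: string };

// The implicit context attached when a Data Viewer is the active editor.
interface DataViewerContext {
  shape: { rows: number; columns: number };
  columnNames: string[];
  sampleRows: string[][]; // only a small sample, never the full table
  provenance: DataProvenance;
  followUpTools: string[]; // tools the model may call for more detail
}

function describeProvenance(p: DataProvenance): string {
  switch (p.kind) {
    case "session":
      return `variable ${p.variableName} in session ${p.sessionId}`;
    case "connection":
      return `connection ${p.connectionName}`;
    case "file":
      return `file ${p.path}`;
  }
}

// Render the context as plain text that could be placed in a prompt.
function renderContext(ctx: DataViewerContext): string {
  return [
    `Data shape: ${ctx.shape.rows} rows x ${ctx.shape.columns} columns`,
    `Columns: ${ctx.columnNames.join(", ")}`,
    `Source: ${describeProvenance(ctx.provenance)}`,
    `Sample rows included: ${ctx.sampleRows.length}`,
    `Tools for more detail: ${ctx.followUpTools.join(", ")}`,
  ].join("\n");
}
```

Modelling provenance as a tagged union keeps the session, connection, and file cases separate, so each carries only the fields that make sense for it.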

jmcphers avatar Jul 29 '25 15:07 jmcphers

This could extend the same idea proposed for R/Python in https://github.com/posit-dev/positron/issues/8343, going through the Data Explorer's profiling/schema tools, since we won't have the data in memory the way we do for R/Python.

jthomasmock avatar Jul 29 '25 15:07 jthomasmock

When we construct context:

  • check if focused/active editor is a Data Explorer
  • if it is, tell Assistant that data is relevant (may want to call table summary tool)
  • make it visible to the user that we are including implicit file context (similar to attaching a code file, user can opt-out if preferred)
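
The steps above could be sketched roughly as follows in TypeScript. The `ActiveEditor` and `ImplicitContext` types and the `buildImplicitContext` function are hypothetical stand-ins, not Positron APIs; the hint text simply points the model at the `getTableSummary` tool mentioned earlier in this issue.

```typescript
// Hypothetical sketch: these types do not mirror real Positron interfaces.

interface ActiveEditor {
  kind: "text" | "dataExplorer" | "other";
  label: string; // what the user sees, e.g. a variable or file name
}

interface ImplicitContext {
  label: string;    // shown in the chat UI so the user knows it is attached
  hint: string;     // instruction passed to the model
  enabled: boolean; // false when the user has opted out
}

// Build implicit context for a Data Explorer editor only.
// Other editor kinds are out of scope for this sketch and return undefined.
function buildImplicitContext(
  editor: ActiveEditor | undefined,
  optedOutLabels: Set<string>
): ImplicitContext | undefined {
  // Step 1: check whether the focused/active editor is a Data Explorer.
  if (!editor || editor.kind !== "dataExplorer") {
    return undefined;
  }
  // Step 2: tell the model the data is relevant and how to learn more.
  const hint =
    `The user is viewing "${editor.label}" in the Data Explorer. ` +
    `Consider calling getTableSummary before answering questions about it.`;
  // Step 3: keep the attachment visible, and respect a user opt-out.
  return {
    label: editor.label,
    hint,
    enabled: !optedOutLabels.has(editor.label),
  };
}
```

Returning a disabled context rather than `undefined` for an opted-out editor would let the UI keep showing the attachment in its switched-off state.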

sharon-wang avatar Sep 02 '25 17:09 sharon-wang

@jmcphers +1. I'd be careful with how tokens are handled, though: when I ask Assistant questions about a CSV file, I thankfully get an error saying I'd use too many tokens. I had a 20k-row data set and asked what the largest value was. Not a great example use case, since we can just filter for the largest value in this case, but I'm glad Assistant pointed out that it would use too many tokens.
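
One way to keep the attached sample within a token budget is sketched below in TypeScript. The function names are hypothetical, and the four-characters-per-token estimate is a crude assumption rather than a real tokenizer; actual counts depend on the model.

```typescript
// Hypothetical sketch: a crude token budget for sampled rows.

// Assumption: roughly 4 characters per token. Real tokenizers differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface SampleResult {
  included: string[][]; // rows that fit within the budget
  omitted: number;      // rows left out, so the model can be told
}

// Take rows from the top until the next row would exceed the budget.
function sampleWithinBudget(rows: string[][], maxTokens: number): SampleResult {
  const included: string[][] = [];
  let used = 0;
  for (const row of rows) {
    const cost = estimateTokens(row.join(",") + "\n");
    if (used + cost > maxTokens) {
      break;
    }
    included.push(row);
    used += cost;
  }
  return { included, omitted: rows.length - included.length };
}
```

Reporting the number of omitted rows gives the model a reason to call a summary tool for a question like "what is the largest value" instead of reading every row.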

rodrigosf672 avatar Oct 01 '25 20:10 rodrigosf672