Dynamic dataframe loading when using View()

Open danielbasso opened this issue 4 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Using View() on large dataframes makes the system slow or crashes it, since all of the data has to be loaded into an HTML template.

Describe the solution you'd like

Use pagination to load only the rows that are on screen. Filtering and sorting should be handled "server-side".

Describe alternatives you've considered

I use things like View(my_dataframe[1:100,]), but that compromises filtering and sorting, since they then apply only to the sliced rows.
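A minimal sketch of what "server-side" paging could look like on the R side, in base R only. `get_page` and its arguments are hypothetical names chosen for illustration and are not part of vscode-R. The point is that filtering and sorting run on the full dataframe before a page is cut, which the `[1:100,]` workaround cannot do.

```r
# Hypothetical helper (not part of vscode-R): return one page of a data frame.
# Filtering and sorting are applied to all rows *before* the page is sliced.
get_page <- function(df, page = 1L, page_size = 100L,
                     sort_by = NULL, decreasing = FALSE, keep = NULL) {
  # Optional filter: `keep` is a logical vector with one entry per row of df.
  if (!is.null(keep)) {
    df <- df[keep, , drop = FALSE]
  }
  # Optional sort on a single column, over all remaining rows.
  if (!is.null(sort_by)) {
    df <- df[order(df[[sort_by]], decreasing = decreasing), , drop = FALSE]
  }
  total <- nrow(df)
  start <- (page - 1L) * page_size + 1L
  if (start > total) {
    # Requested page is past the end: return an empty frame with the same columns.
    rows <- df[0L, , drop = FALSE]
  } else {
    end <- min(page * page_size, total)
    rows <- df[start:end, , drop = FALSE]
  }
  # The total row count lets the viewer size its scrollbar without the data.
  list(rows = rows, total_rows = total)
}

# Usage: second page of 100 rows, sorted by column `a`.
set.seed(1)
my_dataframe <- data.frame(a = rnorm(1000), d = rep("test", 1000))
first_page  <- get_page(my_dataframe, page = 1L, page_size = 100L, sort_by = "a")
second_page <- get_page(my_dataframe, page = 2L, page_size = 100L, sort_by = "a")
```

Only `page_size` rows would need to be serialized per request, so the cost of opening the viewer would no longer grow with the size of the dataframe.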

Additional context

Example:

dataframe_size <- 5000000 # 5 million rows

my_dataframe <- data.frame(a = rnorm(dataframe_size),
                 b = rnorm(dataframe_size),
                 c = rnorm(dataframe_size),
                 d = rep("test", dataframe_size))

View(my_dataframe)

Screenshot:

image

I'd like to help implement this, if possible. I have already taken a look at vsc.R and session.ts since the Ag Grid update (#708). Tips and advice are welcome.

danielbasso, Oct 24 '21 14:10

Any update on this? It should be implemented with lazy loading, like RStudio's implementation.

litn2018, Dec 19 '21 03:12

As a workaround, it is now possible to set the maximum number of rows to be loaded into the data viewer (#945). In my environment, a limit of about 100,000 rows is comfortable to use.

image

eitsupi, Feb 06 '22 04:02

@eitsupi Had a huge problem with 2FA and almost lost my account. But I'm back.

Thank you for letting me know about the workaround. I talked to @ElianHugh a while ago about this, and he gave me the idea of making the solution completely optional, so as not to disturb the current functionality.

I'm trying to find some time to delve into this issue.

danielbasso, Mar 25 '22 13:03

Hi, thanks a lot for your huge work on this extension. Complete VS Code noob here!

I'm having trouble opening and working interactively with big dataframes (> 1,000,000 rows). The data viewer is quite slow to open these dataframes, and when it does, nothing is shown in the corresponding preview editor. For small dataframes there are no particular problems.

I have enabled R:Session watcher and set the Row limit option to 50.

Do you have any update about this feature/fix?

Thanks!

d-golzato, Jan 02 '23 16:01

Hi @d-golzato, how are you?

Unfortunately I couldn't delve further into this feature. Life got a little too busy since my first (and only) contribution to this project so far.

The main problem is that R commands can trigger VS Code events/actions/etc. by writing to a file that is watched by the VS Code session; the other way around is, as far as I know, not possible at the moment.

The only way to trigger R events/actions in the session is to quite literally write to the R console. Some time ago I managed to create a very crude solution that worked as follows:

  1. Ag Grid requests a new page;
  2. VS Code, through the rTerminal 'class', literally writes to the console a command that slices a new chunk of data from the original dataframe;
  3. The result is written to a file, watched by Ag Grid, which then proceeds to update the frame with the new data.
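The R half of steps 2 and 3 can be sketched roughly as below. This is an illustration under assumptions, not the code from that experiment: `write_chunk` is a hypothetical name, and CSV is used only to keep the sketch in base R (the actual file format watched by the viewer may differ).

```r
# Hypothetical R-side helper: slice rows start_row..end_row out of the
# original data frame and write them to a file that the viewer watches.
write_chunk <- function(df, start_row, end_row, path) {
  end_row <- min(end_row, nrow(df))
  chunk <- if (start_row > end_row) {
    df[0L, , drop = FALSE]          # past the end: header only, no rows
  } else {
    df[start_row:end_row, , drop = FALSE]
  }
  write.csv(chunk, file = path, row.names = FALSE)
  invisible(nrow(chunk))            # number of rows actually written
}

# The line the extension would type into the R console for "rows 101 to 200"
# would then look like the call below (the temp file stands in for the watched file).
my_dataframe <- data.frame(a = seq_len(1000), d = rep("test", 1000))
chunk_file <- tempfile(fileext = ".csv")
rows_written <- write_chunk(my_dataframe, 101, 200, chunk_file)
```

Each page request costs one visible console command in this design, which is where the "ugly" terminal calls come from.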

It worked, but the constant terminal calls were ugly and the overall approach felt cheap. I couldn't find an elegant way to make the R calls invisible. Furthermore, I don't think that is the optimal approach to the problem: WebSocket communication, as described in #1151, could provide nicer interactions between the R and VS Code sessions and allow lots of new features.

Sorry about that.

danielbasso, Jan 04 '23 17:01

This issue is stale because it has been open for 365 days with no activity.

github-actions[bot], Jan 05 '24 01:01

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Jan 19 '24 01:01