Modules containing data (tables, relations)

Open eitsupi opened this issue 1 year ago • 3 comments

Sharing data is often cumbersome when creating reproducible examples.

R ships with some built-in datasets, so we can access the data immediately by typing the name of the table (a data.frame). For example, we can check how the head() function behaves by typing the following in the webR REPL (https://webr.r-wasm.org/latest/).

mtcars |> head()

Size is a concern, but why not ship a standard module that contains a useful table for illustrating typical operations?


As an aside, I don't think mtcars is a very typical dataset, because it contains only numeric types and no missing values. However, I often use it because it is short and the name is easy to remember. I think the palmerpenguins dataset is commonly used these days, but it may be a bit large. https://allisonhorst.github.io/palmerpenguins/

eitsupi avatar Dec 21 '23 23:12 eitsupi

I like the idea a lot!

I guess the examples would have to be very small (like just a few dozen rows), since we would have to create each row with a SELECT statement.

But I think that's still quite useful for examples / bug reports / a common base...
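To make the "one SELECT per row" idea concrete, here is a rough sketch in Python (not PRQL, and not how the compiler does it). It turns a tiny table into a chain of `SELECT ... UNION ALL` statements and runs it against an in-memory SQLite database. The helper names `sql_literal`, `rows_to_select` and `head` are made up for illustration, and the three rows are just a small mtcars-style sample.

```python
import sqlite3

# Small mtcars-style sample, for illustration only.
COLUMNS = ("name", "mpg", "cyl")
ROWS = [
    ("Mazda RX4", 21.0, 6),
    ("Datsun 710", 22.8, 4),
    ("Hornet 4 Drive", 21.4, 6),
]


def sql_literal(value):
    """Render a Python value (None, str, int or float) as a SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        # Escape single quotes by doubling them.
        return "'" + value.replace("'", "''") + "'"
    return repr(value)


def rows_to_select(columns, rows):
    """Build one SELECT per row and join them with UNION ALL."""
    selects = []
    for row in rows:
        items = ", ".join(
            f"{sql_literal(value)} AS {column}"
            for column, value in zip(columns, row)
        )
        selects.append(f"SELECT {items}")
    return "\nUNION ALL\n".join(selects)


def head(columns, rows, n):
    """Hypothetical helper: return at most n rows of the inline table."""
    query = f"SELECT * FROM ({rows_to_select(columns, rows)}) AS t LIMIT {int(n)}"
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(rows_to_select(COLUMNS, ROWS))
    print(head(COLUMNS, ROWS, 2))
```

The generated SQL grows by one SELECT per row, which is why a few dozen rows seems like a sensible upper bound for built-in example data.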

max-sixty avatar Dec 21 '23 23:12 max-sixty

This could be implemented as an external package (see #2491) that would be downloaded by a cargo/npm-like tool, cached, and made available within the current project.

This way, we'd avoid including this data in the released compiler binary, but would also have it easily available.

aljazerzen avatar Jan 14 '24 21:01 aljazerzen

> This could be implemented as an external package (see #2491), that would be downloaded by cargo/npm-like-tool,

That could be cool, though it's also a decent lift. I would probably vote to push this out until we know more about what packages will look like...

max-sixty avatar Jan 14 '24 23:01 max-sixty