GH-38916: [R] Simplify dataset and table print output
Rationale for this change
When printing objects with data with lots of rows, the output is long and unwieldy.
What changes are included in this PR?
- Truncates long schema print output and adds the number of columns to dataset print output.
- Add number of columns to output so it's clear how many there are in total
Are these changes tested?
Yes
Are there any user-facing changes?
Yes
Before:
library(arrow)
x <- tibble::tibble(!!!letters, .rows = 5)
InMemoryDataset$create(x)
#> InMemoryDataset
#> "a": string
#> "b": string
#> "c": string
#> "d": string
#> "e": string
#> "f": string
#> "g": string
#> "h": string
#> "i": string
#> "j": string
#> "k": string
#> "l": string
#> "m": string
#> "n": string
#> "o": string
#> "p": string
#> "q": string
#> "r": string
#> "s": string
#> "t": string
#> "u": string
#> "v": string
#> "w": string
#> "x": string
#> "y": string
#> "z": string
arrow_table(x)
#> Table
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> $"u" <string>
#> $"v" <string>
#> $"w" <string>
#> $"x" <string>
#> $"y" <string>
#> $"z" <string>
record_batch(x)
#> RecordBatch
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> $"u" <string>
#> $"v" <string>
#> $"w" <string>
#> $"x" <string>
#> $"y" <string>
#> $"z" <string>
After:
library(arrow)
x <- tibble::tibble(!!!letters, .rows = 5)
InMemoryDataset$create(x)
#> InMemoryDataset
#> 26 columns
#> "a": string
#> "b": string
#> "c": string
#> "d": string
#> "e": string
#> "f": string
#> "g": string
#> "h": string
#> "i": string
#> "j": string
#> "k": string
#> "l": string
#> "m": string
#> "n": string
#> "o": string
#> "p": string
#> "q": string
#> "r": string
#> "s": string
#> "t": string
#> ...
#> Use `schema()` to see entire schema
arrow_table(x)
#> Table
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> ...
#> Use `schema()` to see entire schema
record_batch(x)
#> RecordBatch
#> 5 rows x 26 columns
#> $"a" <string>
#> $"b" <string>
#> $"c" <string>
#> $"d" <string>
#> $"e" <string>
#> $"f" <string>
#> $"g" <string>
#> $"h" <string>
#> $"i" <string>
#> $"j" <string>
#> $"k" <string>
#> $"l" <string>
#> $"m" <string>
#> $"n" <string>
#> $"o" <string>
#> $"p" <string>
#> $"q" <string>
#> $"r" <string>
#> $"s" <string>
#> $"t" <string>
#> ...
#> Use `schema()` to see entire schema
- Closes: #38916
:warning: GitHub issue #38916 has been automatically assigned in GitHub to PR creator.
I see this still has some tests failing...just making sure you weren't waiting on any of us for a review!
I see this still has some tests failing...just making sure you weren't waiting on any of us for a review!
Nah, life has just been busy, but thanks for checking in on it. I'll ping you once I get time to take a look at it again!
@paleolimbot This is now ready for review; mind giving this a look over when you get the chance?
Thanks @thisisnic. I see this as a nice improvement and don't see any issues with it. I left a few notes.
Thanks @jonkeane, I agree we should take a look at #32110 soon, but merge this as an interim thing in the meantime.
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit ac1708ce65e15a87fadec39e761731b3d916fb19.
There was 1 benchmark result with an error:
- Commit Run on
ursa-i9-9960xat 2024-03-14 05:40:53Z
There was 1 benchmark result indicating a performance regression:
- Commit Run on
ursa-i9-9960xat 2024-03-14 05:40:53Z
The full Conbench report has more details. It also includes information about 19 possible false positives for unstable benchmarks that are known to sometimes produce them.