datatable
[FR]: More display options
Would it be possible to include another option to control the number of columns that are printed to the console, just as is already possible for rows?
dt.options.display.max_nrows = 1000 # head_nrows, tail_nrows, max_ncols, left_ncols, right_ncols.
Thank you very much for your consideration.
Currently, the display is limited by the width of the terminal window. Trying to show more columns than that will result in a wrap-around into multiple lines, making it unreadable.
In view of this, what exactly should your suggested options do?
These options would specify how many columns are printed. Hopefully without wrap-around into multiple lines.
max_ncols: maximum number of columns
left_ncols: number of columns from the left side of the df
right_ncols: number of columns from the right side of the df
example:
# df with 100 columns
dt.options.display.left_ncols = 30
dt.options.display.right_ncols = 30
# would print the first 30 columns and the last 30 columns, with the 40 columns in the middle not printed
If the width is bigger than the terminal window it should print it in a way that I can scroll through horizontally. I hope this makes it a bit clearer.
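To make the proposed semantics concrete, here is a minimal pure-Python sketch of how left_ncols and right_ncols could pick the columns to print. The helper name select_columns and the "..." placeholder are hypothetical and are not part of datatable; only the option names come from this request.

```python
def select_columns(names, left_ncols, right_ncols, placeholder="..."):
    """Return the column names to print, eliding the middle if needed.

    Hypothetical helper illustrating the requested options; not datatable API.
    """
    if left_ncols < 0 or right_ncols < 0:
        raise ValueError("column counts must be non-negative")
    names = list(names)
    # If the two windows cover every column, nothing is elided.
    if left_ncols + right_ncols >= len(names):
        return names
    left = names[:left_ncols]
    # Slice from an explicit index: names[-0:] would return the whole list.
    right = names[len(names) - right_ncols:]
    return left + [placeholder] + right


# A frame with 100 columns, showing the first 30 and the last 30.
columns = [f"C{i}" for i in range(100)]
shown = select_columns(columns, left_ncols=30, right_ncols=30)
print(shown[:2], shown[30], shown[-2:])
```

With these settings the 40 middle columns collapse into a single placeholder entry, so 61 entries are printed in total.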
If the width is bigger than the terminal window it should print it in a way that I can scroll through horizontally.
That's the thing, terminals don't support horizontal scrolling. So any line of text that's longer than the terminal width will get wrapped around.
I am working with PyCharm and at least the 'Python Console' supports horizontal scrolling.
https://user-images.githubusercontent.com/67552385/124980813-23983c00-e035-11eb-9213-bdcda27977d0.mov
I need this option too. I'm using Jupyter, which supports horizontal scrolling. For now I usually call to_pandas() to display a wide table, with the pandas option pandas.options.display.max_columns = 99. But to_pandas() is really slow when analyzing hundreds of millions of rows.
We don't have a specific PyCharm rendering, so this should be added to distinguish PyCharm from let's say a terminal. As for Jupyter, there is already some support, so in principle it should not be complicated to allow an arbitrary number of columns to be displayed.
import os
if 'PYCHARM_HOSTED' in os.environ:
    print("running in PyCharm")
else:
    print('normal script')
Yes, this check should be added to C++ and the new option should only work for a specific set of terminals, that don’t wrap around. We should also test frame rendering in PyCharm in general, because I don’t think we ever did that.
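As a rough illustration of the logic only (the real check would live in C++), the PYCHARM_HOSTED test could be folded into a small front-end detector. This is a sketch under assumptions: the function names detect_frontend and allows_wide_output are hypothetical, and using the JPY_PARENT_PID environment variable to recognise a Jupyter kernel is a heuristic I am assuming here, not something datatable does.

```python
import os


def detect_frontend(environ=None):
    """Guess where output is displayed: 'pycharm', 'jupyter' or 'terminal'.

    Hypothetical helper; the environment is injectable so it can be tested.
    """
    environ = os.environ if environ is None else environ
    # Same check as the PYCHARM_HOSTED snippet in this thread.
    if "PYCHARM_HOSTED" in environ:
        return "pycharm"
    # Assumption: Jupyter kernels are started with JPY_PARENT_PID set.
    if "JPY_PARENT_PID" in environ:
        return "jupyter"
    return "terminal"


def allows_wide_output(environ=None):
    """True for front ends assumed to scroll horizontally instead of wrapping."""
    return detect_frontend(environ) in {"pycharm", "jupyter"}


print(detect_frontend(), allows_wide_output())
```

A new column-count option could then be honoured only when allows_wide_output() is true, keeping the current width-limited behaviour in plain terminals.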
@oleksiyskononenko, here is an example for PyCharm:
from datatable import dt, f, by
df = dt.Frame(A=['some string'] * 10, B=['another string'] * 10)
df_wide = dt.Frame()
for x in range(6):
    df_wide = dt.cbind(df_wide, df)
Printing to console:
[Image: Screenshot 2021-09-08 at 20 40 34]
Using the 'view' functionality:
[Image: Screenshot 2021-09-08 at 20 41 24]
[Image: Screenshot 2021-09-08 at 20 42 44]
Datatable: 1.0.0
Python: 3.8.10
PyCharm: 2021.2.1 (professional)
MacOS: 11.5.2
I hope that helps a bit. You can see that in the console fewer than the total number of columns are shown, while the 'view' window displays the table with wrapping.
@Peter-Pasta Thanks, yes, we definitely never had a comprehensive test of frame rendering under PyCharm. Though what you show I can see in a standard terminal too. Looks like a bug when dealing with string columns. Should normally render as: first columns, ellipsis, last columns.
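For reference, the "first columns, ellipsis, last columns" layout can be sketched as a width-fitting routine. This is not datatable's actual rendering code, only an illustration under assumptions: fit_columns, the 2-character separator and the 1-character ellipsis column are all hypothetical choices.

```python
import shutil


def fit_columns(widths, max_width, sep=2, ellipsis_width=1):
    """Pick (left, right) column counts so that one row fits into max_width.

    Columns are taken alternately from the left and right edges, and room
    for the ellipsis column is reserved up front. Hypothetical sketch.
    """
    total = sum(widths) + sep * max(len(widths) - 1, 0)
    if total <= max_width:
        return len(widths), 0          # everything fits, nothing is elided
    used = ellipsis_width              # the ellipsis column is always shown
    left = right = 0
    while left + right < len(widths):
        take_left = left <= right
        idx = left if take_left else len(widths) - 1 - right
        cost = widths[idx] + sep       # column text plus its separator
        if used + cost > max_width:
            break
        used += cost
        if take_left:
            left += 1
        else:
            right += 1
    return left, right


# Twelve string columns of 14 characters each, fitted to the current terminal.
term_width = shutil.get_terminal_size().columns
left, right = fit_columns([14] * 12, term_width)
print(f"show {left} left columns, an ellipsis, and {right} right columns")
```

Because the fit is computed from the rendered cell widths, wide string columns reduce the number of columns shown instead of overflowing the line.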