prettytable
Support large tables in output
What steps will reproduce the problem?
1. Generate data (e.g. 2.9 million rows)
2. Load into PrettyTable using add_row()
3. Print the table
Example:
table_columns = ['file id', 'parent directory (directory id)', 'file name',
                 'type', 'extra info']
display_table = PrettyTable(table_columns)
for file_data in file_query():  # this could also be the db_cursor variant
    display_table.add_row([file_data[0], file_data[1], file_data[2], file_data[3], file_data[4]])
print(display_table)
What is the expected output? What do you see instead?
I expect the table to be displayed.
Instead, it crashes with a MemoryError as PrettyTable tries to convert
itself into one big string.
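The failure mode can be illustrated with a minimal sketch (this is not PrettyTable's actual code; the helper below only mimics what get_string() does conceptually): printing the table joins every formatted line into a single string, so peak memory is proportional to the entire rendered output.

```python
def get_string(rows):
    # Mimics the shape of PrettyTable.get_string(): every formatted
    # line is accumulated before anything is printed, then joined
    # into one big string -- O(total output) memory.
    lines = []
    for row in rows:
        lines.append("| " + " | ".join(str(c) for c in row) + " |")
    return "\n".join(lines)

print(get_string([(1, "a"), (2, "b")]))
```

With millions of rows, that single join is what blows up, even though each individual line is tiny.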
What version of the product are you using? On what operating system?
Linux (Kubuntu 14.10), Python 2.7, prettytable 0.7.2
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 15 Dec 2014 at 5:02
If you were using GitHub or git I'd submit a PR, but since you're using SVN
I've attached an updated copy of prettytable.py. This version does two
things:
1. Enables my use case above by introducing a new function - print_table() -
that prints the lines to a file (default sys.stdout) instead of building them
into a list.
Instead of:
print(myprettytable)
You do:
myprettytable.print_table()
It also takes file and end parameters, like print() does, so callers can
redirect output as desired.
2. Reduces memory use significantly by using a generator - my test went from
11-12 GB of RAM down to just under 7 GB.
The original get_string() was split into a few smaller functions so the code
is shared between get_string() and print_table().
While this version works and handles really big tables well, it could be
improved further if the formatted data did not have to be saved.
prettytable_alternate.py is an attempt to use more generators to reduce memory.
Indeed it worked - peak memory was down to just over 5 GB, and steady-state
was around 4.7 GB - but it also took much longer to output the data (it had
to format the data twice because of the row generator). However, in both
cases data starts appearing earlier than with the original implementation,
since lines can be emitted before the whole table is built up.
Perhaps you have other ideas on how to speed this up and reduce memory
consumption for very large tables.
Original comment by [email protected]
on 15 Dec 2014 at 11:35
Attachments: