tableone improve output format

Open tompollard opened this issue 5 years ago • 2 comments

The current approach relies on pandas to export the table to LaTeX, CSV, etc. Add a custom export path to improve the quality of the output.

Consider tabulate:

from tableone import TableOne
import pandas as pd
import tabulate

# load sample data into a pandas dataframe
url = "https://raw.githubusercontent.com/tompollard/tableone/master/data/pn2012_demo.csv"
data = pd.read_csv(url)
overall_table = TableOne(data, label_suffix=True)

# move the index levels into ordinary columns before passing the table to tabulate
x = overall_table.tableone.reset_index()
t = tabulate.tabulate(x, tablefmt="grid", headers=['isnull', 'overall'], showindex=False)
print(t)

+-------------------+------+----------+--------------+
|                   |      | isnull   | overall      |
+===================+======+==========+==============+
| n                 |      |          | 1000         |
+-------------------+------+----------+--------------+
| Age, mean (SD)    |      | 0        | 65.0 (17.2)  |
+-------------------+------+----------+--------------+
| SysABP, mean (SD) |      | 291      | 114.3 (40.2) |
+-------------------+------+----------+--------------+
| Height, mean (SD) |      | 475      | 170.1 (22.1) |
+-------------------+------+----------+--------------+
| Weight, mean (SD) |      | 302      | 82.9 (23.8)  |
+-------------------+------+----------+--------------+
| ICU, n (%)        | CCU  | 0        | 162 (16.2)   |
+-------------------+------+----------+--------------+
| ICU, n (%)        | CSRU |          | 202 (20.2)   |
+-------------------+------+----------+--------------+
| ICU, n (%)        | MICU |          | 380 (38.0)   |
+-------------------+------+----------+--------------+
| ICU, n (%)        | SICU |          | 256 (25.6)   |
+-------------------+------+----------+--------------+
| MechVent, n (%)   | 0    | 0        | 540 (54.0)   |
+-------------------+------+----------+--------------+
| MechVent, n (%)   | 1    |          | 460 (46.0)   |
+-------------------+------+----------+--------------+
| LOS, mean (SD)    |      | 0        | 14.2 (14.2)  |
+-------------------+------+----------+--------------+
| death, n (%)      | 0    | 0        | 864 (86.4)   |
+-------------------+------+----------+--------------+
| death, n (%)      | 1    |          | 136 (13.6)   |
+-------------------+------+----------+--------------+

# blank out repeated variable names so that each label is printed only once
isdupe = x.duplicated(subset='variable')
x['variable'] = x['variable'].where(~isdupe, '')
t = tabulate.tabulate(x, tablefmt="grid", headers=['isnull', 'overall'], showindex=False)
print(t)

+-------------------+------+----------+--------------+
|                   |      | isnull   | overall      |
+===================+======+==========+==============+
| n                 |      |          | 1000         |
+-------------------+------+----------+--------------+
| Age, mean (SD)    |      | 0        | 65.0 (17.2)  |
+-------------------+------+----------+--------------+
| SysABP, mean (SD) |      | 291      | 114.3 (40.2) |
+-------------------+------+----------+--------------+
| Height, mean (SD) |      | 475      | 170.1 (22.1) |
+-------------------+------+----------+--------------+
| Weight, mean (SD) |      | 302      | 82.9 (23.8)  |
+-------------------+------+----------+--------------+
| ICU, n (%)        | CCU  | 0        | 162 (16.2)   |
+-------------------+------+----------+--------------+
|                   | CSRU |          | 202 (20.2)   |
+-------------------+------+----------+--------------+
|                   | MICU |          | 380 (38.0)   |
+-------------------+------+----------+--------------+
|                   | SICU |          | 256 (25.6)   |
+-------------------+------+----------+--------------+
| MechVent, n (%)   | 0    | 0        | 540 (54.0)   |
+-------------------+------+----------+--------------+
|                   | 1    |          | 460 (46.0)   |
+-------------------+------+----------+--------------+
| LOS, mean (SD)    |      | 0        | 14.2 (14.2)  |
+-------------------+------+----------+--------------+
| death, n (%)      | 0    | 0        | 864 (86.4)   |
+-------------------+------+----------+--------------+
|                   | 1    |          | 136 (13.6)   |
+-------------------+------+----------+--------------+
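The blanking step can be tried in isolation, without tableone or the demo CSV. The following is a minimal sketch, assuming a hand-built frame whose rows are copied from the table above; the column name `level` and the helper `blank_repeated` are hypothetical names introduced only for illustration.

```python
import pandas as pd

# A few rows copied from the table above. "level" is an assumed column name.
rows = [
    ("n", "", "", "1000"),
    ("Age, mean (SD)", "", "0", "65.0 (17.2)"),
    ("ICU, n (%)", "CCU", "0", "162 (16.2)"),
    ("ICU, n (%)", "CSRU", "", "202 (20.2)"),
    ("ICU, n (%)", "MICU", "", "380 (38.0)"),
    ("ICU, n (%)", "SICU", "", "256 (25.6)"),
    ("death, n (%)", "0", "0", "864 (86.4)"),
    ("death, n (%)", "1", "", "136 (13.6)"),
]
x = pd.DataFrame(rows, columns=["variable", "level", "isnull", "overall"])


def blank_repeated(frame, column):
    """Return a copy in which every repeat of a value in `column` becomes ''."""
    out = frame.copy()
    # duplicated() flags every occurrence after the first one
    is_repeat = out.duplicated(subset=column)
    # where() keeps values where the condition is True and substitutes '' elsewhere
    out[column] = out[column].where(~is_repeat, "")
    return out


deduped = blank_repeated(x, "variable")
print(deduped.to_string(index=False))
```

One caveat with this approach: `duplicated` flags repeats anywhere in the column, not only adjacent ones, so it relies on the rows for each variable being contiguous and on variable labels being unique.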

Nov 13 '18 20:11 tompollard

The tabulate method was added in 0.6.4:

# import libraries
from tableone import TableOne
import pandas as pd

# load sample data into a pandas dataframe
url = "https://raw.githubusercontent.com/tompollard/tableone/master/data/pn2012_demo.csv"
data = pd.read_csv(url)

table = TableOne(data, label_suffix=True)
print(table.tabulate(tablefmt="fancy_grid"))

outputs:

╒═══════════════════╤══════╤═══════════╤══════════════╕
│                   │      │ Missing   │ Overall      │
╞═══════════════════╪══════╪═══════════╪══════════════╡
│ n                 │      │           │ 1000         │
├───────────────────┼──────┼───────────┼──────────────┤
│ Age, mean (SD)    │      │ 0         │ 65.0 (17.2)  │
├───────────────────┼──────┼───────────┼──────────────┤
│ SysABP, mean (SD) │      │ 291       │ 114.3 (40.2) │
├───────────────────┼──────┼───────────┼──────────────┤
│ Height, mean (SD) │      │ 475       │ 170.1 (22.1) │
├───────────────────┼──────┼───────────┼──────────────┤
│ Weight, mean (SD) │      │ 302       │ 82.9 (23.8)  │
├───────────────────┼──────┼───────────┼──────────────┤
│ ICU, n (%)        │ CCU  │ 0         │ 162 (16.2)   │
├───────────────────┼──────┼───────────┼──────────────┤
│                   │ CSRU │           │ 202 (20.2)   │
├───────────────────┼──────┼───────────┼──────────────┤
│                   │ MICU │           │ 380 (38.0)   │
├───────────────────┼──────┼───────────┼──────────────┤
│                   │ SICU │           │ 256 (25.6)   │
├───────────────────┼──────┼───────────┼──────────────┤
│ MechVent, n (%)   │ 0    │ 0         │ 540 (54.0)   │
├───────────────────┼──────┼───────────┼──────────────┤
│                   │ 1    │           │ 460 (46.0)   │
├───────────────────┼──────┼───────────┼──────────────┤
│ LOS, mean (SD)    │      │ 0         │ 14.2 (14.2)  │
├───────────────────┼──────┼───────────┼──────────────┤
│ death, n (%)      │ 0    │ 0         │ 864 (86.4)   │
├───────────────────┼──────┼───────────┼──────────────┤
│                   │ 1    │           │ 136 (13.6)   │
╘═══════════════════╧══════╧═══════════╧══════════════╛

Nov 18 '19 04:11 tompollard

It would be good to left-align the index columns in the dataframe. See the discussion of how to achieve this with the pandas Styler at: https://github.com/pandas-dev/pandas/issues/39602

Currently the columns are centered when rendered in a notebook, which looks awkward:

[Screenshot, 2023-05-02: table rendered in a notebook with centered index columns]
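One possible way to left-align the index cells is to attach CSS rules through the pandas Styler. This is a rough sketch, not necessarily the approach discussed in the linked pandas issue; `LEFT_ALIGN_INDEX` and `left_align_index` are hypothetical names, and the selectors assume the `row_heading` and `index_name` classes that Styler puts on index cells in its HTML output.

```python
import pandas as pd

# CSS rules for the header cells that Styler emits for the index.
# "th.row_heading" targets the index labels, "th.index_name" the index-name cells.
LEFT_ALIGN_INDEX = [
    {"selector": "th.row_heading", "props": [("text-align", "left")]},
    {"selector": "th.index_name", "props": [("text-align", "left")]},
]


def left_align_index(frame: pd.DataFrame):
    """Return a Styler for `frame` with left-aligned index cells.

    Styler needs the optional jinja2 dependency to be installed.
    """
    # overwrite=False keeps any table styles that were already set
    return frame.style.set_table_styles(LEFT_ALIGN_INDEX, overwrite=False)
```

In a notebook, evaluating `left_align_index(table.tableone)` as the last expression of a cell should display the styled table. Whether the notebook's own stylesheet overrides these rules has not been checked here.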

May 02 '23 19:05 tompollard
