Incorrect Decimal Precision in StatisticalResult.print_summary (No Style)
The print_summary method in the lifelines.statistics.StatisticalResult class was not correctly handling the decimals argument when no explicit style was provided. This led to the output table displaying the default precision of 2 decimals, even if a different value was specified.
- Problem: the value of the 'decimals' argument was not propagated properly within the class.
- Solution: set self.decimals so that the value is shared properly by all the methods of the class.
- Impact: this fix ensures that the desired decimal precision is respected in the output table, even when no specific style is chosen.
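The pattern behind the bullet points above can be sketched with a small stand-in class. This is only an illustration under stated assumptions: SummaryBefore, SummaryAfter, _default_render and _render are hypothetical names invented here, and the code is not the lifelines implementation. It only shows how a default rendering path that reads an attribute ignores the argument until that attribute is set.

```python
class SummaryBefore:
    """Hypothetical stand-in for a result object (not lifelines code)."""

    def __init__(self, value):
        self.value = value

    def print_summary(self, decimals=2, style=None):
        if style is not None:
            # Explicit style: the argument is passed straight through.
            return self._render(decimals)
        # No style: the default path reads an attribute that was never set.
        return self._default_render()

    def _default_render(self):
        # Falls back to 2 when self.decimals does not exist.
        return self._render(getattr(self, "decimals", 2))

    def _render(self, decimals):
        return f"{self.value:.{decimals}f}"


class SummaryAfter(SummaryBefore):
    """Same stand-in with the fix applied: store the argument on self."""

    def print_summary(self, decimals=2, style=None):
        self.decimals = decimals  # now visible to every helper method
        return super().print_summary(decimals=decimals, style=style)


print(SummaryBefore(3.837570).print_summary(decimals=4))                 # 3.84
print(SummaryBefore(3.837570).print_summary(decimals=4, style="ascii"))  # 3.8376
print(SummaryAfter(3.837570).print_summary(decimals=4))                  # 3.8376
```

Storing the value once on the instance means helpers do not each need a decimals parameter, at the cost of print_summary mutating the object on every call.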
For example, the issue was reproduced with the results of logrank_test and no explicit 'style' argument, which produced an incorrect table with a precision of 2 decimals (the default) instead of 10:
from lifelines import statistics as stats
from lifelines.datasets import load_rossi

rossi = load_rossi()
results = stats.logrank_test(
    durations_A=rossi.loc[rossi['fin'] == 0, 'week'],
    durations_B=rossi.loc[rossi['fin'] == 1, 'week'],
    event_observed_A=rossi.loc[rossi['fin'] == 0, 'arrest'],
    event_observed_B=rossi.loc[rossi['fin'] == 1, 'arrest'],
)
results.print_summary(decimals=10)
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<tbody>
<tr>
<th>t_0</th>
<td>-1</td>
</tr>
<tr>
<th>null_distribution</th>
<td>chi squared</td>
</tr>
<tr>
<th>degrees_of_freedom</th>
<td>1</td>
</tr>
<tr>
<th>test_name</th>
<td>logrank_test</td>
</tr>
</tbody>
</table>
</div><table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>test_statistic</th>
<th>p</th>
<th>-log2(p)</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>3.84</td>
<td>0.05</td>
<td>4.32</td>
</tr>
</tbody>
</table>
This happened only when no explicit 'style' was provided; the following calls worked well:
results.print_summary(style='html', decimals=5)
...
<tbody>
<tr>
<th>0</th>
<td>3.83757</td>
<td>0.05012</td>
<td>4.31858</td>
</tr>
</tbody>
results.print_summary(style='ascii', decimals=4)
<lifelines.StatisticalResult: logrank_test>
t_0 = -1
null_distribution = chi squared
degrees_of_freedom = 1
test_name = logrank_test
---
test_statistic p -log2(p)
3.8376 0.0501 4.3186
results.print_summary(style='latex', decimals=6)
\begin{tabular}{lrrr}
& test_statistic & p & -log2(p) \\
0 & 3.837570 & 0.050116 & 4.318582 \\
\end{tabular}
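As a quick consistency check on the outputs above, the three columns can be re-rendered at each precision with plain string formatting. This sketch uses only the six-decimal values from the LaTeX output as inputs; the render helper is a hypothetical name for illustration and does not call lifelines.

```python
# Six-decimal values copied from the LaTeX output above.
values = {"test_statistic": 3.837570, "p": 0.050116, "-log2(p)": 4.318582}


def render(decimals):
    """Format every value with a fixed number of decimals."""
    return [f"{v:.{decimals}f}" for v in values.values()]


print(render(2))  # ['3.84', '0.05', '4.32']          (the default precision)
print(render(4))  # ['3.8376', '0.0501', '4.3186']    (matches the ascii output)
print(render(5))  # ['3.83757', '0.05012', '4.31858'] (matches the html output)
```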
With the correction, the call results.print_summary(decimals=4) now results in the expected table:
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<tbody>
<tr>
<th>t_0</th>
<td>-1</td>
</tr>
<tr>
<th>null_distribution</th>
<td>chi squared</td>
</tr>
<tr>
<th>degrees_of_freedom</th>
<td>1</td>
</tr>
<tr>
<th>test_name</th>
<td>logrank_test</td>
</tr>
</tbody>
</table>
</div><table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>test_statistic</th>
<th>p</th>
<th>-log2(p)</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>3.8376</td>
<td>0.0501</td>
<td>4.3186</td>
</tr>
</tbody>
</table>
Other fitters and regression tables (Cox PH, Weibull, etc.) were not affected by this bug and continue to function as expected.
PR #1635 solves the issue.
Shoot, you are right. I reverted the changes to make some edits and didn't test thoroughly.