chainladder-python
CaseOutstanding()'s Approach 2 as Proposed by Friedland (2010)
The second issue in #159: the CaseOutstanding() method can only handle "Approach 1", as demonstrated by Friedland, page 265 (or 271 by count).
We are missing Approach 2, as discussed and demonstrated on page 268 (or 274 by count).
Is there a way to get the values out of cl.DevelopmentConstant()?
import chainladder as cl

# `triangle` is assumed to be an existing cl.Triangle
patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02}
triangle_custompattern = cl.DevelopmentConstant(
    patterns=patterns, style="ldf"
).fit_transform(triangle)
triangle_custompattern.ldf_
I want to get the ldf_ or cdf_ from the pattern passed to cl.DevelopmentConstant(), but I think the only way to get something back is to do a fit_transform()?
You can ask for the hyperparameter back from the unfitted estimator (triangle_custompattern.patterns), but I'm sure that's not what you're looking for. Can you elaborate?
I actually think .patterns works, thanks!
The more I think about this, the more I feel like I shouldn't be using .patterns. Instead, cl.DevelopmentConstant() should have .ldf_ and .cdf_ attributes that take .patterns and style into account.
Is there a reason why we only have .ldf_ and .cdf_ after fit(...)-ing a triangle object?
Does it make sense to have a new function cl.DevelopmentConstant() that returns .ldf_ and .cdf_ considering .patterns and .style?
In fact, from the docstrings, I think the intention was for the estimator to have .ldf_ and .cdf_?
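To make the request concrete, here is a minimal sketch of what deriving both sets of factors from .patterns and style could look like, without fitting anything. The helper name patterns_to_ldf_cdf is hypothetical and not part of chainladder, and it assumes the last factor in the dict is the development from the last age to ultimate.

```python
def patterns_to_ldf_cdf(patterns, style="ldf"):
    """Hypothetical helper: derive both LDFs and CDFs from a patterns dict.

    `patterns` maps development age -> factor; `style` says whether those
    factors are age-to-age ("ldf") or age-to-ultimate ("cdf").
    Assumption: the factor at the last age runs from that age to ultimate.
    """
    ages = sorted(patterns)
    factors = [float(patterns[age]) for age in ages]

    if style == "ldf":
        ldf = factors
        # CDF at an age is the product of that age's LDF and all later LDFs.
        cdf, running = [], 1.0
        for factor in reversed(ldf):
            running *= factor
            cdf.append(running)
        cdf.reverse()
    elif style == "cdf":
        cdf = factors
        # LDF at an age is the ratio of consecutive CDFs; the last LDF is
        # the last CDF itself (the remaining development to ultimate).
        ldf = [cur / nxt for cur, nxt in zip(cdf, cdf[1:])] + [cdf[-1]]
    else:
        raise ValueError("style must be 'ldf' or 'cdf'")

    return dict(zip(ages, ldf)), dict(zip(ages, cdf))


patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02}
ldf, cdf = patterns_to_ldf_cdf(patterns, style="ldf")
print(ldf)
print(cdf)
```

Passing the resulting cdf dict back in with style="cdf" should reproduce the original LDFs, which is a quick sanity check on the conversion.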
> Is there a reason why we only have .ldf_ and .cdf_ after fit(...)-ing a triangle object?
Yes. Per scikit-learn conventions, it is expected that parameters with trailing _ are not to be set inside the init method. All and only the public attributes set by fit have a trailing _.
Also per scikit-learn conventions, parameters accessible pre-fit are hyperparameters and should not be validated or altered at init. - source
Just within chainladder itself, the creation of ldf_ and cdf_ does in fact rely on the triangle X, such as in this example. We supply many more ages than are necessary for the triangle being fitted. The estimator prunes the ldf_ to fit the triangle:
import chainladder as cl
triangle = cl.load_sample('ukmotor')
patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02, 84: 1.01, 96: 1.005, 108: 1.0025}
triangle_custompattern = cl.DevelopmentConstant(
    patterns=patterns, style="ldf"
).fit_transform(triangle)
print(triangle_custompattern.ldf_)
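For readers less familiar with the scikit-learn convention, here is a toy sketch of the same idea in plain Python. ToyDevelopmentConstant is a hypothetical class, not chainladder's implementation; it only handles style="ldf", and the development ages passed to fit are assumed for illustration. The point is that __init__ stores hyperparameters untouched, while ldf_ only exists after fit because it depends on the data.

```python
class ToyDevelopmentConstant:
    """Toy estimator illustrating the trailing-underscore convention."""

    def __init__(self, patterns=None, style="ldf"):
        # Hyperparameters are stored as given: no validation, no alteration.
        self.patterns = patterns
        self.style = style

    def fit(self, development_ages):
        # Only the "ldf" style is handled in this sketch.
        if self.style != "ldf":
            raise NotImplementedError("sketch only supports style='ldf'")
        ages = sorted(development_ages)
        # A triangle with n development ages has n - 1 age-to-age factors,
        # so any supplied pattern beyond that is pruned here.
        self.ldf_ = {age: self.patterns[age] for age in ages[:-1]}
        return self


patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05,
            72: 1.02, 84: 1.01, 96: 1.005, 108: 1.0025}
estimator = ToyDevelopmentConstant(patterns=patterns, style="ldf")
print(hasattr(estimator, "ldf_"))  # no fitted attribute before fit

# Assumed development ages of the triangle being fitted (illustrative).
estimator.fit([12, 24, 36, 48, 60, 72, 84])
print(estimator.ldf_)
```

The same nine-age patterns dict would yield a different ldf_ for a triangle with a different number of development ages, which is why the attribute cannot be computed at init.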
While violating conventions seems innocuous in this case, any violation should go through extreme scrutiny. I think bending the rules of a framework can quickly lead to instability in software.
So does that mean that the docstring for cl.DevelopmentConstant() is not correct? There should be no attributes for .ldf_ and .cdf_ since "parameters accessible pre-fit are hyperparameters and should not be validated or altered at init."
Looking at scikit-learn, such as this example, they consider hyperparameters as class parameters and fitted parameters as class attributes.
The more I work on this, the more I feel like this method should be coded as a pattern instead of an IBNR model.
Let me try to summarize the two case outstanding methods.
The first CaseOutstanding() method takes in two triangles, paid and reported, and goes through some algorithms to calculate case-to-prior-case LDFs and paid-to-prior-case LDFs. It then develops the case triangle as well as the adjusted paid triangle to get to ultimates. This model then returns an implied LDF for paid, which needs to be used in conjunction with the Chainladder() method to get to the ultimate.
The second case outstanding method takes in two arrays of patterns (paid and reported) and then returns a third pattern, which needs to be used in conjunction with the Chainladder() model to get to the ultimate.
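As a rough sketch of how a third pattern can fall out of the two inputs: if U is the ultimate, r the reported CDF and p the paid CDF at a given age, then paid is U/p, reported is U/r, case outstanding is U/r - U/p, and unpaid is U - U/p. Dividing unpaid by case outstanding gives r(p - 1)/(p - r). That is plain algebra from the CDF definitions; I believe it matches the factor Friedland uses in the second approach, but it should be checked against the text. The function name and the example CDFs below are hypothetical.

```python
def case_to_unpaid_factors(reported_cdf, paid_cdf):
    """Hypothetical helper: factor to apply to case outstanding to get unpaid.

    Both arguments map development age -> CDF to ultimate.
    Derived from: unpaid / case = r * (p - 1) / (p - r).
    """
    factors = {}
    for age, r in reported_cdf.items():
        p = paid_cdf[age]
        if p <= r:
            # No case outstanding is implied when paid CDF <= reported CDF,
            # so the ratio is undefined at this age.
            raise ValueError(f"paid CDF must exceed reported CDF at age {age}")
        factors[age] = r * (p - 1.0) / (p - r)
    return factors


# Illustrative CDFs only, not taken from Friedland or any sample triangle.
reported_cdf = {12: 1.10, 24: 1.04}
paid_cdf = {12: 1.50, 24: 1.20}
print(case_to_unpaid_factors(reported_cdf, paid_cdf))
```

With r = 1.10 and p = 1.50, an ultimate of 100 implies paid of about 66.67, case of about 24.24 and unpaid of about 33.33, so the factor at age 12 comes out to 1.375.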
If the first model is implemented as a pattern adjustment, why shouldn't the second? In fact, I think the first method should be implemented as an IBNR model, and the second should be implemented as a pattern method.
There is a technical problem though. If the second method is implemented as a pattern, the returned pattern used with chainladder will estimate everything correctly except the first origin period. This is because I believe chainladder assumes the last LDF to ultimate to be 1.000; if it is not 1.000, is a TailConstant() expected to be used instead of supplying it through the LDF/CDF pattern?
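A small numeric sketch of the first-origin problem, in plain Python rather than chainladder itself. The function ultimates and all numbers are hypothetical, and this is not a description of chainladder's internals; it only shows that when the factor beyond the last observed age is taken as 1.000, the oldest origin's ultimate equals its latest value, and that a separate tail factor changes it.

```python
def ultimates(latest, ldfs, tail=1.0):
    """Hypothetical chain-ladder projection from a latest diagonal.

    latest: latest observed values, oldest origin first.
    ldfs:   age-to-age factors within the triangle, youngest age first,
            so len(ldfs) == len(latest) - 1.
    tail:   factor from the last observed age to ultimate.
    """
    # Build CDFs from the oldest origin (tail only) to the youngest.
    cdfs = [tail]
    for factor in reversed(ldfs):
        cdfs.append(cdfs[-1] * factor)
    return [value * cdf for value, cdf in zip(latest, cdfs)]


latest = [1000.0, 900.0, 800.0]   # illustrative latest diagonal
ldfs = [1.5, 1.2]                 # illustrative 12-24 and 24-36 factors

no_tail = ultimates(latest, ldfs)             # tail assumed to be 1.000
with_tail = ultimates(latest, ldfs, tail=1.05)
print(no_tail[0], with_tail[0])  # only the tail moves the oldest origin
```

Every origin is scaled by the tail, but the oldest origin is the only one whose development comes from the tail alone, which is why a missing tail shows up there first.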