
CaseOutstanding()'s Approach 2 as Proposed by Friedland (2010)

Open kennethshsu opened this issue 2 years ago • 9 comments

The second issue in #159: the CaseOutstanding() method can only handle "Approach 1", as demonstrated by Friedland, page 265 (or 271 by count).

We are missing Approach 2, as discussed and demonstrated on page 268 (or 274 by count).

kennethshsu avatar Mar 22 '22 01:03 kennethshsu

Is there a way to get the values out of cl.DevelopmentConstant()?

import chainladder as cl

# `triangle` is assumed to be an existing cl.Triangle
patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02}
triangle_custompattern = cl.DevelopmentConstant(
    patterns=patterns, style="ldf"
).fit_transform(triangle)
triangle_custompattern.ldf_

I want to get the ldf_ or cdf_ from the pattern passed to cl.DevelopmentConstant(), but I think the only way to get something back is to do a fit_transform()?

kennethshsu avatar May 04 '22 04:05 kennethshsu

You can ask for the hyperparameter back from the unfitted estimator (triangle_custompattern.patterns), but I'm sure that's not what you're looking for. Can you elaborate?

jbogaardt avatar May 04 '22 11:05 jbogaardt

I actually think .patterns works, thanks!

kennethshsu avatar May 04 '22 23:05 kennethshsu

The more I think about this, the more I feel like I shouldn't be using .patterns. Instead, cl.DevelopmentConstant() should have .ldf_ and .cdf_ attributes that take .patterns and style into account.

Is there a reason why we only have .ldf_ and .cdf_ after fit(...)-ing a triangle object?

Does it make sense to have a new function cl.DevelopmentConstant() that returns .ldf_ and .cdf_ considering .patterns and .style?
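For illustration, the conversion being asked about needs no triangle at all. The sketch below is plain Python and does not use chainladder. The helper name ldf_to_cdf is hypothetical, and it assumes the dictionary keys are development ages, the values are age-to-age factors, and the factor at the last age carries development to ultimate.

```python
from itertools import accumulate
from operator import mul


def ldf_to_cdf(patterns):
    """Hypothetical helper: turn age-to-age factors into age-to-ultimate factors."""
    ages = sorted(patterns)
    # Walk from the oldest age backwards so that each age receives the
    # product of its own factor and every later factor.
    reversed_cdfs = accumulate((patterns[a] for a in reversed(ages)), mul)
    return dict(zip(reversed(ages), reversed_cdfs))


patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02}
cdf = ldf_to_cdf(patterns)
for age in sorted(cdf):
    print(age, round(cdf[age], 5))
```

A helper like this only restates the hyperparameter in another form; it knows nothing about which ages a particular triangle actually contains.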

kennethshsu avatar May 07 '22 18:05 kennethshsu

In fact, from the docstrings, I think the intention was that we wanted .ldf_ and .cdf_?

kennethshsu avatar May 07 '22 19:05 kennethshsu

Is there a reason why we only have .ldf_ and .cdf_ after fit(...)-ing a triangle object?

Yes. Per scikit-learn conventions, it is expected that parameters with trailing _ are not to be set inside the init method. All and only the public attributes set by fit have a trailing _.

Also per scikit-learn conventions, parameters accessible pre-fit are hyperparameters and should not be validated or altered at init. - source
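A minimal sketch of that convention, using a toy class rather than anything from chainladder or scikit-learn. The class name ConstantPattern and its fit(ages) signature are made up for illustration.

```python
class ConstantPattern:
    """Hypothetical toy estimator, not part of chainladder or scikit-learn."""

    def __init__(self, patterns, style="ldf"):
        # Hyperparameters are stored untouched: no validation, no derived values.
        self.patterns = patterns
        self.style = style

    def fit(self, ages):
        # Fitted attributes carry a trailing underscore and exist only after fit().
        self.ldf_ = {age: self.patterns[age] for age in ages}
        return self


est = ConstantPattern({12: 2.0, 24: 1.25})
print(est.patterns)           # hyperparameter, available before fit
print(hasattr(est, "ldf_"))   # False until fit() is called
est.fit([12, 24])
print(est.ldf_)
```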

Within chainladder itself, the creation of ldf_ and cdf_ does in fact rely on the triangle X. In the example below, we supply many more ages than are necessary for the triangle being fitted, and the estimator prunes the ldf_ to fit the triangle:

import chainladder as cl
triangle = cl.load_sample('ukmotor')
patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05, 72: 1.02, 84: 1.01, 96: 1.005, 108: 1.0025}
triangle_custompattern = cl.DevelopmentConstant(
    patterns=patterns, style="ldf"
).fit_transform(triangle)

print(triangle_custompattern.ldf_)
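The pruning idea can be pictured with a plain-Python sketch. It does not reproduce chainladder's internals: prune_patterns is a hypothetical helper, and the triangle's development ages (12 through 72) are invented for illustration rather than taken from the ukmotor sample.

```python
def prune_patterns(patterns, triangle_ages):
    """Hypothetical helper: keep only the factors whose age appears in the triangle."""
    wanted = set(triangle_ages)
    return {age: factor for age, factor in patterns.items() if age in wanted}


patterns = {12: 2, 24: 1.25, 36: 1.1, 48: 1.08, 60: 1.05,
            72: 1.02, 84: 1.01, 96: 1.005, 108: 1.0025}
triangle_ages = range(12, 73, 12)  # assumed ages, for illustration only

pruned = prune_patterns(patterns, triangle_ages)
print(pruned)
```

The point is only that the result depends on the triangle as well as on the hyperparameter, so it cannot be known at init.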

While violating conventions seems innocuous in this case, any violation should go through extreme scrutiny. I think bending the rules of a framework can quickly lead to instability in software.

jbogaardt avatar May 07 '22 20:05 jbogaardt

So does that mean that the docstring for cl.DevelopmentConstant() is not correct? There should be no attributes for .ldf_ and .cdf_ since "parameters accessible pre-fit are hyperparameters and should not be validated or altered at init."

kennethshsu avatar May 07 '22 21:05 kennethshsu

Looking at scikit-learn, such as this example, they consider hyperparameters as class parameters and fitted parameters as class attributes.

jbogaardt avatar May 07 '22 21:05 jbogaardt

The more I work on this, the more I feel like this method should be coded as a pattern instead of an IBNR model.

Let me try to summarize the two case outstanding methods.

The first CaseOutstanding() method takes in two triangles, paid and reported, and goes through some algorithms to calculate case-to-prior-case LDFs and paid-to-prior-case LDFs, and it develops the case triangle as well as the adjusted paid triangle to get to ultimates. This model then returns an implied LDF for paid, which needs to be used in conjunction with the Chainladder() method to get to the ultimate.

The second case outstanding method takes in two arrays of patterns (paid and reported) and then returns a third pattern, which needs to be used in conjunction with the Chainladder() model to get to the ultimate.
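As a rough illustration of how a third pattern can fall out of two input patterns: if paid and reported losses at a given age each equal ultimate divided by the corresponding CDF, then the ratio of unpaid loss to case outstanding works out to R(P - 1)/(P - R), where P and R are the paid and reported CDFs. The sketch below only encodes that algebra. The function name and the sample CDFs are made up, and it is not claimed to be the exact calculation in Friedland's Approach 2.

```python
def unpaid_to_case_factors(paid_cdf, reported_cdf):
    """Hypothetical helper: ratio of unpaid loss to case outstanding, by age.

    With ultimate U, paid = U / P and reported = U / R, so
    unpaid / case = (U - U/P) / (U/R - U/P) = R * (P - 1) / (P - R).
    """
    factors = {}
    for age, p in paid_cdf.items():
        r = reported_cdf[age]
        if p <= r:
            # Case outstanding would be zero or negative, so the ratio is undefined.
            raise ValueError(f"paid CDF must exceed reported CDF at age {age}")
        factors[age] = r * (p - 1) / (p - r)
    return factors


# Invented CDFs, for illustration only.
paid_cdf = {12: 3.0, 24: 1.5, 36: 1.2}
reported_cdf = {12: 1.5, 24: 1.1, 36: 1.02}

factors = unpaid_to_case_factors(paid_cdf, reported_cdf)
for age in sorted(factors):
    print(age, round(factors[age], 4))
```

Multiplying case outstanding by such a factor gives an unpaid estimate directly, which is one reason a result of this kind looks more like a pattern than like a full IBNR model.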

If the first model is implemented as a pattern adjustment, why shouldn't the second? In fact, I think the first method should be implemented as an IBNR model, and the second should be implemented as a pattern method.

There is a technical problem, though. If the second method is implemented as a pattern, the returned pattern used with Chainladder() will estimate everything correctly except the first origin period. I believe this is because chainladder assumes the last LDF to ultimate is 1.000; if it is not 1.000, is a TailConstant() expected to be used instead of the LDF/CDF?

kennethshsu avatar May 11 '22 03:05 kennethshsu