
Output column names for sparse output

Open seanv507 opened this issue 2 years ago • 1 comments

The dataframe output of a model matrix has column names [Intercept, b[T.a], ...].

When one specifies output="sparse", the column identifiers are not available (the output is a regular scipy sparse matrix).

They would be needed, e.g., to identify coefficient names.

seanv507 avatar Jun 13 '22 13:06 seanv507

Hi! The column names are available on the (wrapped) sparse matrix using: <output>.model_spec.feature_names. This isn't thoroughly documented mainly because I need to review the API, which I will do soon. I'll leave this here as a reminder to add documentation about this!
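As a rough sketch of how that lookup could feed the coefficient-naming use case above: the helper below reads the names off `model_spec` and pairs them with coefficients. The helper name `sparse_column_names`, the stand-in object `X`, and the coefficient values are all made up for illustration, and `column_names` is only an assumed alternative spelling of the attribute; `feature_names` is the one named in this thread. Check which attribute your installed version actually exposes.

```python
from types import SimpleNamespace


def sparse_column_names(matrix):
    """Return the column names recorded on a wrapped sparse model matrix."""
    spec = getattr(matrix, "model_spec", None)
    if spec is None:
        raise AttributeError("matrix has no model_spec; was it created by formulaic?")
    # `feature_names` is the attribute named in this thread; `column_names`
    # is an assumed alternative spelling, tried first.
    for attr in ("column_names", "feature_names"):
        names = getattr(spec, attr, None)
        if names is not None:
            return list(names)
    raise AttributeError("model_spec exposes no column names")


# Stand-in for the real call, which would look something like:
#   X = formulaic.model_matrix("b", df, output="sparse")
X = SimpleNamespace(model_spec=SimpleNamespace(feature_names=["Intercept", "b[T.a]"]))

coefficients = [0.5, -1.25]  # made-up values, e.g. from a fitted linear model
named = dict(zip(sparse_column_names(X), coefficients))
print(named)
```

With a real sparse model matrix, the same helper should work unchanged as long as the wrapper carries a `model_spec`; a plain scipy matrix that has lost the wrapper would raise the first `AttributeError`.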

matthewwardrop avatar Jun 20 '22 17:06 matthewwardrop

@matthewwardrop Hi, I would be keen to contribute to formulaic. Would this be a good first issue or is there something else you can suggest?

adamkells avatar Nov 25 '22 10:11 adamkells

Hi @adamkells! Thanks for your interest and willingness to contribute!

This particular issue has been resolved, though not yet documented, so it is perhaps not the best issue to contribute to.

Hmmm... Perhaps your best bet is to contribute a new transformation? Such work is orthogonal to framework improvements. Perhaps the missing cr, ce or te basis transforms? I've been meaning to get around to implementing them, but haven't had much of a chance yet.

matthewwardrop avatar Nov 27 '22 23:11 matthewwardrop

Hi @matthewwardrop, sounds good! I'll take a closer look at the code and open a PR later to discuss this further.

adamkells avatar Nov 28 '22 15:11 adamkells