bonsai
bonsai copied to clipboard
added support for case_weights
I noticed there wasn't much progress on integrating case_weights into lightgbm through bonsai so I've taken a crack at adding it. This is in reference to 65.
Thanks for the review @simonpcouch. Note, I am also working on a method that allows some ability to customize validation data beyond just specifying a proportion of training data. I find this quite necessary for my work (since my data are not entirely independent - hence needing case weights as well), but I don't know if its robust enough to include in Bonsai proper.
Thanks again for your work here. Given the lack of movement on this PR, I'm going to go ahead and close. :)
I have found a need for this feature recently, so I may try to resurrect this pull request and finish cleaning it up sometime this week. Anything I should know besides your comments above, @simonpcouch ?
That'd be wonderful, @joranE! Conceptually, the implementation here looks solid—following the current review comments would put this PR in a good spot.
Sorry, this fell off my radar. I can finish addressing the comments next week, if that works for you.
Get Outlook for Androidhttps://aka.ms/AAb9ysg
From: Simon P. Couch @.> Sent: Monday, April 8, 2024 9:31:10 AM To: tidymodels/bonsai @.> Cc: Patrick Schaefer @.>; Author @.> Subject: Re: [tidymodels/bonsai] added support for case_weights (PR #72)
That'd be wonderful, @joranEhttps://github.com/joranE! Conceptually, the implementation here looks solid—following the current review comments would put this PR in a good spot.
— Reply to this email directly, view it on GitHubhttps://github.com/tidymodels/bonsai/pull/72#issuecomment-2042767586, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYA7SXU5VRFVBVDIZ322P3Y4KLZ5AVCNFSM6AAAAABAUFJUD6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBSG43DONJYGY. You are receiving this because you authored the thread.Message ID: @.***>
@p-schaefer Sounds good, I'll let you tackle it! If you find yourself pulled away from it again for any reason, just let me know.
I've fixed the spacing issues, and dealt with the intermediate file from testthat. Should I submit a new PR?
You're welcome to push to the same branch! I'll reopen this one.
Sorry, forgot to rerun the tests and checks. Looks good to me now, let me know if there is anything else.
Thanks for closing the loop on this @p-schaefer! Now I just need to figure out if there's an easy way to tweak this interface to allow for the use of the linear_tree
parameter for piecewise linear trees, which "naturally" needs to be passed to lgb.Dataset()
as well, instead of lgb.train()
. 🙄
No problem. It might be worth opening a new issue to discuss how to deal with linear_tree
(if you want some help). I think we can probably come up with a suitably tidy workflow for dealing with arguments that should be passed to lgb.Dataset()
vs lgb.train()
. I've been thinking of playing around with linear_tree
as well.
I added a new issue here.
Test failures are unrelated, feel free to ignore. :)
@simonpcouch Was there anything else you need from me?
Nice work, @p-schaefer! :)