Passing multiple inputs (scalar and columns) to the parametrized_input decorator
Is your feature request related to a problem? Please describe.
I would like to pass multiple inputs (scalars and columns) to the parametrized_input decorator.
Describe the solution you'd like
I would like to create a series of rules for different columns, with all parameters in one place for easier management and readability. An example of a rule is whether any value in the column has exceeded a threshold value; however, this could be extended to include any numeric comparison such as <, >, = or combinations thereof, e.g. between value_1 and value_2 (inclusive).
import pandas as pd
from hamilton.function_modifiers import parametrized_input
# Proposed usage (a sketch of the requested API, not the current signature).
INV_PARAMS = [
    # (input var, (output var, threshold value, description of new output))
    ('inventory', ('inventory_geq_1000', 1000, 'inventory greater than or equal to 1000')),
    ('inventory', ('inventory_leq_215', 215, 'inventory less than or equal to 215')),
]

@parametrized_input(parameter='inventory', assigned_inputs=INV_PARAMS)
def inventory_geq(inventory: pd.Series) -> pd.Series:
    pass
Thanks so much!
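To make the request concrete, here is a minimal, library-free sketch of "all rules in one place". It is not Hamilton code: `Rule`, `RULES`, `evaluate_rules`, and the `returns` column are hypothetical names introduced for illustration, and plain lists stand in for pd.Series columns. Each rule reports whether any value in its column satisfies the comparison.

```python
import operator
from typing import Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    """One row of the rule table (hypothetical structure, for illustration)."""
    column: str                               # input column the rule reads
    compare: Callable[[float, float], bool]   # e.g. operator.ge, operator.le
    threshold: float                          # scalar the values are compared to
    description: str                          # human-readable meaning of the output


# All rules live in one table, keyed by output name.
RULES: Dict[str, Rule] = {
    "inventory_geq_1000": Rule("inventory", operator.ge, 1000,
                               "inventory greater than or equal to 1000"),
    "inventory_leq_215": Rule("inventory", operator.le, 215,
                              "inventory less than or equal to 215"),
    # Hypothetical second column, to show the table is not tied to one input.
    "returns_gt_50": Rule("returns", operator.gt, 50,
                          "returns greater than 50"),
}


def evaluate_rules(columns: Dict[str, List[float]],
                   rules: Dict[str, Rule]) -> Dict[str, bool]:
    """Return, per rule, whether any value in its column satisfies it."""
    return {
        name: any(rule.compare(v, rule.threshold) for v in columns[rule.column])
        for name, rule in rules.items()
    }


columns = {"inventory": [300, 450, 1200], "returns": [5, 12, 40]}
flags = evaluate_rules(columns, RULES)
```

With the sample data above, only the `inventory_geq_1000` rule fires, because 1200 is the only value that crosses any threshold.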
So, I think this makes sense. The question is whether you want to do this just on the inventory column, or on a bunch of different columns (e.g. different types of inventory). If you do this on just the inventory column, you could do something like this:
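import pandas as pd
from hamilton.function_modifiers import parametrized

# Maps (output name, output description) -> value bound to `threshold`.
INV_PARAMS = {
    ('inventory_geq_1000', 'inventory greater than or equal to 1000'): 1000,
    ('inventory_leq_215', 'inventory less than or equal to 215'): 215,
}

@parametrized(parameter='threshold', assigned_output=INV_PARAMS)
def inventory_geq(inventory: pd.Series, threshold: int) -> pd.Series:
    # Keep only the rows whose value exceeds the threshold.
    # Note: every output above applies this same comparison.
    return inventory[inventory > threshold]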
This works if you have one column inventory that you want to filter. If you have multiple columns, then you could create the above for each one (a little verbose, but there's something nice about being specific/making things readable), or we could look at combining @parametrized with @parametrized_input as you suggested. Am I understanding you correctly?
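For readers unfamiliar with the decorator, the following is a rough, library-free sketch of what the expansion amounts to: one output function per entry in the mapping, each with `threshold` already bound. This is not Hamilton's implementation. `expand` is a hypothetical helper, plain lists stand in for pd.Series, and the sketch uses `>=` for both outputs (renaming the second one to `inventory_geq_215`), since a single function body applies a single comparison.

```python
from typing import Any, Callable, Dict, List, Tuple

# (output name, description) -> value to bind to the `threshold` parameter.
INV_PARAMS: Dict[Tuple[str, str], int] = {
    ("inventory_geq_1000", "inventory greater than or equal to 1000"): 1000,
    ("inventory_geq_215", "inventory greater than or equal to 215"): 215,
}


def inventory_geq(inventory: List[int], threshold: int) -> List[int]:
    """Keep the values that are greater than or equal to the threshold."""
    return [v for v in inventory if v >= threshold]


def expand(fn: Callable[..., Any],
           parameter: str,
           assigned_output: Dict[Tuple[str, str], Any]) -> Dict[str, Callable]:
    """Hypothetical helper: build one named function per mapping entry."""
    nodes: Dict[str, Callable] = {}
    for (output_name, doc), bound_value in assigned_output.items():
        # The default argument freezes the value for this iteration.
        def node(inventory, _bound=bound_value):
            return fn(inventory, **{parameter: _bound})
        node.__name__ = output_name
        node.__doc__ = doc
        nodes[output_name] = node
    return nodes


nodes = expand(inventory_geq, "threshold", INV_PARAMS)
inventory = [100, 215, 999, 1000, 2500]
above_1000 = nodes["inventory_geq_1000"](inventory)
above_215 = nodes["inventory_geq_215"](inventory)
```

Each generated function takes only `inventory`, which mirrors how the parametrized outputs no longer expose `threshold` as an input.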
Thanks for clarifying my example. I agree that it is a bit verbose but definitely gets the job done.
Can I naturally extend this example to use 2 threshold values by passing an array of threshold values instead of just one int? This would be good for identifying column values between the two threshold values.
-- Regards,
Lorre Atlan, PhD
Yep you can! We're not opinionated on the datatypes at all actually, except in the cases in which we've made it easier to use pandas dataframes. Config items can be anything.
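As a sketch of the two-threshold case discussed above, the bound value can be a `(low, high)` pair instead of a single int. The names `BETWEEN_PARAMS` and `inventory_between` are hypothetical, and plain lists stand in for pd.Series so the example runs without any dependencies; the loop at the end only mimics binding each value to the `bounds` parameter.

```python
from typing import Dict, List, Tuple

# (output name, description) -> (low, high) pair bound to `bounds`.
BETWEEN_PARAMS: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("inventory_between_215_1000",
     "inventory between 215 and 1000 (inclusive)"): (215, 1000),
}


def inventory_between(inventory: List[int], bounds: Tuple[int, int]) -> List[int]:
    """Keep the values that fall within [low, high], both ends included."""
    low, high = bounds
    return [v for v in inventory if low <= v <= high]


inventory = [100, 215, 999, 1000, 2500]
# Mimic the parametrization by calling the function once per mapping entry.
results = {
    name: inventory_between(inventory, bounds)
    for (name, _doc), bounds in BETWEEN_PARAMS.items()
}
```

With a real pd.Series, the body could instead be `inventory[inventory.between(low, high)]`, since `Series.between` includes both endpoints by default.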
@elijahbenizzy should this issue be closed? Or an update added?
@skrawcz yeah, I think so. @latlan1, would love to get a sense of how the new changes to @parameterize would help you: https://github.com/stitchfix/hamilton/blob/main/decorators.md#parameterize.
OK, so I'm pretty sure this is now solved! @parameterize is far more sophisticated. @latlan1 let me know if there's something that hasn't been covered and we can reopen, otherwise I'm closing this!
Looks good, thank you!
On Sat, Oct 29, 2022 at 12:38 PM Elijah ben Izzy @.***> wrote:
Closed #25 https://github.com/stitchfix/hamilton/issues/25 as completed.