pydfs-lineup-optimizer

Conditional Stacking/Group Rule

Open asardinha opened this issue 5 years ago • 5 comments

I've thought a lot about correlation lately, and how it can help in building lineups.

With that, I was wondering if anyone could assist in the development of additional rules and "conditionals" for the optimizer.

Here are some features that I had in mind:

Conditional Stacking: IF player X appears in a lineup, INCLUDE a minimum of 1 and a maximum of 3 of players A, B, C. (For example: IF Terry McLaurin is in the lineup, play AT LEAST 1 and at most 3 of Amari Cooper, CeeDee Lamb, and Dalton Schultz.)

or

(If Anthony Davis is in a lineup, play AT LEAST 1 player from a Kemba Walker/Jayson Tatum group)
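To make the rule concrete, here is a minimal sketch of the conditional check written as a plain-Python filter over finished lineups. It is not an optimizer rule and does not use the pydfs-lineup-optimizer API; the function name `satisfies_conditional_stack` and the representation of a lineup as a list of player names are assumptions made for illustration.

```python
def satisfies_conditional_stack(lineup, trigger, group,
                                min_from_group=1, max_from_group=3):
    """Check the rule: IF trigger is in the lineup, THEN the lineup must
    contain between min_from_group and max_from_group players from group.

    `lineup` and `group` are lists of player names (an assumption of this sketch).
    """
    names = set(lineup)
    if trigger not in names:
        # The rule only applies when the trigger player is present.
        return True
    count = len(names & set(group))
    return min_from_group <= count <= max_from_group


group = ["Amari Cooper", "CeeDee Lamb", "Dalton Schultz"]
lineup = ["Terry McLaurin", "Amari Cooper", "Justin Herbert"]
print(satisfies_conditional_stack(lineup, "Terry McLaurin", group))
```

Inside an integer program the same idea would be a linear constraint on binary pick variables (the sum of the group's variables must be at least `min_from_group` times the trigger's variable), but filtering after the fact is enough to show the intended behavior.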

Additional Stacking Rule

If I remember correctly, several optimizers on the market have rules like "Limit player to 1 from teams, unless stacked with a QB." I would assume that an additional rule will need to be added to the optimizer's features?

Also, specifically in NFL, having a second stack is sometimes beneficial. Could we add a stack rule to include something like

optimizer.add_stack(PositionsStack(['QB', ('WR', 'TE')])) and then optimizer.add_stack(PositionsStack(['WR', ('RB', 'TE')])) from opposing teams (independent of the first called stack).

Thanks again everyone, I hope this can start a conversation.

asardinha, Nov 25 '20 13:11

I would recommend doing this through projections, rather than optimizer rules. There is no requirement that you generate lineups from one set of projections, so you can use a variety of projection scenarios to generate multiple lineups.

In your example above, you would generate a set of projections that assumes a QB has a good game and that also reflects the likelihood that the opposing team's output increases because of the game script. Then optimize on those projections. You can do this for any number of scenarios.

import numpy as np
import pandas as pd

lineups = []
o = get_optimizer(...)

# assume projections csv has the following columns:
# player, pos, team, salary, fppg
df = pd.read_csv('myprojections.csv')

# go through a variety of scenarios
# scenario 1: in LAC vs. NYJ, the LAC QB plays well
# also assume WR & opp WR play better as well

# specify a range of assumptions of how the players
# could deviate from your median projections
qb_low_end = 1.1
qb_high_end = 1.6
wr_low_end = 1.1
wr_high_end = 1.6
opp_wr_low_end = 1
opp_wr_high_end = 1.4

# randomly draw the multipliers for the relevant groups
qb_multiplier = np.random.uniform(qb_low_end, qb_high_end)
qb_wr_multipliers = np.random.uniform(wr_low_end, wr_high_end, size=3)
qb_opp_wr_multipliers = np.random.uniform(opp_wr_low_end, opp_wr_high_end, size=3)

# apply multipliers to projections
scen1 = df.copy()
is_qb = scen1.player == 'Justin Herbert'
scen1.loc[is_qb, 'fppg'] *= qb_multiplier
# ... apply qb_wr_multipliers / qb_opp_wr_multipliers to the WR / opp WR rows the same way ...

# now create player objects from dataframe
# row_to_player is a function you write to create Player objects
players = scen1.apply(row_to_player, axis=1)
o.load_players(players)

# n_lineups: how many lineups to generate for this scenario (you define it)
for lineup in o.optimize(n_lineups):
    lineups.append(lineup)

sansbacon, Nov 25 '20 23:11

@sansbacon This is interesting. Could it be applied in the same way if the csv uses percentile projections instead of deviations? Could these be added to the pandas data frame?

As an extra step, it might be interesting to incorporate correlation coefficients for each player? For example if Matthew Stafford has a good game, typically either of Kenny Golladay or Marvin Jones has a good game (-.453 correlation to one another)

asardinha, Nov 26 '20 17:11

Yes, I think correlation coefficients would be the way to estimate the range of outcomes for the stacked players. So you would start with an optimistic projection for a game or a QB, and then base your projections for the remaining players on that original optimistic projection, which will tend to push players from that game into optimal lineups for that projection set.
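As a rough sketch of how a correlation coefficient could drive the draws, the snippet below generates two projection multipliers whose underlying normal draws have correlation `rho`. The function name `correlated_multipliers`, the normal distribution, and the standard deviations are all assumptions for illustration; only the -0.453 figure comes from this thread.

```python
import math
import random


def correlated_multipliers(rho, first_sd=0.25, second_sd=0.30, rng=random):
    """Draw two projection multipliers centered on 1.0 whose underlying
    standard-normal draws have correlation rho (-1 <= rho <= 1).

    The spreads (first_sd, second_sd) are placeholder assumptions.
    """
    z_first = rng.gauss(0.0, 1.0)
    noise = rng.gauss(0.0, 1.0)
    # Mix the first draw with independent noise to get correlation rho.
    z_second = rho * z_first + math.sqrt(1.0 - rho ** 2) * noise
    # Floor at zero so a scaled projection can never go negative.
    first = max(0.0, 1.0 + first_sd * z_first)
    second = max(0.0, 1.0 + second_sd * z_second)
    return first, second


# Example: the two receivers mentioned above, using the quoted -0.453.
golladay_mult, jones_mult = correlated_multipliers(-0.453)
print(golladay_mult, jones_mult)
```

Each multiplier would then be applied to that player's median `fppg` before loading the projections into the optimizer.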

The key insight here is that it makes more sense to have different projection sets based on scenarios than it does to make the optimizer insanely complex to account for a variety of rules. Aggregate lineups over many different sets of projections, which will keep the optimizer simple.
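A minimal sketch of the aggregation step might look like the following. Here `optimize` is a hypothetical callable you would write yourself (for example, a wrapper that loads one projection set into the optimizer and yields lineups as lists of player names); `top_two` is a toy stand-in so the example runs on its own.

```python
from collections import Counter


def aggregate_lineups(projection_sets, optimize, n_per_set=1):
    """Run `optimize` once per projection set and count how often each
    distinct lineup shows up across all scenarios."""
    counts = Counter()
    for projections in projection_sets:
        for lineup in optimize(projections, n_per_set):
            # frozenset makes the lineup hashable and order-independent.
            counts[frozenset(lineup)] += 1
    return counts


def top_two(projections, n):
    """Toy stand-in for a real optimizer: yield the two highest-projected
    players, ignoring salary and positions. `n` is unused in this toy."""
    ranked = sorted(projections, key=projections.get, reverse=True)
    yield ranked[:2]


scenarios = [
    {"A": 10, "B": 8, "C": 5},
    {"A": 10, "B": 4, "C": 9},
    {"A": 9, "B": 8, "C": 1},
]
for lineup, times in aggregate_lineups(scenarios, top_two).most_common():
    print(sorted(lineup), times)
```

Lineups that keep appearing across scenarios are the ones that hold up under several sets of assumptions, and the optimizer itself never needs to know about the scenarios.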

sansbacon, Nov 30 '20 22:11

@sansbacon If I'm understanding correctly, it would get a bit tedious to write out scenarios for different players/teams, right? Is there a way to generalize this with the randomness in the optimizer?

For instance, if you randomize the projections while running the optimizer, Justin Herbert hitting an xth-percentile outcome would affect Keenan Allen's or Mike Williams' projections by y amount.

cdbmgrdn, Dec 07 '20 04:12

One way to do it on a general basis is as follows:

  • Draw team points scored from a distribution based on historical results given game total and spread
  • Draw QB points scored from a distribution based on historical results given team points and team/qb data
  • Draw other player points scored from a distribution based on historical results given team points, QB points scored
  • Have a corrective mechanism so the total of other player points scored is not too far out of whack with what the QB scored
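
The steps above can be sketched roughly as follows. Every number here (the normal distributions, the spreads, the QB and receiver shares, and the `max_ratio` cap) is a placeholder assumption rather than something fitted to historical results, and the function name `draw_team_scenario` is made up for illustration. The corrective step only caps the total from above.

```python
import random


def draw_team_scenario(implied_points, receiver_shares, rng=random, max_ratio=3.0):
    """Draw team points, then QB points given team points, then other
    players' points given QB points, with a cap as the corrective step.

    receiver_shares maps player name -> assumed mean share of QB points.
    """
    # 1. Team points around the implied total (placeholder spread).
    team_pts = max(0.0, rng.gauss(implied_points, 7.0))
    # 2. QB fantasy points conditional on team points (placeholder relation).
    qb_pts = max(0.0, rng.gauss(0.75 * team_pts, 4.0))
    # 3. Other players' fantasy points conditional on QB points.
    others = {
        name: max(0.0, rng.gauss(share * qb_pts, 3.0))
        for name, share in receiver_shares.items()
    }
    # 4. Corrective: rescale if the others' total outruns the QB too far.
    total = sum(others.values())
    cap = max_ratio * qb_pts
    if total > cap:
        scale = cap / total
        others = {name: pts * scale for name, pts in others.items()}
    return team_pts, qb_pts, others


shares = {"WR1": 0.8, "WR2": 0.6, "TE1": 0.4}
print(draw_team_scenario(24.0, shares))
```

In practice each `gauss` call would be replaced by a draw from whatever historical distribution you have for that step.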

If you don't want to get that far in the weeds, then I would construct a table of historical results that has columns for fantasy points for QB, WR1, WR2, RB1, TE1, etc. A simple way to determine who is WR1, etc. is to base it on either DFS salary or a rolling average of fantasy points - the WR1 has the highest WR salary or rolling average on the team, and so forth. I think you also would want to round fantasy points to the nearest integer. You could then calculate the probability of other-position fantasy points given QB fantasy points and use that for drawing values.
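
A minimal sketch of that table approach, assuming the history is just a list of (QB points, WR1 points) pairs: group the rounded WR1 results by rounded QB result, then sample from the matching bucket. The function names and the sample rows are made up for illustration, and falling back to the nearest observed QB score is my own assumption for handling empty buckets.

```python
import random
from collections import defaultdict


def build_conditional_table(history):
    """Group rounded WR1 points by rounded QB points.

    history: iterable of (qb_points, wr1_points) pairs from past games.
    """
    table = defaultdict(list)
    for qb_pts, wr1_pts in history:
        table[round(qb_pts)].append(round(wr1_pts))
    return table


def draw_wr1_points(table, qb_points, rng=random):
    """Sample WR1 points from the empirical distribution given QB points."""
    bucket = round(qb_points)
    if bucket not in table:
        # No history for this exact QB score: use the nearest observed one.
        bucket = min(table, key=lambda k: abs(k - bucket))
    # Choosing uniformly from the raw list weights each value by its frequency.
    return rng.choice(table[bucket])


# Made-up rows, only to show the shape of the data.
history = [(24.6, 18.2), (25.1, 9.7), (10.3, 4.4)]
table = build_conditional_table(history)
print(draw_wr1_points(table, 25.0))
```

The same table could carry WR2, RB1, TE1 and so on as extra columns, with one draw per position.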

sansbacon, Dec 07 '20 16:12