iterating over game-drives-plays-players vs. game.players

Open patmun opened this issue 8 years ago • 3 comments

Thanks for this great package!

This may not be an issue at all, just me not understanding.

I want to get all fumbles recovered by the defense during a game (the defense_frec stat).

game = nflgame.games(2016, week=1, away='MIN')[0]
print game
MIN (25) at TEN (16)

There were actually 2 fumbles recovered by the Vikings in this game.

trying:

for dd in game.drives:
    for pp in dd.plays:
        for pl in pp.players:
            if 'defense_frec' in pl.stats:
                print pl, pl.team, pl.stats

D.Hunter MIN OrderedDict([('defense_tds', 1), ('defense_frec_tds', 1), ('defense_frec_yds', 24), ('defense_frec', 1)])
A.Sendejo MIN OrderedDict([('defense_frec_yds', 2), ('defense_frec', 1)])

I get the expected result.

But if I iterate over game.players directly,

for pp in game.players:
    if 'defense_frec' in pp.stats:
        print pp, pp.team, pp.stats

then I get nothing. The players do have some stats, but the ones I'm expecting aren't there. What am I missing?

patmun avatar Sep 13 '16 17:09 patmun

You're not doing anything wrong per se; this is just a misunderstanding about what type of statistic defense_frec is.

While it does appear as a stat type within a play (your first code block demonstrates this), when you ran for pp in game.players you had nflgame do some aggregating of all of the plays and spit out a different set of stats recorded in the game. Whereas defense_frec is a valid stat type in a play, the aggregated collection of those recovered fumbles is captured under fumbles_trcv. (I honestly don't know how it is determined which stats are used, nor how those play stats get mapped to their aggregate counterparts.)

import nflgame

games = nflgame.games(year=2016, week=1, kind='REG', away='MIN')
for game in games:
    for pp in game.players:
        # if 'fumbles_trcv' in pp.stats:  # Maybe everyone that fumbles has `fumbles_trcv` initialized to 0?
        if pp.stats.get('fumbles_trcv', 0):
            print pp, pp.team, pp.stats
A.Sendejo MIN OrderedDict([(u'fumbles_trcv', 1), (u'fumbles_tot', 0), (u'fumbles_rcv', 0), (u'fumbles_yds', 2), (u'fumbles_lost', 0), (u'defense_ffum', 0), (u'defense_tkl', 4), (u'defense_int', 0), (u'defense_ast', 0), (u'defense_sk', 0)])
D.Hunter MIN OrderedDict([(u'fumbles_trcv', 1), (u'fumbles_tot', 0), (u'fumbles_rcv', 0), (u'fumbles_yds', 24), (u'fumbles_lost', 0), (u'defense_ffum', 0), (u'defense_tkl', 3), (u'defense_int', 0), (u'defense_ast', 0), (u'defense_sk', 1)])

But - and I hope you've made it this far! - you don't need to go through the cumbersome game > drive > play > player hierarchy to access play statistics. Forget all that. If you want all of the plays where a fumble was recovered, then you can do something much simpler:

import nflgame

games = nflgame.games(year=2016, week=1, kind='REG', away='MIN')
plays = nflgame.combine_plays(games)
for play in plays.filter(defense_frec=True):
    print play

    for player in play.players:
        print '\t', player, player.stats
    print '-'*79
(TEN, TEN 25, Q4, 1 and 10) (11:11) (Shotgun) M.Mariota FUMBLES (Aborted) at TEN 22, RECOVERED by MIN-D.Hunter at TEN 24. D.Hunter for 24 yards, TOUCHDOWN.
    M.Mariota OrderedDict([('rushing_att', 1), ('rushing_yds', 0), ('fumbles_rec_yds', -1), ('fumbles_tot', 1), ('fumbles_notforced', 1), ('fumbles_lost', 1)])
    D.Hunter OrderedDict([('defense_tds', 1), ('defense_frec_tds', 1), ('defense_frec_yds', 24), ('defense_frec', 1)])
-------------------------------------------------------------------------------
(TEN, TEN 42, Q4, 2 and 8) (9:48) (Shotgun) D.Murray up the middle to TEN 43 for 1 yard (L.Joseph). FUMBLES (L.Joseph), RECOVERED by MIN-A.Sendejo at TEN 48. A.Sendejo to TEN 46 for 2 yards (T.Lewan).
    A.Sendejo OrderedDict([('defense_frec_yds', 2), ('defense_frec', 1)])
    T.Lewan OrderedDict([('defense_tkl', 1)])
    D.Murray OrderedDict([('rushing_att', 1), ('rushing_yds', 1), ('fumbles_tot', 1), ('fumbles_forced', 1), ('fumbles_lost', 1)])
    L.Joseph OrderedDict([('defense_tkl', 1), ('defense_ffum', 1)])
-------------------------------------------------------------------------------
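The per-play stats printed above can also be rolled up into per-player totals by hand. The following is a minimal sketch that does not call nflgame at all: the `plays` list is a plain-Python copy of the two plays shown above, and the `aggregate` helper is a hypothetical name introduced only for illustration.

```python
from collections import Counter, defaultdict

# Play-level stats copied from the two fumble-recovery plays printed above,
# stored as (player name, stats dict) pairs so nflgame is not required.
plays = [
    [
        ("M.Mariota", {"rushing_att": 1, "rushing_yds": 0,
                       "fumbles_rec_yds": -1, "fumbles_tot": 1,
                       "fumbles_notforced": 1, "fumbles_lost": 1}),
        ("D.Hunter", {"defense_tds": 1, "defense_frec_tds": 1,
                      "defense_frec_yds": 24, "defense_frec": 1}),
    ],
    [
        ("A.Sendejo", {"defense_frec_yds": 2, "defense_frec": 1}),
        ("T.Lewan", {"defense_tkl": 1}),
        ("D.Murray", {"rushing_att": 1, "rushing_yds": 1, "fumbles_tot": 1,
                      "fumbles_forced": 1, "fumbles_lost": 1}),
        ("L.Joseph", {"defense_tkl": 1, "defense_ffum": 1}),
    ],
]


def aggregate(plays):
    """Sum every play-level stat per player (hypothetical helper)."""
    totals = defaultdict(Counter)
    for play in plays:
        for name, stats in play:
            # Counter.update adds the values instead of overwriting them.
            totals[name].update(stats)
    return totals


totals = aggregate(plays)

# Keep only the players with at least one defensive fumble recovery.
recoveries = {name: stats["defense_frec"]
              for name, stats in totals.items()
              if stats["defense_frec"] > 0}

for name in sorted(recoveries):
    print("%s %d" % (name, recoveries[name]))
```

Summing the play-level entries this way keeps the play-level name (defense_frec), so the totals can be checked directly against the per-play printout.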

ochawkeye avatar Sep 13 '16 17:09 ochawkeye

Thanks for the answer!

In order to wrap my head around this, I took a look at the json file for this game. It seems that there are two types of statistics in there. One set is for each of the players in a team and seems to correspond to the players' aggregate stats for the whole game. The fumbles_trcv comes from this set of stats.

The other set of stats is at the level of each play, and each entry has a statId which seems to be mapped to Python attributes through statmap.py. The statId for the players who recovered fumbles in their respective plays is 59, which is mapped to defense_frec in statmap.py.

So game.players iterates over the first set of stats, which explains why fumbles_trcv is found in there but not defense_frec. On the other hand, play.players returns the second set of stats, which contains defense_frec. Hope that makes sense?
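To make the two sets concrete, here is a rough sketch using a made-up, heavily simplified stand-in for the game JSON. The key names (`aggregate`, `plays`, `player`, `yards`), the two helper functions, and the "category_field" naming are all assumptions for illustration; only the pairing of statId 59 with defense_frec comes from the discussion above.

```python
# Hypothetical, simplified stand-in for the game JSON. The real file's
# layout and key names may well differ.
game_json = {
    # Set 1: per-player totals for the whole game, grouped by category.
    "aggregate": {
        "fumbles": {
            "A.Sendejo": {"trcv": 1, "yds": 2},
            "D.Hunter": {"trcv": 1, "yds": 24},
        },
    },
    # Set 2: per-play entries, each tagged with a numeric statId.
    "plays": [
        [{"player": "D.Hunter", "statId": 59, "yards": 24}],
        [{"player": "A.Sendejo", "statId": 59, "yards": 2}],
    ],
}

# Only this one pairing is taken from the thread; a real map has many more.
STAT_ID_TO_NAME = {59: "defense_frec"}


def game_level_stats(data):
    """Flatten set 1 into 'category_field' names (assumed naming scheme)."""
    out = {}
    for category, players in data["aggregate"].items():
        for player, fields in players.items():
            for field, value in fields.items():
                out.setdefault(player, {})[category + "_" + field] = value
    return out


def play_level_stats(data):
    """Count set 2 entries per player after translating statId to a name."""
    out = {}
    for play in data["plays"]:
        for entry in play:
            name = STAT_ID_TO_NAME.get(entry["statId"])
            if name is None:
                continue  # statId not covered by this toy map
            stats = out.setdefault(entry["player"], {})
            stats[name] = stats.get(name, 0) + 1
    return out


game_stats = game_level_stats(game_json)
play_stats = play_level_stats(game_json)

print(sorted(game_stats["D.Hunter"]))
print(sorted(play_stats["D.Hunter"]))
```

In this toy version the same recovery shows up as fumbles_trcv in the first view and as defense_frec in the second, which mirrors the difference between game.players and play.players described above.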

patmun avatar Sep 13 '16 20:09 patmun

Yeah, that does make sense. I was mistaking the aggregated data as something nflgame was assembling. In reality, it is already in the JSON and nflgame is just pulling it out.

ochawkeye avatar Sep 13 '16 20:09 ochawkeye