Running nfldb-update gives an error on database load
Hey there! Thanks for putting together this software package. Screen scraping is an undervalued datasource that is super time consuming so mad props!
When running nfdb-update from my root project dir using nfldb and a couple other dependencies (listed below is my requirements.txt file) and using Python 2.7 I get the following error and stack trace:
Traceback (most recent call last):
File "/Users/bytenel/workplace/fantasy-football/bin/nfldb-update", line 39, in <module>
nfldb.update.run(**vars(args))
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/update.py", line 535, in run
doit()
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/update.py", line 525, in doit
update_games(db, batch_size=batch_size)
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/update.py", line 397, in update_games
bulk_insert_game_data(cursor, scheduled, batch_size=batch_size)
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/update.py", line 195, in bulk_insert_game_data
do()
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/update.py", line 188, in do
nfldb.db._big_insert(cursor, table, bulk[table])
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nfldb/db.py", line 356, in _big_insert
% (table, insert_fields, values))
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/psycopg2/extras.py", line 249, in execute
return super(RealDictCursor, self).execute(query, vars)
psycopg2.IntegrityError: insert or update on table "play" violates foreign key constraint "play_pos_team_fkey"
DETAIL: Key (pos_team)=(JAX) is not present in table "team".
Is there something I'm doing wrong to update my database? My config.ini is listed below as well. I also got a stack trace earlier when things were running that is the following:
There were some errors during the download. Usually this is a
result of an HTTP request timing out, which means the
resulting "players.json" file is probably missing some data.
An appropriate solution is to re-run the script until there
are no more errors (or when the errors are problems on
NFL.com side.)
-------------------------------------------------------------------------------
Could not get player info from roster row:
<tr class="even">
<td>72</td>
<td style="text-align:left"> <a href="/player/charleslenojr./2550169/profile">Leno, Charles, Jr.</a></td>
<td>T</td>
<td>ACT</td>
<td> 6'3"</td>
<td>305</td>
<td>10/9/1991</td>
<td>4</td>
<td>Boise State</td></tr>
Exception:
Traceback (most recent call last):
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nflgame/update_players.py", line 419, in run
roster.append(meta_from_soup_row(team, row))
File "/Users/bytenel/workplace/fantasy-football/lib/python2.7/site-packages/nflgame/update_players.py", line 179, in meta_from_soup_row
last_name, first_name = map(lambda s: s.strip(), name.split(','))
ValueError: too many values to unpack
I'm not sure if that's related. It looked like 32/32 teams rosters were downloaded properly? The view that is created is just having some sort of schema issue. I also got some "Could not get GSIS id at" errors as well but I don't think that is related.
Config.ini
[pgsql]
# The time zone to use in the PostgreSQL session by default.
# You can see a list of available time zone names with the following query:
#
# select * from pg_timezone_names
#
# Here are some common ones:
#
# US/Alaska
# US/Samoa
# US/Hawaii
# US/Mountain
# US/Indiana-Starke
# US/Pacific
# US/Central
# US/Michigan
# US/Aleutian
# US/East-Indiana
# US/Arizona
# US/Eastern
# US/Pacific-New
#
# Datetimes are always stored as UTC, but this time zone setting
# can be used to change how datetimes are returned in SELECT queries.
timezone = US/Mountain
# The name of the PostgreSQL database containing NFL data.
database = nfldb
# The username and password for the database specified above.
# The user should have permission to add, delete and remove data.
# Also, the user should have permission to create/alter/drop tables,
# indexes and types.
user = nfldb
password = <redacted>
# The host of the PostgreSQL server. When using a local database, this should
# be set to "localhost".
host = localhost
# The port that the PostgreSQL server is running on. It is set to the
# default port of 5432.
port = 5432
requirements.txt
nflgame
nfldb
nflvid
JAX issue: see here and here
Too many commas in name issue: see here. There was a pull request in nflgame to resolve this that would need to be pulled to permanently fix this. The NFL tends to clean up these ", Jr." cases, but a fix on the nflgame side does seem to be in order.
Word rerunning now after that. Thanks for the tip. Any way I could submit a PR for fixing the jax issue? Should just be an alias for the team abrev's somewhere right? Also, is there something I can do to move that PR along for the update on double jr's? Might just make sense to trash records that crash things (anything malformed, etc). It's a pretty common data intake strategy.
Looks like it didn't error on load with JAX fix - I'd like to contribute a patch for that if I can just point me in the right direction