Fix the Lahman Database Scraping
This PR redirects the Lahman database download from a GitHub link that now returns a 404 to the Dropbox link found on baseball1.com.
In order to extract the data, py7zr was added to the requirements.
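For reviewers unfamiliar with py7zr, here is a minimal sketch of how a downloaded 7z archive could be unpacked and its CSV tables located. This is not the code in this PR: the function names `extract_7z_bytes` and `find_csv_files` are hypothetical, and it assumes the archive contains `.csv` files somewhere under its root.

```python
import io
from pathlib import Path
from typing import Dict


def extract_7z_bytes(archive_bytes: bytes, dest_dir: Path) -> None:
    """Unpack an in-memory 7z archive into dest_dir using py7zr."""
    # py7zr is third-party; it is imported lazily so that find_csv_files
    # below can be used without it installed.
    import py7zr

    with py7zr.SevenZipFile(io.BytesIO(archive_bytes), mode="r") as archive:
        archive.extractall(path=dest_dir)


def find_csv_files(dest_dir: Path) -> Dict[str, Path]:
    """Map each extracted CSV's lowercase file stem to its path.

    Searches recursively, since the archive layout is an assumption here.
    """
    return {p.stem.lower(): p for p in sorted(Path(dest_dir).rglob("*.csv"))}
```

Keying the lookup by lowercase stem is just one way to avoid depending on the exact folder structure or file-name casing inside the archive.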
Could this be addressed by moving the link to something in https://github.com/chadwickbureau/retrosheet?
Looks like there may be some overlapping files, but nothing that mimics Lahman's db.
What do y'all think about extracting the data and posting it to a repo on GitHub? Maybe even embedding it in pybaseball?
I have no issue with that (I assume there are no licensing issues with it). I'm happy to add a folder here or put the files on my own GitHub.
As of today, the Lahman DB has been donated to the Society for American Baseball Research (SABR). SABR is now responsible for hosting and maintaining the dataset.