[FBref] team_match_stats for teams with slash "/" in the name results in FileNotFoundError
Describe the bug
read_team_match_stats fails for teams with a slash ("/") in the name, such as Bodø/Glimt.
It tries to create the file matchlogs_Bodø/Glimt_2022_schedule.html, but the slash is treated as a path separator, so the path points into a directory matchlogs_Bodø that does not exist.
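To illustrate why the slash matters, here is a minimal sketch using only pathlib. The `filemask` string is an assumption inferred from the file name in the error message, not copied from the soccerdata source.

```python
from pathlib import PurePosixPath

# Directory taken from the error message in this report.
data_dir = PurePosixPath("/Users/user/soccerdata/data/FBref/historic")

# Assumed file mask, inferred from "matchlogs_Bodø/Glimt_2022_schedule.html".
filemask = "matchlogs_{}_{}_{}.html"

filepath = data_dir / filemask.format("Bodø/Glimt", 2022, "schedule")

# The "/" in the team name splits the intended file name into
# a directory component and a file component.
parent_name = filepath.parent.name  # "matchlogs_Bodø"
file_name = filepath.name           # "Glimt_2022_schedule.html"
print(parent_name, file_name)
```

Because the directory `matchlogs_Bodø` is never created, opening the file for writing raises FileNotFoundError.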
Affected scrapers
This affects the following scrapers:
- [ ] ClubElo
- [ ] ESPN
- [x] FBref
- [ ] FiveThirtyEight
- [ ] FotMob
- [ ] Match History
- [ ] SoFIFA
- [ ] Understat
- [ ] WhoScored
Code example
A minimal code example that fails. Use no_cache=True to make sure an invalid cached file does not cause the bug and make sure you have the latest version of soccerdata installed.
import soccerdata as sd
fbref = sd.FBref(leagues="SWE-Allsvenskan", seasons=[2022, 2023], no_cache=True)
fbref.read_team_match_stats(stat_type="schedule", opponent_stats=False, team="Bodø/Glimt", force_cache=True)
Error message
Error while scraping https://fbref.com/en/squads/d86248bd/2022/matchlogs/all_comps/schedule. Retrying... (attempt 2 of 5). _common.py:568
Traceback (most recent call last):
  File "/Users/user/.venv/lib/python3.12/site-packages/soccerdata/_common.py", line 564, in _download_and_save
    with filepath.open(mode="wb") as fh:
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/lib/python3.12/pathlib.py", line 1013, in open
    return io.open(self, mode, buffering, encoding, errors, newline)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/user/soccerdata/data/FBref/historic/matchlogs_Bodø/Glimt_2022_schedule.html'
Additional context
Note that the line number in _common.py shown in the traceback may differ from the released version, as I made minor changes to the code.
Contributor Action Plan
- [ ] I can fix this issue and will submit a pull request.
- [x] I’m unsure how to fix this, but I'm willing to work on it with guidance.
- [ ] I’m not able to fix this issue.
I fixed it by changing this line in fbref.py:
filepath = self.data_dir / filemask.format(team, skey, stat_type)
to
filepath = self.data_dir / filemask.format(team.replace('/',''), skey, stat_type)
Not sure if it breaks anything, though.
No, it won't break anything. A more generic solution would be to use something like Django's slugify() function.
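A rough sketch of what such a generic helper could look like without pulling in Django. The name `safe_filename` and the set of replaced characters are assumptions for illustration, not part of soccerdata. Unlike Django's slugify(), this version keeps the original case and non-ASCII letters such as "ø".

```python
import re
import unicodedata


def safe_filename(name: str) -> str:
    """Hypothetical helper: make a team name safe to embed in a file name."""
    # Normalize so equivalent Unicode spellings map to the same string.
    name = unicodedata.normalize("NFKC", name)
    # Replace runs of path separators and other characters that are
    # invalid in file names on common filesystems with a single hyphen.
    name = re.sub(r'[\\/:*?"<>|]+', "-", name)
    # Drop leading/trailing spaces, dots and hyphens.
    return name.strip(" .-")


print(safe_filename("Bodø/Glimt"))
```

It could then be applied where the file path is built, for example `filemask.format(safe_filename(team), skey, stat_type)`. Replacing the slash with a hyphen instead of deleting it keeps "Bodø/Glimt" readable in the cache directory, though two team names that differ only in these characters would still map to the same file.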